      Is Open Access

      ABO Blood Groups and Cardiovascular Diseases

      review-article


          Abstract

          ABO blood groups have been associated with various disease phenotypes, particularly cardiovascular diseases. Cardiovascular diseases are the most common causes of death in developed countries and their prevalence rate is rapidly growing in developing countries. There have been substantial historical associations between non-O blood group status and an increase in some cardiovascular disorders. Recent GWASs have identified ABO as a locus for thrombosis, myocardial infarction, and multiple cardiovascular risk biomarkers, refocusing attention on mechanisms and potential for clinical advances. As we highlight in this paper, more recent work is beginning to probe the molecular basis of the disease associations observed in these observational studies. Advances in our understanding of the physiologic importance of various endothelial and platelet-derived circulating glycoproteins are elucidating the mechanisms through which the ABO blood group may determine overall cardiovascular disease risk. The role of blood group antigens in the pathogenesis of various cardiovascular disorders remains a fascinating subject with potential to lead to novel therapeutics and prognostics and to reduce the global burden of cardiovascular diseases.

          Most cited references (95)


          Circulating adhesion molecules VCAM-1, ICAM-1, and E-selectin in carotid atherosclerosis and incident coronary heart disease cases: the Atherosclerosis Risk In Communities (ARIC) study.

          Recruitment of circulating leukocytes at sites of atherosclerosis is mediated through a family of adhesion molecules. The function of circulating forms of these adhesion molecules remains unknown, but their levels may serve as molecular markers of subclinical coronary heart disease (CHD). To determine the ability of circulating vascular cell adhesion molecule-1 (VCAM-1), endothelial-leukocyte adhesion molecule-1 (E-selectin), and intercellular adhesion molecule-1 (ICAM-1) to serve as molecular markers of atherosclerosis and predictors of incident CHD, we studied 204 patients with incident CHD, 272 patients with carotid artery atherosclerosis (CAA), and 316 control subjects from the large, biracial Atherosclerosis Risk In Communities (ARIC) study. Levels of VCAM-1 were not significantly different among the patients with incident CHD, those with CAA, and control subjects. Higher levels of E-selectin and ICAM-1 were observed for the patients with CHD (means [ng/mL]: E-selectin, 38.4; ICAM-1, 288.7) and those with CAA (E-selectin, 41.5; ICAM-1, 283.6) compared with the control subjects (E-selectin, 32.8; ICAM-1, 244.2), but the distributions were not notably different between the patients with CHD and CAA. Results of logistic regression analyses indicated that the relationship of ICAM-1 and E-selectin with CHD and CAA was independent of other known CHD risk factors and was most pronounced in the highest quartile. The odds of CHD and CAA were 5.53 (95% CI, 2.51-12.21) and 2.64 (95% CI, 1.40-5.01), respectively, for those with levels of ICAM-1 in the highest quartile compared with those in the lowest quartile. Odds of CAA were 2.03 (95% CI, 1.14-3.62) for those with levels of E-selectin in the highest quartile compared with those in the lowest quartile. These data indicate that plasma levels of ICAM-1 and E-selectin may serve as molecular markers for atherosclerosis and the development of CHD.
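The odds ratios and confidence intervals quoted in this abstract can be sanity-checked on the log scale. The sketch below assumes the intervals are Wald-type, that is, symmetric around log(OR); the abstract does not state this, so treat it as an assumption. Under that assumption the geometric mean of the two bounds should reproduce the reported odds ratio, and the interval width gives the standard error of log(OR). The function name `log_scale_summary` is illustrative only.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Point estimate and SE of log(OR) implied by a 95% CI.

    Assumes a Wald-type interval, symmetric on the log scale
    (an assumption; the abstract does not say how the CI was built).
    """
    log_lo, log_hi = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lo + log_hi) / 2)  # geometric mean of the bounds
    se_log_or = (log_hi - log_lo) / (2 * z)       # half-width divided by z
    return implied_or, se_log_or


# Values quoted in the abstract: (reported OR, lower bound, upper bound)
reported = {
    "ICAM-1, CHD": (5.53, 2.51, 12.21),
    "ICAM-1, CAA": (2.64, 1.40, 5.01),
    "E-selectin, CAA": (2.03, 1.14, 3.62),
}

for label, (odds_ratio, lo, hi) in reported.items():
    implied, se = log_scale_summary(lo, hi)
    print(f"{label}: reported OR {odds_ratio}, "
          f"implied OR {implied:.2f}, SE(log OR) {se:.2f}")
```

If the implied and reported odds ratios agree to within rounding, the interval is consistent with a log-scale Wald construction.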
            Is Open Access

            Forty-Three Loci Associated with Plasma Lipoprotein Size, Concentration, and Cholesterol Content in Genome-Wide Analysis

            Introduction Standard measures of plasma lipoprotein concentration do not reveal heterogeneity in the size of lipoprotein particles or their content of cholesterol and triglycerides. Yet recognizing this heterogeneity may be essential for understanding qualitative differences in lipid metabolism among individuals. Some reports identify a pattern in the size distribution of lipoprotein sub-fractions as intimately connected with coronary heart disease [1],[2]. Related findings identify a link between lipoprotein profile and metabolic syndrome, and by inference to diabetes [3]. While these observations remain controversial for prognostic use [4], they point to alterations in lipoprotein metabolism in disease. The variation in particle size and lipid content can be quantified accurately by NMR-based methods that determine lipoprotein particle concentration according to lipid class and particle size. Thus, NMR methods can measure concentration of large and small low density lipoprotein (LDL) particles as well as concentration of the related intermediate density lipoprotein (IDL) particles, and similarly concentration of small, medium, and large high density lipoprotein (HDL) or very low density lipoprotein (VLDL) particles. HDL and LDL particle concentration can also be estimated by chemical measures of apolipoprotein A1 (ApoA1) and apolipoprotein B (ApoB) protein concentration, respectively, but neither these assays nor other standard clinical assays provide information about particle size distribution, and consequently the apportionment of cholesterol and triglycerides to different sized particles. The greater precision in characterizing lipoprotein profiles using NMR-based techniques provides an opportunity for correspondingly greater detail in understanding lipid metabolism, for example by genome-wide genetic analysis, as has been done recently for plasma concentration LDL-C, HDL-C, triglycerides, ApoA1, and ApoB [5]–[13]. 
Results Genome-wide association analysis of 22 NMR-based and conventional lipoprotein fractions Among 17,296 WGHS participants with confirmed European ancestry (Table 1), we performed genome-wide association analysis assuming an additive genetic model for 22 plasma lipoprotein measures determined either by NMR methods or by standard clinical assay. On the basis of genome-wide significance (P 150kb) from known genic regions. Among the standard clinical measures LDL-C, HDL-C, and triglycerides only, novel genome-wide loci were found at KLF14 (7q32.2) and CCDC92/DNAH10/ZNF664 (12q24.31.B), both for triglycerides. The association at the novel locus 8p23.1 (which differentiated the fasting sample from the whole sample on the basis of mean VLDL particle size) is over 1.8 Mb from a recently described association at 8p23.1 between SNP rs7819412 and triglycerides [6]. The remaining 24 unique loci suggested genes recognized for a diversity of roles in lipid metabolism, broadly defined (Figure S1). Thus, SNPs with genome-wide significance were confirmed in or near PCSK9 (at 1p32.3), APOA2 (1q23.3), APOB (2p24.1), ABCG5/8 (2p21), HMGCR (5q13.3), LPL (8p21.3), APOA1-A5 (11q23.3), ABCA1 (9q31.1), FADS1-3 (11q12.2), LIPC (15q22.1), CETP (16q13), LIPG (18q21.1), LDLR (19p13.2), the APOC-APOE complex (19q13.32), and PLTP (20q13.12). Similarly, association at 9q34.2 implicating the ABO gene recapitulates and extends the known association between blood group antigen and total cholesterol [14],[15]. Less well characterized genic regions, which nonetheless have been validated recently for roles in lipid metabolism, were confirmed for ANGPTL3 (1p31.3), CELSR2/MYBPHL/PSRC1/SORT1 (1p13.3), GCKR (2p23.3), MLXIPL (7q11.23), TRIB1 (8q24.13), HNF1A (12q24.31.A), and HNF4A (20q13.12). The association at COBLL1/GRB14 (2q24.3) with HDL-C was recently described elsewhere in this same cohort and validated by replication [16].
The previous study found much stronger association in women than men, suggesting a potential interaction with gender. At this locus, the gene GRB14 is thought to inhibit receptors in the insulin receptor class [17],[18]. The current analysis extends associations at this locus to concentrations of LDL, HDL, and VLDL particles according to size (Table S1). Consistent with a high degree of correlation among the lipoprotein measures (Table S2), the rank order by p-value among the highly significant SNPs was similar for each measure with at least one genome-wide significant association (Figure S1). A notable exception was the APOB gene (2p24.1), where the ordering of the p-values, conditional analysis, and patterns of linkage disequilibrium (LD) among the top SNPs (Table S1) revealed three classes of associations. One class included VLDL-related fractions, triglycerides, and mean LDL size for which either rs673548 or rs676210 (LD r2 = 1.0) had the strongest association; a second class included ApoB, large LDL particles, and total LDL particles for which either rs1713222 or rs506585 (LD r2 = 0.5) had the strongest association; and a final class included only LDL-C, for which rs137117 was most strongly associated (Figure 1A). Between SNPs in different classes, maximum LD ranged from r2 = 0.04–0.11. Similarly, at APOA5-APOA1 (11q23.3), p-values revealed two classes of associations seemingly segregating between effects nearer the APOA5 gene involving triglycerides and effects nearer the APOA1 gene involving HDL-related lipoprotein fractions (Figure 1B).
Figure 1 (10.1371/journal.pgen.1000730.g001) Loci with distinct classes of SNP associations among lipoprotein fractions with genome-wide significance. (A) APOB locus (2p24.1), (B) APOA1-A5 locus (11q23.3). Recombination rates are from [41].
Large, well-characterized cohorts with NMR-based measurement of lipoprotein fractions are scant, but sub-samples of about 2700 participants in the Framingham Heart Study Offspring cohort (FHS) [19] and about 2000 total CHD cases and controls from PROCARDIS [20] had both the NMR-based lipoprotein measures and genome-wide genetic data already determined. Among all candidate loci, concordance of direction of effects was observed respectively at 124 out of 146 (84%) [84% in fasting sub-sample] and 125 out of 133 (94%) [99% in fasting sub-sample] of the candidate associations for which there was genotype information in FHS and PROCARDIS (Table S3 [whole WGHS sample candidates], Table S4 [fasting WGHS subsample candidates]). For each of the previously known loci except ABCA1 (9q31), at least one of the candidate associations was nominally significant (P 0.05; Table S3). However, a recent genome-wide meta-analysis of LDL-C, HDL-C, and triglycerides found significant, but not genome-wide significant, associations among these fractions with candidate SNPs from the WGHS at PCCB/STAG1 (3q22.3), BTNL2 (6p21.32), KLF14 (7q32.2), and 8p23.1 [10], although the significant SNP associations at PCCB/STAG1 (3q22.3) and BTNL2 (6p21.32) were not fully concordant between the two studies (Table 3). Independent evidence for functional consequence of the candidate SNP (rs10778213) at 12q23.2 is its genome-wide significant association in a smaller sample from the WGHS with plasma C-reactive protein (CRP), a biomarker of inflammation that is slightly correlated if at all with the two HDL measures associated at this locus (total HDL particle concentration [HDL:T], Spearman r = 0.22; HDL cholesterol estimated by the NMR [HDL:N], Spearman r = −0.04) [21]. With the larger sample of WGHS genotype information in the current study, the association with plasma CRP is more significant (P 0.05). e Genome-wide significant association with plasma C-reactive protein in the WGHS [21]. 
Magnitudes of genetic effects To assess the contribution of common genetic variation at each of the candidate loci to each of the adjusted lipoprotein fractions, we constructed regression models by stepwise selection of SNPs in the vicinity of the primary genome-wide significant associations. Most of these models explain less than 1% of the variation in the adjusted lipoprotein fractions (Figure 2, Table S5, and Table S6). The top three effects, all at APOC-APOE complex (19q13.32), explain 8.9%, 8.4%, and 7.1% of the variance in ApoB particle concentration, the related total LDL particle concentration, and LDL-C, respectively. Fasting status had an influence on retention of SNPs in the model selection procedure, but only for loci with modest effects (Compare Table S5 and Table S6). There were no genetic contributions remaining from the model selection procedure for any of LDL-C, HDL-C, triglycerides, ApoA1, or ApoB concentration at APOA2 (1q23.3) in the whole sample and at WIPI1 (17q24.2) in the fasting subsample, suggesting that these loci would not have been identified for genome-wide association with the five conventional lipoprotein fractions even in a much larger sample with the genome-wide SNP genotyping panel used in this study. Clustering loci on the basis of the profile of associated lipoprotein fractions suggests sub-groups of loci with related patterns of effects (Figure S2, Figure S3), perhaps suggesting distinct but possibly overlapping biological pathways for lipoprotein metabolism. For example, HNF1A, LDLR, ABCG5/8, PCSK9, and CELSR2/PSRC1/SARS/SORT1 largely share associations with IDL, small VLDL, total VLDL large LDL, LDL-C, total LDL, and ApoB. 10.1371/journal.pgen.1000730.g002 Figure 2 Variance explained in adjusted lipoprotein measures by common variation at the candidate loci by SNPs retained in model selection procedures. See also Figure S2 and Figure S3. 
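To make "variance explained" under an additive genetic model concrete, here is a minimal single-SNP sketch. It is not the authors' stepwise procedure, which selected multiple SNPs per locus; it only shows how a per-allele effect and the corresponding fraction of variance fall out of a least-squares fit on minor-allele count. The function name `additive_fit` and the toy data are hypothetical and not taken from the study.

```python
from statistics import mean


def additive_fit(genotypes, phenotype):
    """Least-squares fit of phenotype on minor-allele count (0, 1 or 2).

    Returns (beta, r_squared): the per-allele effect and the fraction of
    phenotypic variance explained by this single SNP.
    """
    g_bar, y_bar = mean(genotypes), mean(phenotype)
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(genotypes, phenotype))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((y - y_bar) ** 2 for y in phenotype)
    beta = sxy / sxx                    # shift in phenotype per minor allele
    r_squared = sxy ** 2 / (sxx * syy)  # squared correlation
    return beta, r_squared


# Hypothetical toy data (illustration only, not study data)
genotypes = [0, 0, 0, 0, 1, 1, 1, 2]
phenotype = [1.0, 1.2, 0.8, 1.1, 1.3, 1.0, 1.4, 1.5]

beta, r2 = additive_fit(genotypes, phenotype)
print(f"per-allele effect: {beta:.3f}, variance explained: {100 * r2:.1f}%")
```

The toy data are deliberately exaggerated; in the study most locus-level models explained less than 1% of the variance in the adjusted lipoprotein fractions.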
The total genetic effects for each lipoprotein determined by summing over the effects at all loci ranged from 2.1% for mean VLDL size to 17.2% for ApoB (Table 4). The effects were not substantially different when the entire model selection procedure was performed in the fasting subsample (Table 4), and only slightly smaller in general among the unadjusted lipoprotein fractions (Table S7). Notably, the common genetic variation in this study at the genome-wide loci had a greater total effect on mean particle size than on standard clinical cholesterol measures for HDL but not for LDL or VLDL (Table 4).

Table 4 (10.1371/journal.pgen.1000730.t004) Proportion (%) variance in fully adjusted lipoprotein fractions explained by common variation at candidate loci.

lipoprotein fraction    whole sample    fasting subsample
LDL large               12.0            11.4
LDL small                8.9             9.4
LDL mean size            8.5             8.7
IDL total                3.5             3.5
LDL total               15.2            15.0
LDL-C assay             13.7            13.8
ApoB assay              17.2            16.8
HDL total                5.6             5.6
HDL large               13.1            12.5
HDL medium               4.6             4.4
HDL small                6.4             5.7
HDL mean size           12.2            11.7
HDL-C by NMR            10.3             9.9
HDL-C assay              9.9             9.1
ApoA1 assay              8.3             7.8
VLDL total               8.9             8.6
VLDL large               3.8             4.1
VLDL medium              6.0             6.0
VLDL small               7.6             7.4
VLDL mean size           2.1             2.5
TG by NMR                7.9             7.6
TG assay                 7.7             8.1

Secondary genome-wide analysis To examine the possibility that other loci might include SNPs with genome-wide significant association conditional on effects at the primary loci, we adjusted the primary lipoprotein fraction measurements (which were already adjusted for clinical covariates) for SNPs retained by the model selection procedure at the candidate loci, and repeated the genome-wide association testing. Quantile-quantile analysis confirmed that all of the excess of extremely small p-values in the original analysis could be explained by the variation at the candidate loci (not shown).
Similarly, genotype-based statistical models (as opposed to the allele-based additive models used in the primary analysis) did not reveal other loci with genetic influences at the genome-wide significance level in the whole sample. While we adjusted the lipoprotein measures with a full set of clinical characteristics to reduce variance and enhance power in the primary analysis, it remained possible that relevant SNPs would be overlooked if they acted through effects on the adjustment covariates. Similarly, subtle effects on the association estimates due to non-normality of the (possibly log-transformed) adjusted lipoprotein measures or due to sub-European population stratification might confound hypothesis testing. To evaluate whether our discovery procedure was robust, we performed secondary analyses repeating the entire genome-wide discovery procedure for alternative nested subsets of clinical covariates with and without further adjustment for population structure and quantile normalization (Table S8). Comparing the full adjustment procedure to alternatives using either a reduced set of clinical covariates or age only, with or without additional adjustment for potential sub-European population stratification and quantile normalization, yielded further genome-wide significant associations at three loci with known lipid metabolic genes, LPA (6q25.3), LCAT (16q22.1), and APOH (17q24.2), and two additional loci, 6p22.3 and 10q21.3. All of the additional loci were present in the age-adjusted analysis. Associations at 6p22.3 and 10q21.3 appear to be novel and implicate, respectively, the GMPR or MYLIP genes and the JMJD1C gene. The lead SNPs at each of these loci were significantly associated with at least one of LDL-C, HDL-C or triglycerides in the recently published meta-analysis (Table 5) [10].
Similarly, in internal replication among the additional 4639 WGHS samples with genotype available after the main analysis was complete, associations at the candidate SNPs were all significant and the trends of effects were all consistent with effects in the discovery sample (Table 5). We note that at JMJD1C (10q21.3), the candidate SNPs have minor allele frequency near 0.5, and that available data does not allow us to determine whether the differences in the direction of the minor allele effect on VLDL fractions in the WGHS and triglycerides in the previously published replication study are truly physiological or rather that the frequency of the coded (i.e. minor) allele from the WGHS is greater than 0.5 in the replication cohort resulting in an opposite sign of the effect estimates. 10.1371/journal.pgen.1000730.t005 Table 5 Genome-wide significant associations (p 0.05). # Abbreviation as in Table 2. ∧ P-value (two-sided) for association in additional 4639 samples from the WGHS. All association trends were consistent with discovery sample. Combining these new samples with the original discovery samples leads to p–values for the extended WGHS sample. Since lipoprotein particle size is closely related to triglyceride content, we also performed secondary analysis examining genome-wide significant associations after adjustment of the lipoprotein fractions by the full set of clinical covariates and (log-transformed) triglyceride levels (Table 5 and Table S8). This analysis identified only one new genome-wide significant association. At 11p15.4, rs7938647 in the intron of the SBF2 gene was associated with full-plus-triglyceride adjusted total HDL particle concentration. Again, internal replication provided support for this association although there was no association (P>0.05) with LDL-C, HDL-C, or triglycerides in the recent meta-analysis for replication. 
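The caveat about JMJD1C (10q21.3) in the passage above rests on a simple property of additive coding: if one cohort counts copies of one allele and another cohort counts copies of the other, the per-allele effect estimate changes sign while its magnitude stays the same. When the minor allele frequency is near 0.5, which allele is "minor" can differ between cohorts by chance. The sketch below illustrates this with hypothetical toy data; the names `per_allele_effect`, `count_a` and `count_b` are illustrative only.

```python
from statistics import mean


def per_allele_effect(allele_counts, phenotype):
    """Slope of phenotype on the count (0, 1, 2) of the coded allele."""
    g_bar, y_bar = mean(allele_counts), mean(phenotype)
    sxy = sum((g - g_bar) * (y - y_bar)
              for g, y in zip(allele_counts, phenotype))
    sxx = sum((g - g_bar) ** 2 for g in allele_counts)
    return sxy / sxx


# Hypothetical toy data: copies of allele "A" at a SNP with frequency 0.5
count_a = [0, 1, 1, 2, 0, 1, 2, 1]
phenotype = [0.9, 1.1, 1.0, 1.4, 0.8, 1.2, 1.3, 1.1]

# The same individuals, coded by the other allele "B" instead
count_b = [2 - g for g in count_a]

beta_a = per_allele_effect(count_a, phenotype)
beta_b = per_allele_effect(count_b, phenotype)
print(f"effect coded on A: {beta_a:+.3f}, effect coded on B: {beta_b:+.3f}")
```

Comparing effect directions across studies therefore requires confirming that both report the effect for the same allele, not merely for "the minor allele".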
Associations distinguishing NMR-based from conventional lipoprotein measures Among its unique characteristics, the NMR-based methodology provides information about IDL and VLDL particle concentration, both aspects of lipoprotein profiles that are difficult to measure by conventional methods. For IDL, genetic associations were observed at many of the candidate loci (Figure 2, Table 2, Table S1) and most strongly at LIPC (15q22.1), where rs1532085 had an estimated 0.11 nmol/l shift in particle concentration for each copy of the minor allele (p = 1.5×10−20). For total VLDL concentration, association with genetic variation was observed at many loci but none more strongly than at the APOC-APOE complex where rs439401, which is in perfect LD with rs7412 (the SNP that distinguishes APOE alleles E2 and E3), had an estimated −2.4nmol/l shift in concentration per copy of the minor allele (p = 2.1×10−12; Table S1). Loci strongly affecting the relative concentration of NMR-based estimates of small, medium, and large particle size could be identified on the basis of genome-wide effects on mean particle size, and these associations were of special interest when there was no accompanying association with the corresponding cholesterol measure retained in the model selection procedures (Table 6, Figure S4). For LDL, mean particle size was associated with genome-wide significance at 12 loci (Table 2), among which the model selection procedures failed to identify any association with LDL-C at MLXIPL (7q11.23), LPL (8p21.3), CCDC92/DNAH10/ZNF664 (12q24.31.B), and LIPG (18q21.1). These loci implicate genes related to glucose or triglyceride metabolism as well as unrecognized biological function at one novel locus (CCDC92/DNAH10/ZNF664 [12q24.31.B]). 
The associations with mean LDL particle size were a consequence of strong inverse effects on large and small LDL particles (MLXIPL [7q11.23], LPL [8p21.3], LIPG [18q21.1]) or of exclusive effects on small LDL (CCDC92/DNAH10/ZNF664 [12q24.31.B]) [see Figure S4]. In the fasting subsample, the associations with the NMR based measures at LPL (8p21.3) and LIPG (18q21.1) also met genome-wide significance, but the associations at MLXIPL (7q11.23) and CCDC92/DNAH10/ZNF664 (12q24.31.B) did not. For HDL, 9 loci had genome-wide significance for mean particle size (Table 2), among which the clinical measure of HDL-C was not associated with genetic variation only at GCKR (2p23.3), as was also found in the fasting subsample (Figure 2, Table 6). The discordant effects on LDL size and cholesterol content at LPL (8p21.3), CCDC92/DNAH10/ZNF664 (12q24.31.B), and LIPG (18q21.1) but not those of HDL size and cholesterol content were independent of triglyceride level in as much as associations persisted in analysis that further adjusted the lipoprotein fractions for (log-transformed) triglycerides, although only at nominal significance rather than genome-wide significance (Table 6). 10.1371/journal.pgen.1000730.t006 Table 6 Loci with genome-wide significant association (P 1%, successful genotyping in 90% of the subjects, and deviations from Hardy-Weinberg equilibrium not exceeding P = 10−6 in significance. A total of 335,603 unique SNPs, of which 32,521 derive from the custom content, remained in the final data. Although assays for two non-synonymous SNPs at the APOE locus (19q13.32), rs429358 and rs7412, which determine ApoE isotype, failed in the design of the Illumina custom content, genotypes for these two SNPs were determined separately by an allele-specific, PCR based method (Celera, Alameda, CA) [34]. These additional SNPs are in linkage disequilibrium with SNPs in the Illumina panel. 
The targeted genotypes for APOE were included during the model selection procedures but not during the primary analysis to discover loci with genome-wide significant associations. Analytic methods Primary analysis to discover loci with highly significant associations in the WGHS discovery cohort was performed by linear regression in PLINK [35] assuming an additive relationship between the number of copies of the minor allele of each SNP and the mean values of the adjusted lipoprotein measures. A conservative threshold of P 0.4) among the HapMap CEU, YRI, and JPN+CHB populations [37]. Discrepancy between self reported European ancestry and the clustering pattern was observed only for 68 samples ( 0.3) were used in the analysis. In the PROCARDIS study [20], where genotype data derive from the Illumina (San Diego, CA) Human 1M platform representing a superset of the SNPs in the WGHS data, lipoprotein fractions were adjusted for case/control specific effects of age at baseline (continuous), gender, country of recruitment (Germany, Italy, Sweden, United Kingdom), self-reported hypertension (yes/no), diabetes (yes/no), current smoking status by questionnaire (yes/no), and statin therapy (yes/no). Regression models assumed a linear relationship between the number of copies of the minor allele and adjusted mean lipoprotein measure. Supporting Information Figure S1 Locus p-values for lipoprotein fractions with at least one SNP reaching genomewide significance at each of the candidate loci. All plots correspond to analysis in the whole sample except for locus 8p23.1, for which genomewide association was observed only in the fasting subsample as shown. (0.41 MB PDF) Click here for additional data file. Figure S2 Primary loci clustered hierarchically according to Cartesian distance corresponding to whether ( = 1) or not ( = 0) there were associations with each of the lipoprotein fractions in the model selection procedures (see Materials and Methods). 
Figure S3 Dendrogram showing hierarchical relationships between loci clustered as in Figure S2.
Figure S4 Normalized SNP effects (beta coefficients) from univariate regression models. All plots correspond to analysis in the whole sample except for locus 8p23.1, for which genome-wide association was detected only in the fasting subsample as shown. Locus SNPs are shown if they were retained in the model selection procedure for at least one lipoprotein fraction. Absence of shading indicates the univariate beta coefficient was not significant (p>0.05). A small black dot for some combinations of SNPs and lipoprotein fractions indicates genome-wide significance for the univariate beta coefficient.
Table S1 Best genome-wide associations with the lipoprotein fractions at each candidate locus.
Table S2 Correlations between all pairs of lipoprotein fractions.
Table S3 Replication of WGHS candidate associations from whole sample in PROCARDIS and the Framingham Heart Study.
Table S4 Replication of WGHS candidate associations from fasting sub-sample in PROCARDIS and the Framingham Heart Study.
Table S5 Proportion of variance in fully adjusted lipoprotein fractions explained in the whole sample by genetic variation at the candidate loci.
Table S6 Proportion of variance in fully adjusted lipoprotein fractions explained in the fasting sub-sample by genetic variation at the candidate loci.
Table S7 Total proportion of variance explained by candidate loci for each of the unadjusted lipoprotein fractions.
Table S8 Sensitivity analysis for locus discovery procedure.
Table S9 Lipoprotein associations in the whole sample at loci in previous lipid fraction GWAS.
Table S10 Lipoprotein associations in the fasting sub-sample at loci in previous lipid fraction GWAS.

              Novel Association of ABO Histo-Blood Group Antigen with Soluble ICAM-1: Results of a Genome-Wide Association Study of 6,578 Women

              Introduction ICAM-1 is a member of the immunoglobulin superfamily of adhesion receptors and consists of 5 immunoglobulin-like extracellular domains, a transmembrane domain and a short cytoplasmic domain. ICAM-1, present on endothelial cells, serves as a receptor for the leukocyte integrins LFA-1 (lymphocyte function-associated antigen-1) and Mac-1 (CD11b/CD18), facilitating leukocyte adhesion and migration across the endothelium [1]. A soluble form of ICAM-1 (sICAM-1) is found in plasma and consists of the extra-cellular domains of ICAM-1. Although the process leading to the formation of sICAM-1 is not entirely clear, sICAM-1 is thought to be shed from the cell membrane via proteolytic cleavage of ICAM-1. Because sICAM-1 binds to LFA-1, it is capable of inhibiting lymphocyte attachment to endothelial cells [2]. Furthermore, sICAM-1 has been shown to bind human rhinoviruses, the etiologic agent of 40–50% of common colds, and to inhibit rhinovirus infection in vitro [3]. Likewise, a circulating fragment of sICAM-1 binds to erythrocytes infected with Plasmodium falciparum, the etiologic agent of malaria [4] (MIM 611162). Finally, plasma concentration of sICAM-1 has been shown to provide unique predictive value for the risk of myocardial infarction (MIM 608446), ischemic stroke (MIM 601367), peripheral arterial disease (MIM 606787) and noninsulin-dependent diabetes mellitus (MIM 125853) in epidemiological studies [5]–[7]. Despite relatively high heritability estimates (from 0.34 to 0.59) [8],[9] for sICAM-1, few genetic variants are known to influence its concentrations. Two recent linkage studies have shown evidence for genetic association at the ICAM1 (GeneID 3383) locus (19p13.3-p13.2) [8],[9] and two candidate SNPs within the extracellular domains of ICAM-1 itself, G241R (rs1799969) and K469E (rs5498), have been correlated with circulating sICAM-1 levels [10],[11]. 
By contrast, a recent genome wide association study (GWAS) from the Framingham investigators involving 1006 participants and 70,987 SNPs revealed no association reaching a genome-wide level of significance, including the ICAM1 locus itself, although this study had no genetic marker within 60 kb of the gene [12]. To more comprehensively explore this issue, we performed a larger GWAS, evaluating 336,108 SNPs in 6,578 apparently healthy women. Methods Study Sample and sICAM-1 Measurements All participants in this study were part of the Women's Genome Health Study (WGHS) [13]. Briefly, participants in the WGHS include American women from the Women's Health Study (WHS) with no prior history of cardiovascular disease, diabetes, cancer, or other major chronic illness who also provided a baseline blood sample at the time of study enrollment. The WHS is a recently completed 2×2 randomized clinical trial of low-dose aspirin and vitamin E in the primary prevention of cardiovascular disease and cancer. For all WGHS participants, EDTA anticoagulated plasma samples were collected at baseline and stored in vapor phase liquid nitrogen (−170°C). Circulating plasma sICAM-1 concentrations were determined using a commercial ELISA assay (R&D Systems, Minneapolis, Minn.); the assay used is known not to recognize the K56M (rs5491) variant of ICAM-1 [14] and the 22 carriers of this mutation were therefore excluded from further analysis. This study has been approved by the institutional review board of the Brigham and Women's Hospital. Additional clinical characteristics of these subsets are provided in Table S1. Genotyping Genotyping was performed in two stages, a first sample being used to discover new associated loci and the second sample being used to validate them by replication. These two samples were genotyped independently of one another in two batches. The first (WGHS-1) and second (WGHS-2) batches included 4,925 and 2,056 self-reported Caucasian WGHS participants, respectively. 
No related individuals were detected when tested with an identity by state analysis [15]. Samples were genotyped with the Infinium II technology from Illumina. Either the HumanHap300 Duo-Plus chip or the combination of the HumanHap300 Duo and I-Select chips was used. In either case, the custom content was identical and consisted of candidate SNPs chosen without regard to allele frequency to increase coverage of genetic variation with impact on biological function including metabolism, inflammation or cardiovascular diseases. Genotyping at 318,237 HumanHap300 Duo SNPs and 45,571 custom content SNPs was attempted, for a total of 363,808 SNPs. Genetic context for all annotations is derived from human genome build 36.1 and dbSNP build 126. SNPs with call rates 1% in Caucasians were used for analysis. After quality control, 307,748 HumanHap300 Duo SNPs and 28,360 custom content SNPs were left, for a total of 336,108 SNPs. From the initial 4925 WGHS-1 and 2056 WGHS-2 individuals genotyped, 4582 WGHS-1 individuals and 2014 WGHS-2 individuals were kept for further analysis. Population Stratification Because population stratification can result in inflated type I error, a principal component analysis using 1443 ancestry informative SNPs was performed using PLINK [17] in order to confirm self-reported ancestry. Briefly, these SNPs were chosen based on Fst >0.4 in HapMap populations (YRI, CEU, CHB+JPT) and inter-SNP distance at least 500 kb in order to minimize linkage disequilibrium. Different ethnic groups were clearly distinguished with the first two components. Out of 4582 WGHS-1 and 2014 WGHS-2 self-identified Caucasians, 12 and 6 were removed from analysis because they did not cluster with other Caucasians, leaving 4570 (WGHS-1) and 2008 (WGHS-2) participants for analysis, respectively. Two more analyses were undertaken to rule out the possibility that residual stratification within Caucasians was responsible for the associations observed.
First, association analysis was done with correction by genomic control. This method estimates the average effect of population substructure in the sample (based on median T values) and accordingly corrects the test statistics [18]. Second, a principal component analysis [19] was performed in Caucasians (only) using 124,931 SNPs chosen to have pair-wise linkage disequilibrium lower than r2 = 0.4. The first three components were then used as covariates in the association analysis. As adjustment by these covariates did not change the conclusions, we present analysis among the WGHS-1 and WGHS-2 Caucasian participants without further correction for sub-Caucasian ancestry unless stated otherwise. Association Analysis To identify common genetic variants influencing sICAM-1 levels, we first attempted to discover which loci significantly contributed to sICAM-1 concentrations in WGHS-1. Plasma concentrations of sICAM-1 were adjusted for age, smoking, menopause and body mass index using a linear regression model in R to reduce the impact of clinical covariates on sICAM-1 variance. The adjusted sICAM-1 values were then tested for association with SNP genotypes by linear regression in PLINK [17], assuming an additive contribution of each minor allele. A conservative P-value cut-off of 5×10−8 was used to correct for the roughly 1,000,000 independent statistical tests thought to correspond to all the common genetic variation of the human genome [20]. Replication of genome-wide significant associations was performed on adjusted sICAM-1 values from the replication sample (WGHS-2), using a Bonferroni correction to account for multiple hypothesis testing. Model Selection Algorithm To further define the extent of genetic associations, a forward selection linear multiple regression model was used at the previously identified loci. 
Briefly, all genotyped SNPs within 100 kb of the most significantly associated SNP at each replicated locus and passing quality control requirements were tested for possible incorporation into a multiple regression model. In stepwise fashion, a SNP was added to the model if its multiple regression P-value was less than 10−4 (to account for all the SNPs being considered) and if it had the smallest P-value among all the SNPs not yet included in the model. This analysis was done on WGHS-1 individuals using adjusted sICAM-1 values. We then proceeded to validate our multiple regression model in WGHS-2 samples. Using only the SNPs previously selected in WGHS-1, we added them in a multiple regression model in the same order as they were chosen in WGHS-1. We considered the model validated if each time a SNP was included in the model, its regression P-value was lower than 0.01 (to account for multiple testing) and the direction of effect consistent. Analytical Interference Assay Plasma from A blood group individuals was mixed 1∶1 or 1∶2 with a monoclonal anti-A antibody (Ortho-Clinical Diagnostics, Rochester NY), and allowed to incubate 10 minutes or 60 minutes at room temperature, or 60 minutes or 12 hours at 4°C before assaying sICAM-1 levels by the standard technique. To exclude the possibility that the antibody itself interfered with the assay, the same procedure was repeated with plasma from O blood group individuals. Finally, plasma from O group individuals, which is expected to contain both anti-A and anti-B polyclonal antibodies, was mixed with plasma from A group individuals in 1∶1 ratio, again with incubation as above and measurement of sICAM-1 levels. Results/Discussion As shown in Table 1, 19 SNPs passed our stringent genome-wide significance threshold when tested in WGHS-1 individuals, clustering within two loci in the vicinity of the ICAM1 (19p13.2) and ABO (GeneID 28) (9q34.2) genes (Figure 1). 
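The single-SNP test described under Association Analysis (covariate-adjusted sICAM-1 regressed on the number of minor alleles, with a cut-off of 5×10−8) can be illustrated with a minimal sketch. This is not the PLINK implementation used in the study: the function name additive_test is hypothetical, the input is assumed to be already adjusted for clinical covariates, and the two-sided P-value uses a normal approximation to the test statistic, which is only reasonable for large samples.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # genome-wide cut-off quoted in the text


def additive_test(adjusted_values, minor_allele_counts):
    """Regress adjusted phenotype values on minor-allele count (0, 1 or 2).

    Returns (beta, p_value). The p-value is two-sided and relies on a
    normal approximation (an assumption of this sketch, not of the paper).
    """
    y = [float(v) for v in adjusted_values]
    g = [float(c) for c in minor_allele_counts]
    n = len(y)
    mean_g = sum(g) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products for simple linear regression.
    sxx = sum((gi - mean_g) ** 2 for gi in g)
    sxy = sum((gi - mean_g) * (yi - mean_y) for gi, yi in zip(g, y))
    beta = sxy / sxx                      # change in phenotype per minor allele
    intercept = mean_y - beta * mean_g
    rss = sum((yi - intercept - beta * gi) ** 2 for gi, yi in zip(g, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of beta
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return beta, p_value
```

A SNP would be called genome-wide significant in this sketch when the returned p_value is below GENOME_WIDE_ALPHA. Monomorphic SNPs (sxx = 0) and perfectly fitting data (se = 0) are not handled.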
The replication threshold in WGHS-2 was conservatively set at a 2-sided P-value of 0.002, applying a Bonferroni correction to account for 19 tests. Using this cutoff, we were able to replicate 17 of the 19 associated SNPs, including SNPs at both the ICAM1 and ABO loci. Only rs2116941 (19p13.2) and rs7256672 (19p13.2) did not replicate using this standard. Nevertheless, each of these SNPs had a P-value lower than 10−9 when tested on the combined sample (i.e. WGHS-1 and WGHS-2 pooled together). Among the replicated SNPs, only rs7258015 (19p13.2) deviated from Hardy-Weinberg equilibrium (p = 0.00007), but visual inspection of the raw genotyping signal for this SNP did not reveal any obvious artifact. Major and minor alleles are shown in Table S2. 10.1371/journal.pgen.1000118.g001 Figure 1 Genetic Context of Genome-Wide Significant Associations. Genomic context for each of two loci with genome-wide association with sICAM-1 levels. The ICAM1 locus (19p13.2) is shown in Figure 1-A and the ABO locus (9q34.2) in Figure 1-B. Upper panel: Genes from RefSeq release 25. Only one isoform is shown when multiple splicing variants are known. Middle Panel: SNPs are shown according to their physical location and P-values (red dots). Also shown is the genetic distance in cM from the lowest P-value SNP (light grey line) along with the position of recombination hotspots (light grey vertical bars). Recombination rates and hotspots are based on HapMap data, as described by McVean et al.[53] and Winckler et al. [54]. Lower panel: Pair wise linkage disequilibrium (D′ and R2) between SNPs based on WGHS data. 10.1371/journal.pgen.1000118.t001 Table 1 Genome-Wide Significant SNPs for sICAM1. 
SNP | Locus | Position (kb) | Nearest Gene | Function | MAF (a) | HW (b) | WGHS-1 Beta | WGHS-1 P-Value | WGHS-2 Beta | WGHS-2 P-Value | Combined Beta | Combined P-Value | Median sICAM-1 µmol/L (n), A1/A1 (c) | Median sICAM-1 µmol/L (n), A1/A2 (c) | Median sICAM-1 µmol/L (n), A2/A2 (c)
rs687621 | 9q34.2 | 135126.9 | ABO | intron | 0.34 | 0.08 | −10.88 | 1.3E-11 | −11.21 | 1.4E-06 | −10.95 | 1.3E-16 | 343 (2856) | 334 (2894) | 328 (806)
rs687289 | 9q34.2 | 135126.9 | ABO | intron | 0.34 | 0.07 | −11.00 | 8.2E-12 | −10.81 | 3.5E-06 | −10.91 | 1.7E-16 | 343 (2859) | 334 (2887) | 328 (804)
rs657152 | 9q34.2 | 135129.1 | ABO | intron | 0.37 | 0.13 | −10.17 | 1.7E-10 | −9.93 | 1.5E-05 | −10.06 | 1.5E-14 | 343 (2652) | 334 (2992) | 331 (905)
rs500498 | 9q34.2 | 135138.5 | ABO | intron | 0.45 | 0.10 | 9.53 | 5.0E-10 | 9.53 | 2.4E-05 | 9.52 | 6.1E-14 | 332 (2017) | 337 (3160) | 347 (1358)
rs505922 | 9q34.2 | 135139.0 | ABO | intron | 0.34 | 0.05 | −11.15 | 4.5E-12 | −10.83 | 3.2E-06 | −11.02 | 9.2E-17 | 343 (2876) | 334 (2871) | 328 (798)
rs507666 | 9q34.2 | 135139.2 | ABO | intron | 0.20 | 0.42 | −18.12 | 1.1E-20 | −16.93 | 5.9E-10 | −17.73 | 5.1E-29 | 343 (4221) | 328 (2055) | 314 (268)
rs10409243 | 19p13.2 | 10194.0 | EDG5 | 3′ UTR (d) | 0.41 | 0.96 | −10.93 | 4.4E-12 | −8.22 | 2.6E-04 | −10.11 | 5.4E-15 | 344 (2313) | 335 (3158) | 329 (1083)
rs2116941 | 19p13.2 | 10195.4 | EDG5 | 3′ UTR (d) | 0.20 | 0.76 | −12.05 | 7.9E-10 | −7.47 | 7.5E-03 | −10.67 | 3.1E-11 | 341 (4209) | 333 (2076) | 320 (249)
rs8111930 | 19p13.2 | 10229.0 | MRPL4 | intron | 0.12 | 0.27 | 18.88 | 4.4E-15 | 13.35 | 4.3E-05 | 17.10 | 1.3E-18 | 335 (5031) | 346 (1369) | 377 (102)
rs1799969 | 19p13.2 | 10255.8 | ICAM1 | NS (e) | 0.12 | 0.34 | −26.11 | 2.1E-28 | −33.18 | 1.3E-21 | −28.19 | 3.6E-47 | 343 (5117) | 319 (1336) | 304 (99)
rs5498 | 19p13.2 | 10256.7 | ICAM1 | NS (e) | 0.43 | 0.15 | 13.88 | 6.2E-19 | 11.78 | 1.2E-07 | 13.22 | 4.8E-25 | 329 (2117) | 338 (3123) | 355 (1235)
rs923366 | 19p13.2 | 10258.2 | ICAM1 | 3′ UTR (d) | 0.44 | 0.71 | 13.07 | 5.7E-17 | 11.05 | 7.9E-07 | 12.46 | 2.5E-22 | 329 (2088) | 338 (3202) | 353 (1243)
rs3093030 | 19p13.2 | 10258.4 | ICAM1 | - | 0.43 | 0.59 | 13.14 | 3.7E-17 | 11.41 | 2.7E-07 | 12.61 | 5.9E-23 | 329 (2109) | 338 (3204) | 353 (1243)
rs281440 | 19p13.2 | 10261.3 | ICAM5 | - | 0.22 | 0.69 | −13.39 | 7.2E-13 | −10.99 | 4.9E-05 | −12.69 | 1.5E-16 | 343 (4021) | 331 (2223) | 322 (314)
rs2075741 | 19p13.2 | 10262.1 | ICAM5 | intron | 0.43 | 0.69 | 13.27 | 1.9E-17 | 11.40 | 3.2E-07 | 12.69 | 3.6E-23 | 329 (2104) | 338 (3208) | 353 (1240)
rs2278442 | 19p13.2 | 10305.8 | ICAM3 | intron | 0.35 | 0.65 | −9.44 | 1.0E-08 | −7.66 | 7.7E-04 | −8.88 | 3.1E-11 | 344 (2727) | 334 (3007) | 332 (815)
rs2304237 | 19p13.2 | 10307.6 | ICAM3 | NS (e) | 0.22 | 0.49 | 10.83 | 1.2E-08 | 12.23 | 5.2E-06 | 11.26 | 3.9E-13 | 334 (3919) | 342 (2129) | 356 (313)
rs7258015 | 19p13.2 | 10310.4 | ICAM3 | NS (e) | 0.22 | 0.00007 | 10.60 | 3.2E-08 | 12.73 | 8.6E-06 | 11.19 | 2.1E-12 | 334 (3873) | 342 (2275) | 356 (255)
rs7256672 | 19p13.2 | 10440.5 | PDE4A | 3′ UTR (d) | 0.36 | 0.67 | −9.06 | 2.7E-08 | −6.36 | 5.8E-03 | −8.24 | 6.3E-10 | 343 (2655) | 337 (3049) | 324 (845)
(a) MAF: Minor allele frequency based on the combined samples.
(b) HW: Deviation from Hardy-Weinberg equilibrium P-value based on the combined samples.
(c) A1: Major Allele; A2: Minor Allele; Median sICAM-1 values were determined on the combined samples.
(d) 3′ UTR: 3′ Untranslated Region.
(e) NS: Non-Synonymous Coding SNP.

We then applied our model selection algorithm in WGHS-1 individuals (see Methods) using 54 SNPs at 19p13.2 (ICAM1 locus) and 68 SNPs at 9q34.2 (ABO locus). As can be seen in Table 2, 3 out of 54 SNPs at 19p13.2 were selected by our algorithm and 1 out of 68 SNPs at 9q34.2 was selected. All four SNPs selected in WGHS-1 were validated in WGHS-2. Pairwise linkage disequilibrium between these SNPs was low. For instance, r2 was lower than 0.35 between ICAM1 SNPs while it was lower than 0.002 between the ABO SNP rs507666 and the ICAM1 SNPs. Among these SNPs, there was no strong evidence for non-additive effects of the minor allele as judged by lack of significance for a likelihood ratio test comparing the additive regression model to an alternative genotype model with an additional degree of freedom. Interestingly, one of the four selected SNPs (rs281437) was non-significant in univariate analysis, illustrating that its inclusion in the model and significant association are conditional on the genotypes at rs5498 and rs1799969. No gene-gene interaction was observed between ICAM1 and ABO SNPs. 
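The forward selection procedure described in Methods (add, at each step, the not-yet-included SNP with the smallest multiple-regression P-value, provided it is below 10−4) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names are hypothetical, P-values use a normal approximation, and the sketch assumes the candidate SNPs are not collinear and that the fit is never perfect (otherwise a division by zero occurs).

```python
import math


def _invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def last_term_p_value(y, columns):
    """Fit y ~ intercept + columns by least squares.

    Returns the two-sided p-value (normal approximation) of the LAST column.
    """
    n = len(y)
    X = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[r][a] * X[r][b] for r in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[r][a] * y[r] for r in range(n)) for a in range(k)]
    inv = _invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    rss = sum((y[r] - sum(X[r][a] * beta[a] for a in range(k))) ** 2
              for r in range(n))
    se = math.sqrt(rss / (n - k) * inv[-1][-1])
    return math.erfc(abs(beta[-1] / se) / math.sqrt(2))


def forward_select(y, snps, entry_p=1e-4):
    """snps maps a SNP name to its minor-allele counts; returns names in entry order."""
    selected = []
    remaining = dict(snps)
    while remaining:
        base = [snps[name] for name in selected]
        # P-value of each candidate when added to the SNPs already in the model.
        pvals = {name: last_term_p_value(y, base + [g])
                 for name, g in remaining.items()}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= entry_p:
            break
        selected.append(best)
        del remaining[best]
    return selected
```

Because each candidate is tested conditionally on the SNPs already in the model, a SNP with a weak univariate signal can still enter once a stronger SNP has absorbed part of the residual variance, which is the behaviour reported above for rs281437.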
The 3 SNPs at 19p13.2 (ICAM1) collectively explained 6.9% of the total variance in sICAM-1 concentrations (pooling WGHS-1 and WGHS-2 together), whereas the ABO SNP rs507666 explained 1.5%. In comparison, clinical covariates accounted for 18.8% of the variance (Table 3), and together the candidate loci and the clinical variables accounted for 27.3% of total variance. It should be noted that the estimated effect sizes of the ICAM1 and ABO loci are minimums since the genotyped variants might not be the actual functional variants.

10.1371/journal.pgen.1000118.t002 Table 2 Multiple Linear Regression Statistics of SNPs Retained by the Forward Model Selection Algorithm.
Locus | SNP | Nearest Gene | Function | MAF (a) | HW (b) | WGHS-1 (c) Beta | WGHS-1 (c) P-Value | WGHS-2 (c) Beta | WGHS-2 (c) P-Value | Combined (c) Beta | Combined (c) P-Value
19p13.2 | rs1799969 | ICAM1 | NS (d) | 0.12 | 0.34 | −41.2 | <2.0E-16 | −49.3 | <2.0E-16 | −43.5 | <2.0E-16
19p13.2 | rs5498 | ICAM1 | NS (d) | 0.43 | 0.15 | 30.4 | <2.0E-16 | 28.3 | <2.0E-16 | 29.7 | <2.0E-16
19p13.2 | rs281437 | ICAM1 | 3′ UTR (e) | 0.30 | 0.47 | 11.2 | 1.3E-08 | 7.9 | 5.1E-03 | 10.1 | 3.2E-10
9q34.2 | rs507666 | ABO | intron | 0.20 | 0.42 | −16.8 | <2.0E-16 | −16.8 | 1.9E-10 | −16.7 | <2.0E-16
(a) MAF: Minor allele frequency based on the combined samples.
(b) HW: Deviation from Hardy-Weinberg equilibrium P-value based on the combined samples.
(c) All analyses were performed using adjusted sICAM1 values (see text for details). Beta coefficients and P-values were derived from a multiple linear model that included all 4 SNPs.
(d) NS: Non-Synonymous Coding SNP.
(e) 3′ UTR: 3′ Untranslated Region.

10.1371/journal.pgen.1000118.t003 Table 3 Partition of sICAM-1 Variance According to Genetic and Clinical Variables. 
Category | Variable | Variable R2 | Category R2
Clinical Covariates | Age | 0.012 | 0.188
Clinical Covariates | Body Mass Index | 0.035 |
Clinical Covariates | Menopause Status | 0.006 |
Clinical Covariates | Smoking | 0.135 |
9q34.2 (ABO) Locus | rs507666 | 0.015 | 0.015
19p13.2 (ICAM1) Locus | rs1799969 | 0.026 | 0.069
19p13.2 (ICAM1) Locus | rs5498 | 0.039 |
19p13.2 (ICAM1) Locus | rs281437 | 0.005 |
TOTAL | | | 0.273

The 3 SNPs at the 19p13.2 (ICAM1) locus selected by our algorithm were also used in haplotype analysis using WHAP [21], as implemented in PLINK [17] (Table 4). The estimate of the proportion of variance attributable to haplotypes, as well as their regression coefficients, is consistent with the linear model of these same SNPs, reinforcing the adequacy of a strictly additive model to explain the association.

10.1371/journal.pgen.1000118.t004 Table 4 Haplotype Analysis of rs1799969, rs5498 and rs281437 (19p13.2; ICAM1 Locus).
rs1799969 | rs5498 | rs281437 | Frequency | Beta | P-Value
A | G | G | 0.12 | −43.91 | 1.7E-95
G | A | G | 0.27 | −30.16 | 3.0E-78
G | A | A | 0.30 | −19.98 | 5.3E-37
G | G | G | 0.32 | Reference | -
Omnibus (3 df) p-value = 1.0E-124. WGHS-1 and WGHS-2 were combined for this analysis.

The ABO histo-blood group antigen is the most important blood group system in transfusion medicine. Using data from Seattle SNPs (http://pga.mbt.washington.edu) as well as from the Blood Group Antigen Mutation Database (www.ncbi.nlm.nih.gov), it can be demonstrated that rs507666 is a perfect surrogate for type A1 histo-blood group antigen. Moreover, using rs687289 as a marker for the O allele, rs8176746 for the B allele and rs8176704 for the A2 allele, complete blood group antigen phenotype can be re-constructed by haplotype analysis (no serotype data is available in WGHS). Imputed haplotypes perfectly fitted the pattern expected from the literature and their association with sICAM-1 is shown in Tables 5 and 6. The A1 allele is associated with the lowest sICAM-1 concentrations while the A2 allele is associated with low concentrations, intermediate between the A1 and O allele. 
In comparison, the B allele is associated with slightly higher concentrations than the O allele.

10.1371/journal.pgen.1000118.t005 Table 5 Association of sICAM-1 Concentrations (µmol/L) with Histo-Blood Group Antigen Alleles (9q34.2; ABO Locus).
Allele | rs8176746 | rs8176704 | rs687289 | rs507666 | Frequency | Beta
A1 | C | G | A | A | 0.20 | −14.3
A2 | C | A | A | G | 0.07 | −2.2
B | A | G | A | G | 0.08 | 4.7
O | C | G | G | G | 0.66 | 3.9
Omnibus (3 df) p-value = 2.0E-28. WGHS-1 and WGHS-2 were combined for this analysis.

10.1371/journal.pgen.1000118.t006 Table 6 Mean (SD) Level of sICAM-1 (µmol/L) According to Predicted ABO Alleles.
Second Allele | First Allele A1 | First Allele A2 | First Allele O | First Allele B
A1 | 326 (71) N = 268 | 332 (71) N = 185 | 338 (77) N = 1656 | 352 (89) N = 182
A2 | - | 346 (55) N = 33 | 352 (80) N = 537 | 366 (91) N = 56
O | - | - | 358 (81) N = 2859 | 356 (80) N = 592
B | - | - | - | 342 (79) N = 51
WGHS-1 and WGHS-2 were combined for this analysis. Mean (SD) values were derived from unadjusted sICAM-1 concentrations.

Because ABO histo-blood group antigens are known to vary in frequency among Caucasian sub-populations, we sought to investigate the potential effect of population stratification on the observed association even though adjustment of sICAM-1 values for the top ten components of our principal component analysis did not change our conclusions (see Methods). Visual inspection of the clustering pattern from the top two components confirmed a match with previously published work of sub-Caucasian stratification [22] (data not shown). Since these two components were reproducibly shown to correspond to a Northwest-Southeast European gradient [22] and the A1 allele follows such a gradient [23], we hypothesized that they would be tightly linked to A1 allele frequencies. Indeed, the second component showed evidence of association with A1 allelic frequencies (p = 2.5×10−6), while the first component was only weakly associated (p = 0.08). 
Nevertheless, neither the first nor second component was very tightly linked to sICAM-1 values (p = 0.69 and 0.0006 respectively with corresponding R2 of 3.8×10−5 and 0.0019), implying that stratification has no major effect on the sICAM-1 association. Furthermore, the weak association with the second component could be partially explained by the correlation with A1 alleles, with corrected P-value of 0.004 and R2 of 0.0013. Adjustment of sICAM-1 values for the first and second components did not substantially change the association between the A1 allele and sICAM-1 (unadjusted p = 5.1×10−29 and adjusted p = 5.5×10−28), demonstrating that stratification on a Northwest-Southeast European axis is not responsible for the association. We conclude that the data does not support the hypothesis that Northwest-Southeast sub-Caucasian stratification is responsible for the association of ABO variants with sICAM-1 concentrations since the A1 allele varies in frequency according to a Northwest-Southeast European axis while the slight variation in sICAM-1 among this same axis is at least partially dependent on the A1 allele. Indeed, there is no evidence in the literature that mean sICAM-1 concentrations vary at all among Caucasian sub-populations, and this lack of evidence is supported by an overall R2 of 0.005 (P-value of 0.0007) for the association between sICAM-1 concentrations and the top 10 principal components. The Secretor phenotype (as defined by rs601338 on chromosome 19q13.33) and the Lewis antigen phenotype (as defined by rs812936 on chromosome 19p13.3) are additional important members of the histo-blood group antigen system. These were therefore tested for association with sICAM-1 levels as well as for interaction with rs507666. No significant effect was observed. Although the sICAM-1 molecule itself is not known to bear the ABO histo-blood group antigen, this possibility could not be ruled out, especially given its extensive glycosylation [24],[25]. 
We therefore sought to exclude the remote chance that the association between A histo-blood group antigen and lower sICAM-1 values was the consequence of a lower affinity of the antibodies used in the sICAM-1 assay for sICAM-1 carrying the A antigen. In other words, if sICAM-1 does carry ABO histo-blood group antigen, then the allelic composition at the ABO locus could dictate the glycosylation status of the sICAM-1 molecule and possibly interfere with the immunoassay used. While there is no evidence that the two plasma proteins known to contain ABO histo-blood group antigen (von Willebrand factor and alpha 2-macroglobulin) [26] suffer from such analytical interference, immunoassays are potentially susceptible to differential glycosylation of their target protein [27]. We thus hypothesized that blocking the A antigen sites with either polyclonal or monoclonal antibodies would result in spuriously low sICAM-1 values if sICAM-1 does indeed carry ABO histo-blood group antigen and if the A antigen is located in the vicinity of one of the two antibody binding sites used by the immunoassay. No differential effects of the mixing procedures (see Methods) were observed, suggesting that the A blood group antigen was not interfering with measurement of sICAM-1 levels. We therefore conclude that the genetic association of the ABO variant is not due to analytic interference. However, we cannot exclude that sICAM-1 bears the ABO histo-blood group antigen. Finally, we sought to assess the presence of other associations that did not pass our stringent genome-wide P-value cut-off. We therefore repeated the whole-genome association analysis on the combined sample (i.e. WGHS-1 and WGHS-2 pooled together). While no new locus was associated at a genome-wide level, rs9889486 had the lowest p-value (outside of 9q34.2 and 19p13.2; p = 3.2×10−6) with a false discovery rate [28] of 0.03. 
This SNP is intronic to CCDC46 (GeneID 201134) (17q24.1), a gene whose function is not well characterized. Among other low p-value SNPs, we note rs1049728 (p = 1.3×10−5) with a false discovery rate of 0.08 and the 51st most strongly associated SNP overall. This SNP is located in the 3′ untranslated region of RELA (GeneID 5970) (11q13.1), which is part of the NFKB signaling complex, arguably the most important known regulator of ICAM1 expression [29]. The non-synonymous coding ICAM1 SNPs rs1799969 (G241R) and rs5498 (K469E) were previously described as being associated with sICAM-1 levels [10],[11] whereas the association involving rs281437 is unreported. The latter SNP is in the 3′ untranslated region of ICAM-1. Of interest, the minor allele of rs1799969 (arginine) is correlated with lower sICAM-1 and has been associated with lower risk of type I diabetes [30], while the minor allele of rs5498 (glutamic acid) is correlated with higher sICAM-1 levels and has been associated with lower risk of asthma [11] (MIM 600807), inflammatory bowel disease [31] (MIM 266600) and type I diabetes [32] (MIM 222100). Furthermore, it has been demonstrated in vitro that this SNP affects ICAM-1 mRNA splicing pattern and apoptosis in human peripheral blood mononuclear cells [33]. It is also noteworthy that sICAM-1 has been shown to inhibit insulitis and onset of autoimmune diabetes in a mouse model of type I diabetes [34] whereas ICAM1 itself was proven to be crucial to the priming of T cells against beta cells [35]. The most striking result of this report is the association between sICAM-1 levels and rs507666, a SNP intronic to the ABO gene. The ABO gene encodes glycosyltransferase enzymes which transfer specific sugar residues to a precursor substance, the H antigen. There are three major alleles at the ABO locus: A, B and O. Variation at the ABO locus is remarkable in that these alleles encode enzymes with different specificities as well as activities. 
The A allele encodes the enzyme alpha1→3 N-acetylgalactosamyl-transferase which forms the A antigen from the H antigen. The A allele (as well as the B and O alleles) is itself heterogeneous and comprises several subgroups, of which A1 and A2 are the most important. As compared to A1, the A2 allele has 30–50 fold less A transferase activity [36]. The B allele encodes the enzyme alpha1→3 galactosyltransferase which forms the B antigen from the H antigen. The O allele does not produce an active enzyme [37]. Consistent with the A antigen being associated with lower sICAM-1 concentrations and with the A1 allele having 30–50 fold more A transferase activity than the A2 allele, the A1 allele is associated with the lowest sICAM-1 concentrations while the A2 allele is associated with low concentrations as well, but still higher than the A1 allele (Table 5). Although we excluded the possibility of an analytical interference to explain the association, the exact mechanism linking histo-blood group antigen to sICAM-1 concentrations remains elusive. Among the different hypotheses, it remains possible that sICAM-1 bears the A antigen, a modification that might increase its clearance by increasing its affinity for its receptor(s) and/or decrease its secretion, perhaps by decreasing its affinity for the protease(s) producing sICAM-1 from membrane-bound ICAM1. Alternatively, lower sICAM-1 concentrations might be the result of the presence of the A antigen on its receptor(s) and/or protease(s). ABO histo-blood group phenotype has been linked to a plethora of diseases, including infectious diseases, cancers and vascular diseases [38]. Particularly interesting is the association of non-O histo-blood groups — and group A in particular [39],[40] — with a higher risk of myocardial infarction, peripheral vascular disease, strokes and venous thromboembolism [41] (MIM 188050). 
While this phenomenon is partially explained by higher concentrations of the coagulation factors von Willebrand factor and factor VIII (presumably because of decreased clearance) [42], the exact mechanism is not entirely understood. Underlining the complex nature of the biological processes involved, the A1 group (rs507666) is associated with lower levels of sICAM-1, a (positive) predictor of vascular diseases in epidemiological studies [5], [6], [43]–[46]. Among potential explanations as to this apparent disparity, it is possible that decreased sICAM-1 leads to increased adhesion of leukocytes to the endothelial surface and therefore increased vascular inflammation, an important component of atherosclerosis [47]. Moreover, because group A individuals have been shown to have higher blood cholesterol [48] and coagulability [42], the decrease in sICAM-1 seen in these individuals could be offset by the increased susceptibility to vascular diseases conferred by these risk factors, even if sICAM-1 mechanistically causes these diseases. Alternatively, sICAM-1 might merely be a marker of increased inflammation and coagulation [49], both risk factors for vascular diseases. Also of special interest, group A antigen carriers have been recognized as having a higher risk of suffering from severe malaria when infected by Plasmodium falciparum [50]. Plasmodium-infected erythrocytes express a receptor (PfEMP-1) that binds specifically to cell-surface group A and B antigen as well as ICAM-1 [51], a major step in the sequestration of infected erythrocytes leading to the clinical complications of severe and cerebral malaria. The lower concentrations of sICAM-1 found in A1 group carriers could therefore be hypothesized to contribute to this higher risk either directly, if sICAM-1 can inhibit the sequestration process, or indirectly, if sICAM-1 levels reflect differences in the processing of the ICAM1 receptor itself. Several limitations warrant discussion. 
First, this study was conducted in Caucasian women. It is therefore difficult to generalize our results to other ethnicities or to men. Second, effect estimates derived from this study might be higher than in other populations as these are initial findings and because of the winner's curse [52]. Third, although we were able to rule out a technical artifact as the cause of our results, no mechanistic link is identified to explain the association between ABO histo-blood groups and sICAM-1. In particular, one pending question is whether or not ICAM-1 bears any ABO antigen at all. In this report, we demonstrate that sICAM-1 concentrations are associated with genetic variation at the ABO and ICAM1 loci in women. To our knowledge, this represents the first published genetic evidence that ABO may have a regulatory role on an inflammatory mediator, a finding with potential implications for a diverse array of immune-mediated disorders. Especially interesting is the fact that both ABO and ICAM1 have been previously related to vascular disease and malaria, two major causes of mortality and morbidity worldwide. The current study indicates a genetic link between histo-blood group antigen and inflammatory adhesion processes, providing the basis for physiological studies of this interaction. Supporting Information Table S1 Clinical Characteristics of the Samples Used. (0.04 MB DOC) Table S2 Major and Minor Alleles. (0.04 MB DOC)

                Author and article information

                Journal
                Int J Vasc Med
                IJVM
                International Journal of Vascular Medicine
                Hindawi Publishing Corporation
                2090-2824
                2090-2832
                2012
                22 October 2012
                : 2012
                : 641917
                Affiliations
                1Cardiovascular Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104-6160, USA
                2Institute for Translational Medicine and Therapeutics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104-5158, USA
                Author notes

                Academic Editor: Masaki Mogi

                Article
                10.1155/2012/641917
                3485501
                23133757
                253de7eb-8bf7-471c-bd18-9ca8e2247e4c
                Copyright © 2012 Hanrui Zhang et al.

                This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 20 June 2012
                : 25 August 2012
                : 1 September 2012
                Categories
                Review Article

                Cardiovascular Medicine
