      New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk

      research-article
      DIAGRAM Consortium, GIANT Consortium, Global BPgen Consortium, for the MAGIC investigators
      Nature genetics


          Abstract

          Circulating glucose levels are tightly regulated. To identify novel glycemic loci, we performed meta-analyses of 21 genome-wide association studies informative for fasting glucose (FG), fasting insulin (FI) and indices of β-cell function (HOMA-B) and insulin resistance (HOMA-IR) in up to 46,186 non-diabetic participants. Follow-up of 25 loci in up to 76,558 additional subjects identified 16 loci associated with FG/HOMA-B and two associated with FI/HOMA-IR. These include nine new FG loci (in or near ADCY5, MADD, ADRA2A, CRY2, FADS1, GLIS3, SLC2A2, PROX1 and FAM148B) and one influencing FI/HOMA-IR (near IGF1). We also demonstrated association of ADCY5, PROX1, GCK, GCKR and DGKB/TMEM195 with type 2 diabetes (T2D). Within these loci, likely biological candidate genes influence signal transduction, cell proliferation, development, glucose-sensing and circadian regulation. Our results demonstrate that genetic studies of glycemic traits can identify T2D risk loci, as well as loci that elevate FG modestly but do not cause overt diabetes.

          Most cited references (44)

          Six new loci associated with body mass index highlight a neuronal influence on body weight regulation.

          Common variants at only two loci, FTO and MC4R, have been reproducibly associated with body mass index (BMI) in humans. To identify additional loci, we conducted meta-analysis of 15 genome-wide association studies for BMI (n > 32,000) and followed up top signals in 14 additional cohorts (n > 59,000). We strongly confirm FTO and MC4R and identify six additional loci (P < 5 × 10^−8): TMEM18, KCTD15, GNPDA2, SH2B1, MTCH2 and NEGR1 (where a 45-kb deletion polymorphism is a candidate causal variant). Several of the likely causal genes are highly expressed or known to act in the central nervous system (CNS), emphasizing, as in rare monogenic forms of obesity, the role of the CNS in predisposition to obesity.

            R: a language and environment for statistical computing


              Mapping the Genetic Architecture of Gene Expression in Human Liver

Introduction

Recent large-scale, genome-wide association studies have now delivered a number of novel findings across a diversity of diseases, including age-related macular degeneration [1–3], heart disease [4,5], host control of HIV-1 [6], type I and II diabetes [7,8], and obesity [9]. However, despite this astonishing rate of success, the major challenge remains not only to confirm that the genes implicated in these studies are truly the genes conferring protection from or risk of disease, but also to elucidate the functional roles that these implicated genes play with respect to disease. Most of the genetic association studies reporting novel, highly replicated associations to disease traits do not provide experimental data supporting the putative functional roles a given candidate susceptibility gene may play in disease onset or progression. Even in cases where susceptibility genes are well studied, with well known functions, establishing how these genes confer disease susceptibility can take years, or even decades, as has been the case for genes like ApoE, an Alzheimer disease susceptibility gene identified more than 15 years ago [10]. Complex networks of molecular phenotypes, namely gene expression (mRNA, ncRNA, miRNA, and so on), protein expression, protein state, and metabolite levels, respond more proximally to DNA variations that lead to variations in disease-associated traits. These intermediate phenotypes respond to variations in DNA that in turn can induce changes in disease-associated traits. Because a majority of single nucleotide polymorphisms (SNPs) detected as associated with disease traits from the recent wave of genome-wide association studies (GWASs) do not appear to affect protein sequence, it is likely that these SNPs either regulate gene activity at the transcript level directly or link to other DNA variations involved in this type of regulatory role.
Therefore, to uncover the genetic determinants affecting expression in a metabolically active tissue that is relevant to the study of obesity, diabetes, atherosclerosis, and other common human diseases, we profiled 427 human liver samples on a comprehensive gene expression microarray targeting more than 39,000 transcripts, and we genotyped DNA from each of these samples at 782,476 unique SNPs. The relatively large sample size of this study and the large number of SNPs genotyped provided the means to assess the relationship between genetic variants and gene expression with more statistical power than many previous studies allowed [11–13]. A comprehensive analysis of the liver gene expression traits revealed that thousands of these traits are under the control of well-defined genetic loci, with many of the genes having already been implicated in a number of human diseases. Here we demonstrate directly how integrating genotypic and expression data in mouse and human can provide much-needed functional support for candidate susceptibility genes identified in a growing number of genetic loci that have been identified as key drivers of disease from GWASs. Specifically, we highlight how the gene RPS26 and not ERBB3 is most strongly supported by our data as a susceptibility gene for a novel type 1 diabetes (T1D) locus that was recently identified in a large-scale GWAS [14] and subsequently extensively replicated in a number of cohorts [15]. We also identify SORT1 and CELSR2 as candidate susceptibility genes for a locus recently associated with coronary artery disease [16] and plasma low-density lipoprotein (LDL)-cholesterol levels [17,18].

Results

To characterize the genetic architecture of gene expression in human liver, we compiled a tissue-specific human liver cohort (HLC), which comprised 427 Caucasian subjects (Table S1). DNA and RNA were isolated from all liver tissue samples.
Each RNA sample was profiled on a custom Agilent 44,000 feature microarray composed of 39,280 oligonucleotide probes targeting transcripts representing 34,266 known and predicted genes, including high-confidence, noncoding RNA sequences. Each DNA sample was genotyped on the Affymetrix 500K SNP and Illumina 650Y SNP genotyping arrays. Analysis was restricted to those SNPs that had a genotyping call rate greater than 75%, a minor allele frequency greater than 4%, and that did not deviate significantly from Hardy-Weinberg equilibrium in the HLC. A total of 310,744 and 557,240 SNPs met these criteria from the Affymetrix and Illumina sets, respectively, resulting in a set of 782,476 unique SNPs (85,508 SNPs were in the intersection), referred to here as the analysis SNP set.

Genome-Wide Screen for Putative cis- and trans-Acting Expression Quantitative Trait Loci

To identify expression quantitative trait loci (eQTL) that have putative cis and trans [19] regulatory effects on the liver gene expression traits, we tested all expression traits for association with each of the SNPs in the analysis SNP set typed in the HLC. The strongest putative cis eQTL for a given expression trait was defined as the SNP most strongly associated with the expression trait over all of the SNPs typed within 1 megabase (Mb) of the transcription start or stop of the corresponding structural gene. The association p-values were adjusted to control for testing of multiple SNPs and expression traits using two different methods: (1) a highly conservative Bonferroni correction method to constrain the study-wise significance level, and (2) an empirical false discovery rate (FDR) method that constrains the overall rate of false positive events. For cis eQTL, we only test for associations to SNPs that are within 1 Mb of the annotated start or stop site of the corresponding structural gene.
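The size of the analysis SNP set, the 1 Mb cis window, and the Bonferroni idea described above can be illustrated with a short sketch. The SNP counts come from the text. The function names, gene coordinates, SNP positions, and per-trait test counts are hypothetical and are used only for illustration. The Bonferroni helper assumes the standard form of the correction, in which the significance level is divided by the total number of tests.

```python
# Sketch only: counts come from the text; all names and coordinates are illustrative.

CIS_WINDOW_BP = 1_000_000  # 1 Mb on either side of the gene, as stated in the text


def union_size(n_a: int, n_b: int, n_shared: int) -> int:
    """Number of unique SNPs across two arrays (inclusion-exclusion)."""
    return n_a + n_b - n_shared


def cis_snps(snp_positions, gene_start, gene_end, window=CIS_WINDOW_BP):
    """Return SNP positions within `window` bp of the gene start or stop."""
    lo, hi = gene_start - window, gene_end + window
    return [pos for pos in snp_positions if lo <= pos <= hi]


def bonferroni_threshold(alpha, tests_per_trait):
    """Assumed standard Bonferroni form: alpha divided by the total number of tests."""
    return alpha / sum(tests_per_trait)


# Affymetrix and Illumina SNPs passing QC, and their intersection (from the text).
analysis_snp_count = union_size(310_744, 557_240, 85_508)
print(analysis_snp_count)  # 782476

# Hypothetical gene on one chromosome with a handful of hypothetical SNP positions.
snps = [500_000, 1_900_000, 2_500_000, 3_050_000, 4_200_000]
nearby = cis_snps(snps, gene_start=2_000_000, gene_end=2_100_000)
print(nearby)  # [1900000, 2500000, 3050000]

# Hypothetical numbers of cis SNPs tested for two traits.
toy_threshold = bonferroni_threshold(0.05, [100, 400])
print(toy_threshold)
```

The window is applied symmetrically around the gene, so a SNP counts as cis if it lies between 1 Mb upstream of the start and 1 Mb downstream of the stop.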
To achieve a study-wise significance level of 0.05, the Bonferroni adjusted p-value threshold was computed as $0.05 / \sum_{i} N_i$, where $N_i$ denotes the number of SNPs tested for trait $i$ and the sum runs over all 39,280 expression traits tested. At this threshold, 1,350 expression traits corresponding to 1,273 genes were identified. The Bonferroni adjustment method can be conservative when there is dependence among the expression traits and among the SNP genotypes. Given that strong correlation structures exist among expression traits and among SNP genotypes in a given linkage disequilibrium (LD) block, the Bonferroni adjustment may be overly conservative. Therefore, we used an empirical FDR method based on permutations that accounts for the correlation structures among the expression traits and among the SNP genotypes. We constrained the empirically determined FDR to be less than 10% (see Methods). At this level, we identified 3,210 expression traits corresponding to 3,043 genes that were significantly associated with at least one SNP near the corresponding gene region (referred to here as a putative cis eQTL). The full list of association results is provided in Table S2. The magnitude of the effects ranged from SNPs that explained roughly 2% of the in vivo expression variation (p ∼ 0.003) to those that explained roughly 90% of the expression variation (p 100 kb away. That is, greater than 30% of all cis eSNPs fall greater than 100 kb away from the transcription start and stop sites of the corresponding gene. Therefore, at least for expression traits, the nearest SNP rule for inferring genes given an association finding would result in an unacceptably high miss-call rate. Genes with expression values that are strongly associated with variations in DNA provide a different path to elucidate the gene or genes and their respective functions underlying genetic loci associated with disease in a more objective fashion.

Identifying candidate susceptibility genes for T1D.
In one of the largest GWASs carried out to date, the Wellcome Trust Case Control Consortium (WTCCC) studied 14,000 cases and 3,000 shared controls with respect to seven common diseases [14]. T1D was one of the key disease focuses of this study, with a number of replications reported simultaneously in a separate follow-up study [15]. In addition, a number of T1D susceptibility genes identified prior to the WTCCC study have been identified and more thoroughly replicated, including the HLA class II genes, INS, CD25, CTLA4, PTPN22, and IFIH1. Given that the SNPs genotyped in the WTCCC study were also genotyped in the HLC, we examined the extent to which the T1D SNPs identified in the WTCCC study were associated with the expression traits corresponding to the genes implicated in the study. Table 1 highlights nine genes previously identified as T1D susceptibility genes or inferred as T1D susceptibility genes from the WTCCC study. The expression levels for five of these genes (CTLA4, HLA-DRB1, IL2RA, LONRF2, and CHST10) in the HLC were associated with the corresponding T1D-associated SNP. For IL2RA and the four other genes (AFF3, ADAD1, PTPN2, and IL2), the expression levels were associated with other SNPs in the region of the T1D-associated SNPs. We also examined whether other genes in the vicinity of the T1D-associated SNPs had expression levels that were also associated with these SNPs. An additional four genes highlighted in Table 1 (RPS26, CLECL1, IGF2AS, and Hct1837134) were identified in this way, in addition to two HLA class II genes highlighted in Table 2 (HLA-DQB1 and HLA-DQA2). Given the role that HLA class II genes are known to play in T1D, we also examined the 14 HLA class II gene expression traits represented on the array used in this study, and we found that 11 of them gave rise to significant genetic associations (Table 2). Some of the associations were striking and highlight additional SNPs that may be of interest in genetic disease association studies.
For example, greater than 50% of the HLA-DRB5 expression variation observed in the HLC could be explained by a single cis eSNP (rs9271366).

Table 1. Expression Traits Corresponding to Genes Implicated in the T1D WTCCC Study [14,15] or Close to Genes Associated with Either SNPs That Were Associated with T1D in the WTCCC Study or with SNPs Close to the T1D-Associated SNPs

Table 2. Significant Associations Detected in the HLC for 11 of the 14 HLA Class II Gene Expression Traits Represented on the Microarray Used in This Study

The absence of an association between a T1D-associated SNP and the HLC expression values corresponding to a candidate susceptibility gene for that SNP cannot be taken as strong evidence against the gene's candidacy as a susceptibility gene. The underlying causal change in DNA may not affect expression levels of the gene in question, or the variation in expression may be specific to a given tissue not profiled or to conditions not reflected in the HLC. However, strong associations between T1D-associated SNPs and expression levels of genes near the SNP provide direct functional support for a gene's involvement in disease susceptibility. For example, rs3764021 was identified as a T1D susceptibility locus in the WTCCC study and then extensively replicated [15]. CLEC2D was inferred as the most likely susceptibility gene at this locus. However, CLEC2D expression in the HLC data was not associated with this SNP, whereas the expression of a flanking gene, CLECL1, was significantly associated (p = 5.78 × 10^−17; Table 1). Given that CLEC2D and CLECL1 are in the same gene family, the strong association between the T1D SNP and CLECL1 expression suggests that CLECL1 may be a better candidate susceptibility gene to examine. In cases where disease-associated traits and expression traits are scored in the same cohort, there is the potential to directly infer causal relationships between genes and disease [25].
However, even without disease trait data in tissue-specific cohorts like the HLC, an integrative genomics approach can be used to identify the most likely candidate susceptibility gene for a given locus. For example, one of the more novel regions associated with T1D from the WTCCC study was Chromosome 12q13 (rs2292239). ERBB3, a receptor tyrosine-protein kinase with a presumed role in immune signaling, was identified as the most plausible susceptibility gene at this locus. While ERBB3 expression in the HLC was not associated with this SNP, the expression of a flanking gene, RPS26, was significantly associated with this SNP (p = 4.03 × 10^−22; Table 1). In fact, 40% of the in vivo expression variation for RPS26 in the HLC was explained by this single T1D-associated SNP, and this SNP was the most strongly associated with RPS26 expression out of the greater than 800,000 SNPs genotyped in the HLC. The association to RPS26 expression suggests that this gene warrants further study in the context of T1D. However, these data on their own are still far from conclusive, given there may be DNA variants that affect RPS26 expression independently of T1D, but where these variants are in strong LD with the DNA variants explaining the T1D susceptibility. Therefore, to further explore the role RPS26 and ERBB3 may play in T1D, we examined the expression data for these genes in an expression atlas for human, monkey, and mouse, where for each species, between 45 and 60 tissue samples were profiled [26,27]. Although both genes are expressed in mouse, monkey, and human tissues, the expression of RPS26 is >1–2 log units higher in the pancreas and islets of Langerhans compared to ERBB3 (Figure S1), with ERBB3 observed as lowly expressed in islets as measured in the mouse body atlas. Given the central role that pancreas and islets play in T1D, these results further suggest RPS26 as a candidate susceptibility gene for T1D.
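The statement that a single SNP explains a given fraction of a gene's expression variation can be read as the R² of a regression of expression on genotype. The following is a minimal sketch under that reading, assuming an additive genotype coding (0, 1, or 2 copies of an allele) and simple linear regression. The association model actually used in the study is not specified in this section, and the toy data and function name are made up for illustration.

```python
from statistics import mean


def variance_explained(genotypes, expression):
    """R^2 from a simple linear regression of expression on genotype dosage."""
    g_bar, e_bar = mean(genotypes), mean(expression)
    s_ge = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    s_gg = sum((g - g_bar) ** 2 for g in genotypes)
    s_ee = sum((e - e_bar) ** 2 for e in expression)
    if s_gg == 0 or s_ee == 0:
        raise ValueError("genotype and expression must both vary")
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (s_ge ** 2) / (s_gg * s_ee)


# Hypothetical data: allele dosage per sample and a log-scale expression value.
dosage = [0, 0, 1, 1, 1, 2, 2, 2]
expr = [1.0, 1.4, 1.1, 1.9, 1.6, 1.5, 2.3, 2.0]
r2 = variance_explained(dosage, expr)
print(f"fraction of expression variance explained: {r2:.2f}")
```

A value of 0.40 from such a calculation would correspond to the "40% of expression variation" wording; a high R² shows association strength only and says nothing by itself about which nearby variant is causal.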
What the genetic association and atlas data lack is a more refined context within which to assess the functional role a given gene plays in a system. We have previously described a method to reconstruct probabilistic, causal networks by integrating genetic and gene expression data [25,28–30]. Examining candidate susceptibility genes in the context of these networks can provide insights into the pathways in which they operate. We constructed whole-gene networks from three F2 intercross populations constructed from the B6, C3H, and CAST strains (see Methods for details). Liver and adipose expression data were generated from these populations and integrated with the genotypic data also generated in these populations to reconstruct the networks as previously described [28,30]. We then examined RPS26 and ERBB3 in the context of these networks (Figure 1).

Figure 1. Local Networks for Rps26 and Erbb3 Derived from Causal, Probabilistic Whole-Gene Networks Constructed from the Liver, Adipose, Muscle, and Brain Gene Expression Data Generated from the BXH/wt and BXC Mouse Crosses
(A) The Rps26 subnetwork includes a number of known T1D associated genes (green nodes), and RPS26 in this subnetwork is directly linked to H2-Eb1, a mouse ortholog of HLA-DRB1, a previously identified T1D susceptibility gene that is also strongly associated with a cis eSNP in the HLC (Table 2). The known T1D genes annotated by the Gene Ontology are significantly enriched in this subnetwork (Table 3).
(B) The Erbb3 subnetwork is not associated with any pathways known or predicted to be involved in T1D.

Figure 1A highlights how RPS26 is directly connected to a number of known T1D genes. For example, RPS26 is directly connected to a mouse ortholog of HLA-DRB1, a gene previously associated with T1D and highlighted in this present study as having liver expression values that are strongly associated with a highly replicated T1D SNP (Table 1).
In fact, the genes comprising the local network structure around RPS26 are enriched for genes annotated as T1D genes, in addition to being enriched for genes operating in a number of pathways commonly associated with T1D (Table 3). On the other hand, whereas ERBB3 also resided in the context of a well defined subnetwork (Figure 1B), the genes comprising this subnetwork were not enriched for any T1D associated pathways.

Table 3. GO Biological Process Categories Enriched in the RPS26 Subnetwork Depicted in Figure 1A

Identifying candidate susceptibility genes for coronary artery disease and LDL cholesterol levels.

Another GWAS involving the WTCCC resulted in the identification of seven loci associated with coronary artery disease (CAD) [16]. The seven top-hitting SNPs associated with CAD at each of the seven loci in this study were represented on the Affymetrix 500K array. Therefore, we examined the HLC data to identify expression traits that were significantly associated with any of the seven CAD-associated SNPs. Given the roughly 40,000 expression traits examined at each of the seven SNPs (280,000 tests in all), we set a nominal p-value threshold of 0.05/280,000 = 1.79 × 10^−7 for significance. Only one of the seven SNPs identified in the WTCCC CAD study, rs599839 on Chromosome 1p13.3, was significantly associated with any of the HLC expression traits (Figure 2). Four different expression traits were identified as significantly associated with rs599839 (Table 4). One of the four expression traits corresponded to a gene, PSRC1, that had been identified as a candidate susceptibility gene in the WTCCC CAD study [16].

Figure 2. PSRC1, CELSR2, and SORT1 Liver Expression Is Associated with a CAD Risk Allele and Plasma LDL Cholesterol Levels
The CAD risk allele for SNP rs599839 was established in a previous WTCCC study [16] (lilac panel).
In the HLC, this same SNP is strongly associated with PSRC1, CELSR2, and SORT1 expression, with the CAD risk allele associated with lower relative expression (pink panel). In the BXH/wt cross designed to study metabolic traits that increase cardiovascular risk (green panel), all three of these expression traits were strongly correlated with plasma LDL cholesterol levels, a major CAD risk factor (scatter plots associated with the green panel). Given the association of these genes to plasma LDL-cholesterol levels, we examined whether rs599839 was associated with LDL cholesterol in a previously published GWAS [35] and found this SNP was significantly associated with LDL cholesterol levels, where the CAD risk allele was associated with higher LDL cholesterol levels in this cohort. Lower levels of CELSR2 and SORT1 expression were associated with the risk allele in humans, and with higher LDL cholesterol levels in mouse, making them ideal candidate susceptibility genes for the CAD and LDL cholesterol associations to this locus. On the other hand, lower levels of PSRC1 expression were associated with the risk allele in humans, but with lower LDL cholesterol levels in mouse, suggesting that PSRC1 is not the gene increasing CAD risk, but instead may be acting to protect against it.

Table 4. Significant Associations Detected between Liver Expression Traits in the HLC and the CAD-Associated SNP, rs599839, on Chromosome 1p13.3

To further characterize the association of these four expression traits with CAD-associated traits, we examined the activity of these genes in the BXH/wt cross (see Methods for details), a cross designed specifically to study metabolic traits that increase risk of cardiovascular disease. The liver expression levels of Psrc1, Sort1, and Celsr2, but not Sypl2, in the BXH/wt cross were significantly associated with plasma LDL cholesterol levels (Table 4), a major CAD risk factor.
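The direction-of-effect argument above can be written as a simple sign check. The sketch below only encodes the directions stated in the text (risk allele versus expression in humans, expression versus LDL cholesterol in mouse, risk allele versus LDL cholesterol in humans); the dictionary and function names are hypothetical, and the check is a reading aid, not a statistical test.

```python
# Sketch only: signs are transcribed from the text; names are hypothetical.

# Sign of the risk allele's association with liver expression in humans (-1 = lower).
allele_on_expression = {"SORT1": -1, "CELSR2": -1, "PSRC1": -1}

# Sign of the expression-LDL relationship in the mouse cross:
# -1 means lower expression goes with higher LDL, +1 means lower goes with lower.
expression_vs_ldl = {"SORT1": -1, "CELSR2": -1, "PSRC1": +1}

# The risk allele is associated with higher LDL cholesterol in the human GWAS.
ALLELE_ON_LDL = +1


def is_direction_consistent(gene: str) -> bool:
    """True if allele -> expression -> LDL predicts the observed allele -> LDL sign."""
    predicted = allele_on_expression[gene] * expression_vs_ldl[gene]
    return predicted == ALLELE_ON_LDL


for gene in allele_on_expression:
    print(gene, is_direction_consistent(gene))
```

SORT1 and CELSR2 pass the check, while PSRC1 does not, which mirrors the reasoning that favors the first two genes as candidates at this locus.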
However, while Psrc1 expression levels were positively correlated with plasma LDL cholesterol levels, Sort1 and Celsr2 expression levels were negatively correlated. In addition, for liver expression traits in the BXH/wt cross significantly correlated with these three genes, the Sort1 and Celsr2 correlation signatures were most significantly enriched for the GO Biological Process category “cell surface receptor linked signal transduction” (1.4-fold enrichment, p = 1.91 × 10^−5, and 1.6-fold enrichment, p = 4.57 × 10^−10, for Sort1 and Celsr2, respectively), while the Psrc1 correlation signature was most enriched for the “cell cycle” category (3-fold enrichment, p = 0.00044), suggesting that Sort1 and Celsr2 may be involved in similar biological processes that are distinct from processes involving Psrc1. To further elucidate the involvement of these genes in metabolic phenotypes associated with CAD, we examined Psrc1, Celsr2, and Sort1 in the context of the probabilistic, causal network constructed as described above for the Erbb3/Rps26 example. All three genes not only fell in the same subnetwork, they were all directly connected to the same gene, 2010200O16Rik, demonstrating that these genes are tightly co-regulated, possibly driven by common regulatory factors (Figure 3A). This same subnetwork also included genes like Tgfbr2, Pparg, Lpl, Ppm1l, and Alox5ap, all of which have been previously identified and validated as being associated with traits related to obesity, diabetes, cholesterol levels, and cardiovascular disease [25,31–33]. More generally, Psrc1 and Sort1 participate in a previously defined macrophage-enriched metabolic (MEM) subnetwork validated as causal for obesity-, diabetes-, and atherosclerosis-related traits [34]. In fact, the subnetwork depicted in Figure 3A is composed of 1,346 genes, with 226 of these genes overlapping the set of 1,406 genes composing the MEM subnetwork (82 would have been expected by chance).
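The overlap figures above (226 shared genes against 82 expected by chance) can be turned into a fold enrichment and a one-sided hypergeometric tail probability, which is the quantity a one-sided Fisher exact test computes for a 2×2 overlap table. This is a sketch under a loud assumption: the background gene count is not stated in this section, so it is back-calculated here from the rounded expected overlap. The resulting p-value is therefore illustrative and need not match any published figure. Function names are hypothetical.

```python
from math import exp, lgamma


def log_choose(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def hypergeom_upper_tail(overlap, set_a, set_b, background):
    """P(X >= overlap) when set_a genes are drawn from a background containing set_b marked genes."""
    log_denom = log_choose(background, set_a)
    total = 0.0
    for x in range(overlap, min(set_a, set_b) + 1):
        log_p = (log_choose(set_b, x)
                 + log_choose(background - set_b, set_a - x)
                 - log_denom)
        total += exp(log_p)
    return total


# Values from the text.
SUBNETWORK_GENES = 1346
MEM_GENES = 1406
OBSERVED_OVERLAP = 226
EXPECTED_OVERLAP = 82

fold = OBSERVED_OVERLAP / EXPECTED_OVERLAP
print(f"fold enrichment: {fold:.2f}")  # 2.76

# ASSUMPTION: background size inferred from expected = set_a * set_b / background.
background = round(SUBNETWORK_GENES * MEM_GENES / EXPECTED_OVERLAP)
p_value = hypergeom_upper_tail(OBSERVED_OVERLAP, SUBNETWORK_GENES, MEM_GENES, background)
print(f"assumed background: {background}, upper-tail p: {p_value:.3g}")
```

Working in log space with `lgamma` avoids overflow in the binomial coefficients, which are far too large to hold as floating-point numbers at these set sizes.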
The 2.76-fold enrichment of MEM genes in this subnetwork is highly significant, with a Fisher exact test p = 8.20 × 10^−47.

Figure 3. Local Networks for PSRC1, CELSR2, and SORT1 Derived from Causal, Probabilistic Whole-Gene Networks in Mouse and Human
(A) Mouse network for Psrc1, Celsr2, and Sort1 derived from the liver, adipose, muscle, and brain gene expression data generated from the BXH/wt and BXC mouse crosses.
(B) Human network for PSRC1, CELSR2, and SORT1 derived from the HLC and from a previously published adipose and blood tissue cohort [21].

To establish whether PSRC1, CELSR2, and SORT1 are closely connected in human transcriptional networks as they are in mouse, we constructed a probabilistic, causal network from the HLC and from a previously published adipose and blood tissue cohort [21], using previously described methods [25,28–30]. As depicted in Figure 3B, PSRC1, CELSR2, and SORT1 fall in the same subnetwork and are closely connected, as in the mouse network. In addition, the genes comprising this human subnetwork are enriched for genes that fall in the mouse network depicted in Figure 3A (Fisher exact test p = 1.78 × 10^−8). Further, the human subnetwork is also enriched for genes falling in the MEM module (Fisher exact test p = 5.03 × 10^−8), confirming the association to metabolic phenotypes detected in the mouse network. These data combined suggest that PSRC1, CELSR2, and SORT1 operate in a conserved subnetwork causally associated with cholesterol levels, obesity, diabetes and atherosclerosis. Given the strong association between plasma LDL cholesterol levels and the expression of Psrc1, Sort1, and Celsr2 in the BXH/wt cross, we examined a recent GWAS available in the public domain in which LDL cholesterol levels were monitored [35]. A significant association was detected between rs599839 genotypes and LDL cholesterol levels in this human cohort (p = 9.0 × 10^−8) [35].
Interestingly, the common allele for rs599839 was associated with higher LDL cholesterol levels [35], consistent with the association of this allele with increased CAD risk. Low SORT1, CELSR2, and PSRC1 expression levels in the HLC are also associated with the rs599839 common allele. However, given low Sort1 and Celsr2 expression levels in the BXH/wt cross are associated with increased LDL cholesterol levels (whereas low Psrc1 expression levels are associated with low LDL cholesterol levels), SORT1 and CELSR2 are the most logical candidate susceptibility genes at the 1p13.3 locus (Figure 2), although direct experimental manipulation of these two genes would be required to provide more direct functional support that these genes are involved in modulating LDL cholesterol levels. The association of this locus with LDL cholesterol levels as well as liver expression levels of SORT1, CELSR2, and PSRC1 was recently reported in multiple independent studies [17,18].

Discussion

Previous studies on the genetics of gene expression in humans have focused primarily on lymphoblastoid cell lines or other blood-derived samples [13,14,17]. We have provided a large-scale assessment of the genetics of gene expression in human liver, a metabolically active tissue that is critical to a number of core biological processes and that plays a role in a number of common human diseases. After profiling 427 human liver samples on a comprehensive gene expression microarray and genotyping the DNA from these samples at greater than one million SNPs, we identified a significant genetic signature underlying the expression of more than 6,000 genes, with many of these genes already implicated as causal for a number of different diseases, including heart disease, breast cancer, inflammatory bowel disease, age-related macular degeneration, schizophrenia, and Alzheimer disease.
This set of data highlights the utility of monitoring molecular phenotypes that underlie the higher order clinical states of a system. Whereas the eQTL data in the human liver cohort is valuable in its own right, when integrated with other GWAS data and with genetics of gene expression and clinical data in segregating mouse populations, there is the potential to directly identify experimentally supported candidate susceptibility genes for disease. We demonstrated directly how genetics of gene expression data can complement multiple GWAS datasets by highlighting SORT1 and CELSR2 as candidate susceptibility genes for CAD and LDL cholesterol levels at a recently identified locus associated with CAD [16]. In this instance, the association to LDL cholesterol levels is novel and based on publicly available GWAS data and a mouse cross designed specifically to study lipid and other metabolic syndrome traits. In addition to the CAD locus, we highlighted RPS26 as a candidate susceptibility gene for T1D from a novel, highly replicated T1D locus on Chromosome 12q13, which was identified in a separate GWAS [15]. Not only was the expression of this gene in the HLC strongly associated with the T1D SNP at this locus, but it was observed to operate in a part of the molecular network that is significantly enriched for genes associated with T1D (like HLA-DRB1), whereas the gene inferred as the most likely susceptibility gene at that locus (ERBB3) [15] was not supported by any of our experimental data. Recent studies have demonstrated that ribosomal proteins may be involved in auto-immune diseases like systemic lupus erythematosus [36]. In addition, recent work has demonstrated a connection between endoplasmic reticulum (ER) stress in the cytoplasm and diabetes, where protein unfolding in response to ER stress is hypothesized to disrupt processes associated with diabetes [37]. 
Given RPS26's protein translation role as part of the ribosomal complex on the ER, its association to T1D is particularly intriguing. The unfolded protein response has also been linked to inflammation and oxidative stress [38], hence the putative connection between RPS26 and an auto-immune disease like T1D is worthy of further consideration. Cells with high secretory capacity like pancreatic beta cells are also more likely to be susceptible to ER stress, making the link between RPS26 and T1D even more plausible. In fact, previous work has indicated higher ER stress levels in T1D patients [39]. It is important to note that a lack of association between expression traits in the HLC and disease-associated SNPs is not a valid filter for excluding a gene as a candidate disease susceptibility gene, given that variation in a gene leading to disease may affect protein function and not expression, or it may affect expression in a different tissue or under different environmental conditions. However, the approach of analyzing the genetics of gene expression in human populations does provide a more objective view into the functioning of genes in a given disease-associated region. This view has the potential to lead to higher confidence candidates in the absence of direct functional support for any one gene, which is typically the case in GWASs where the SNPs identified have no known functional role. Given the potential that genetics of gene expression studies have to affect our understanding of common human diseases, generating even larger-scale molecular profiling datasets in segregating populations may provide a path to more rapidly elucidating not only the genetic basis of disease, but the impact the genetic basis of disease has on molecular networks that in turn induce variations in disease-associated traits.

Materials and Methods

HLC and tissue collection. 
The HLC was assembled from a total of 780 liver samples (1–2 g) that were acquired from Caucasian individuals from three independent liver collections at tissue resource centers at Vanderbilt University, the University of Pittsburgh, and Merck Research Laboratories (Table S1). The Vanderbilt samples (n = 504) included both postmortem tissue and surgical resections from organ donors and were obtained from the Nashville Regional Organ Procurement Agency (Nashville, Tennessee), the National Disease Research Interchange (Philadelphia, Pennsylvania), and the Cooperative Human Tissue Network (University of Pennsylvania, Ohio State University, and University of Alabama at Birmingham). The Pittsburgh samples were normal postmortem human liver and were obtained through the Liver Tissue Procurement and Distribution System (Dr. Stephen Strom, University of Pittsburgh, Pittsburgh, Pennsylvania). The University of Pittsburgh samples (n = 211) were all postmortem, as were the Merck samples (n = 65), which were collected by the Drug Metabolism Department and reported previously [40]. All samples were stored frozen at −80 °C from collection until processing for RNA and DNA; some samples had been stored for over a decade before being processed for this study. Demographic data varied across centers for these samples and were missing in many cases. In cases where age, sex, or ethnicity data were not available in the patient records, we imputed them from the gene expression and/or genotype data (see below). Of the 780 samples collected, high-quality DNA was isolated from 548 samples, and 517 of these were successfully genotyped on the Affymetrix genotyping platform (see Methods below). Of the 517 successfully genotyped samples, high-quality RNA was isolated and successfully profiled on 427 samples. This set of 427 genotyped and expression profiled samples comprised the HLC. 
Table S1 gives a summary of the demographics and other annotations on the 427 individuals that were successfully genotyped and expression profiled. All counts and descriptive statistics include the imputed data. All samples and patient data were handled in accordance with the policies and procedures of the participating organizations. Mouse crosses and tissue collection. C57BL/6J (B6) mice were intercrossed with C3H/HeJ (C3H) mice to generate 321 F2 progeny (161 females, 160 males) for the BXH wild type (BXH/wt) cross. C57BL/6J (B6) mice were intercrossed with Castaneus (CAST) mice to generate 442 F2 progeny (276 females, 166 males) for the BXC cross. All mice were maintained on a 12 h light–12 h dark cycle and fed ad libitum. BXH mice were fed Purina Chow (Ralston-Purina) containing 4% fat until 8 wk of age. From that time until the mice were killed at 20 wk, mice were fed a western diet (Teklad 88137, Harlan Teklad) containing 42% fat and 0.15% cholesterol. BXC mice were fed Purina Chow until 10 wk of age, and then fed western diet (Teklad 88137, Harlan Teklad) for the subsequent 8 wk. Mice were fasted overnight before they were killed. Their livers were collected, flash frozen in liquid nitrogen, and stored at −80 °C prior to RNA isolation. The BXH cross on an ApoE null background (BXH/apoE) was previously described [41]. Briefly, C57BL/6J ApoE null (B6.ApoE–/–) mice were purchased from Jackson Laboratory. C3H/HeJ ApoE null (C3H.ApoE–/–) mice were generated by backcrossing B6.ApoE–/– to C3H for ten generations. F1 mice were generated from reciprocal intercrossing between B6.ApoE–/– and C3H.ApoE–/–, and F2 mice were subsequently bred by intercrossing F1 mice. A total of 334 F2 mice (169 female, 165 male) were bred, and all were fed Purina Chow containing 4% fat until 8 wk of age, and then transferred to western diet containing 42% fat and 0.15% cholesterol for 16 wk. 
Mice were killed at 24 wk, and liver, white adipose tissue, and whole brains were immediately collected and flash-frozen in liquid nitrogen. All procedures of housing and treatment of animals were performed in accordance with Institutional Animal Care and Use Committee regulations. Microarray design, RNA sample preparation, hybridization, and expression analysis. Array design and preparation of labeled cDNA and hybridizations to microarrays for the human liver cohort. RNA preparation and array hybridizations were performed at Rosetta Inpharmatics. The custom ink-jet microarrays used in this study were manufactured by Agilent Technologies and consisted of 4,720 control probes and 39,280 noncontrol oligonucleotides extracted from mouse Unigene clusters and combined with RefSeq sequences and RIKEN full-length cDNA clones (Table S4). Liver samples extracted from the 427 Caucasian individuals were homogenized, and total RNA extracted using TRIzol reagent (Invitrogen) according to manufacturer's protocol. Three micrograms of total RNA was reverse transcribed and labeled with either Cy3 or Cy5 fluorochrome. Purified Cy3 or Cy5 complementary RNA was hybridized to at least two single microarrays with fluor reversal for 24 h in a hybridization chamber, washed, and scanned using a laser confocal scanner. Arrays were quantified on the basis of spot intensity relative to background, adjusted for experimental variation between arrays using average intensity over multiple channels, and fitted to an error model to determine significance (type I error), as previously described [42]. Gene expression is reported as the mean-log ratio relative to the pool derived from 192 liver samples selected for sex balance from the Vanderbilt and Pittsburgh samples, because the RNA from the Merck samples had been amplified at an earlier date. 
The error model used to assess whether a given gene is significantly differentially expressed in a single sample relative to a pool composed of a randomly selected subset of samples has been extensively described and tested in a number of publications [42–44]. The age, sex, race, center, alcohol use, drug use, and steatosis variables presented in Table S1 were tested for association to the gene expression traits. Only age, sex, race, and center were significantly associated with the expression traits beyond what would be expected by chance. As a result, all gene expression traits were adjusted for these covariates. The lack of association between the expression traits and alcohol use, drug use, and steatosis was somewhat surprising, but may be due to the sparseness of these data, resulting in a lack of power to detect significant associations. Array design and preparation of labeled cDNA and hybridizations to microarrays for the mouse liver and adipose tissue samples. RNA preparation and array hybridizations were again performed at Rosetta Inpharmatics. The custom ink-jet microarrays used in the BXH/wt, BXH/apoE, and BXC crosses were manufactured by Agilent Technologies. The array used for the BXH/apoE and BXH/wt samples consisted of 2,186 control probes and 23,574 noncontrol oligonucleotides extracted from mouse Unigene clusters and combined with RefSeq sequences and RIKEN full-length cDNA clones (Table S5). The array used for the BXC cross consisted of 39,280 noncontrol oligonucleotides again extracted from the mouse Unigene clusters and combined with RefSeq sequences and RIKEN full-length cDNA clones (Table S6). Mouse adipose and liver tissues from all of the crosses were homogenized, and total RNA was extracted using TRIzol reagent (Invitrogen) according to the manufacturer's protocol. Three micrograms of total RNA was reverse transcribed and labeled with either Cy3 or Cy5 fluorochrome. 
Labeled complementary RNA (cRNA) from each F2 animal was hybridized against a cross-specific pool of labeled cRNAs constructed from equal aliquots of RNA from 150 F2 animals and parental mouse strains for each of the three tissues for each cross. The hybridizations for the BXH/apoE cross were performed in fluor reversal for 24 h in a hybridization chamber, washed, and scanned using a confocal laser scanner. The hybridizations for the BXH/wt and BXC crosses were performed to single arrays (individual F2 samples labeled with Cy5 and reference pools labeled with Cy3 fluorochromes) for 24 h in a hybridization chamber, washed, and again scanned using a confocal laser scanner. Arrays were quantified on the basis of spot intensity relative to background, adjusted for experimental variation between arrays using average intensity over multiple channels, and fitted to a previously described error model to determine significance (type I error) [42]. Gene expression measures are reported as the ratio of the mean log10 intensity (mlratio). DNA processing. DNA isolation. DNA isolation was performed at Rosetta Inpharmatics. DNeasy tissue kits from QIAGEN were used to carry out all DNA extractions. For each liver sample, 20–30 mg of liver was placed in a 1.5-ml microcentrifuge tube along with 80 μl buffer ATL and 20 μl proteinase K. The contents of each tube were then mixed thoroughly by vortexing, followed by incubation at 55 °C until the tissue was completely lysed. Transcriptionally active tissues such as liver and kidney contain high levels of RNA, which will co-purify with genomic DNA. Because RNA-free genomic DNA was required for processing, 4 μl RNase A (100 mg/ml) was added and mixed by vortexing, followed by incubation for 2 min at room temperature before continuing. Samples were then vortexed and 200 μl buffer AL was added to the sample and mixed thoroughly. After 10 min incubation at 70 °C, 200 μl ethanol (96%–100%) was then added and mixed again. 
The mixture was placed into the DNeasy Mini column and centrifuged at 6,000g (8,000 rpm) for 1 min. The DNeasy Mini spin column was then placed in a new 2-ml collection tube, and 500 μl buffer AW1 was added, followed by placement in a centrifuge for 1 min at 6,000g (8,000 rpm). The DNeasy Mini spin column was then placed in a new 2-ml collection tube again, and 500 μl buffer AW2 was added and centrifuged for 3 min at 20,000g (14,000 rpm) to dry the DNeasy membrane. Then the DNeasy Mini spin column was placed in a clean 1.5-ml or 2-ml microcentrifuge tube and 200 μl buffer AE was pipetted directly onto the DNeasy membrane. This was incubated at room temperature for 1 min and then centrifuged for 1 min at 6,000g (8,000 rpm) to elute. Two 200-μl elutions were performed followed by ethanol/sodium acetate precipitation and resuspension of the resultant pellet with TE buffer. Genotyping data from the Affymetrix 500K panel. SNP genotyping was performed with the commercial release of the Affymetrix 500K genotyping array. The genotyping was carried out at the Perlegen genotyping facility in Mountain View, California. Genotyping was attempted on 548 samples. 18 samples were unable to be genotyped because of poor DNA quality, and an additional 13 samples were removed after genotyping because their overall call rate did not exceed the 90% cutoff we required. We then applied SNP-wise quality checks on the 517 samples that were successfully genotyped. The Affymetrix 500K array consisted of 500,568 SNPs in total, 429,545 SNPs provided quality data from the genotyping assay, and we rejected those SNPs with a call rate 4%, and there was not significant deviation from Hardy-Weinberg equilibrium at the 0.0001 significance level. 
The sample set for analysis was restricted to the 427 HLC samples that had both genotype and gene expression data available, passed the criteria outlined above and those that were identified as Caucasian, or imputed to be Caucasian when data was missing (see below). Data preprocessing. Sex confirmation. Sex identifiers were available for most of the liver samples obtained from the three study centers. We independently confirmed the sex of each individual providing a liver sample by two methods. First, we looked for expression of Y-specific genes in the liver gene expression based on three probes representing three distinct transcripts. Second, we scored heterozygosity of X-chromosome markers. We excluded any individual for which there was a discrepancy in any of the three measures of sex in order to ensure a coherent data set for analysis and that we had excluded as many potential cases of annotation or sample-handling errors as possible. For samples where sex was not noted in the records, we imputed the sex call if both the genotype and gene-expression data were concordant. Ethnicity. Ethnicities were confirmed or imputed using STRUCTURE [45]. A panel of 106 autosomal markers was randomly selected from around the genome to be unlinked and ancestry informative. Markers were selected from the HapMap data [46] that were present on the Affy 500K panel such that the minor allele frequency was >0.05 and the absolute allele frequency difference in the Caucasians and African Americans ∼0.5, with average minor allele frequency 0.5 (standard deviation = 12). Several K were tested (K = 1–6) with burn-in 100,000 and 100,000 reps of MCMC before any information was collected. In all cases, the greatest support was for K = 2. Admixture was detected for some individuals in some runs and some individuals were reclassified. For those unknown and reclassified, population reassignment was made if the probability of group membership was >0.9 for that individual. 
This resulted in 469 individuals assigned to the Caucasian group, 28 individuals assigned to the African descent or African American group, and 18 individuals assigned as “unknown”. The data set for further analysis was restricted to Caucasian samples. Age. Ages were imputed using the Elastic Net method [47]. This method performs model selection and parameter estimation in a manner that is a combination of ridge-regression and the lasso. The prediction method is also explained in [47]. For computational reasons, λ was set to zero, in which case the Elastic Net method reduces to the lasso method. For most applications, experience demonstrates that the optimal value for λ is zero or quite near zero. Ages were imputed using separate models for each data source, due to evidence of a source effect, and for each sex separately. In cases where the sex was missing or the reported sex was different from the sex implied by the expression data, the sex implied by the expression data was used. This was done so that in the case the annotation data and expression data were mismatched, the imputed age would correspond to the data used to predict it. The 5,000 genes with the highest correlation to age were used as potential regressors. Cross-validation was used to select the number of steps in the model selection procedure. The number of predictors in the model was between 67 and 76 for the four different models. The percentage of variation explained in the training set is quite high (97%–99%) for three of the models. For the fourth, the model for Vanderbilt females, the percentage of variation explained was slightly lower (92%). This is a vast improvement over more naïve imputation methods that are used when adjusting for covariates with missing data, where mean values of the nonmissing data are used to fill in the missing values. Very few of the predictors we constructed were common between the different models. 
Given the number of predictors with high correlation to age, this is not surprising. Nonetheless, within a given data source (i.e., Pittsburgh or Vanderbilt samples), the male model is a reasonable predictor for the ages of the females and vice versa. This same trend did not hold for predicting the ages of same-sex individuals across data sources. Statistical and data visualization methods. Expression trait processing. Expression traits were adjusted for age, sex, and medical center. Residuals were computed using the rlm function from the R statistical package (M-estimation with Tukey's bisquare weights). In examining the distributions of the mean log ratio measures for each expression trait in the HLC set, we noted a high rate of outliers. As a result, we used robust residuals and nonparametric tests to carry out the association analyses in the HLC. For each expression trait, residual values deviating from the median by more than three robust standard deviations were filtered out as outliers. Genome-wide eQTL association analysis. The Kruskal-Wallis test was used to determine association between adjusted expression traits and genotypes. We chose this nonparametric method because of its robustness to the underlying genetic model and trait distribution. p-Values were computed using the nag_mann_whitney (for loci with two observed genotypes) and nag_kruskal_wallis_test (for loci with three observed genotypes) routines from the NAG C library (http://www.nag.co.uk). We used FDR for multiple-test correction. FDR was estimated as the ratio of the average number of eQTLs found in datasets with randomized sample labels to the number of eQTLs identified in the original data set. Since the number of tests was large (∼10^10), we found the empirical null distribution was very stable and three permutation runs were sufficient for convergence to estimate FDR. 
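The permutation-based FDR estimate described above can be illustrated with a minimal sketch. It is not the authors' implementation: the study used NAG C library routines, whereas this standard-library Python version applies a tie-corrected Kruskal-Wallis test to both the two- and three-genotype cases and takes the p-value from the chi-square approximation. The function names (`kruskal_wallis_p`, `count_eqtls`, `permutation_fdr`), the use of one shared permutation of sample labels across all SNPs, and the default seed are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def kruskal_wallis_p(values, genotypes):
    """Tie-corrected Kruskal-Wallis p-value for one trait against one SNP.

    Assumes two or three observed genotype groups; the p-value comes from
    the chi-square approximation (closed form for 1 and 2 degrees of freedom).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # average rank for a run of tied values
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        run = j - i + 1
        tie_term += run ** 3 - run
        i = j + 1

    rank_sums = defaultdict(float)
    counts = defaultdict(int)
    for rank, genotype in zip(ranks, genotypes):
        rank_sums[genotype] += rank
        counts[genotype] += 1
    n_groups = len(counts)
    if n_groups not in (2, 3):
        raise ValueError("expected two or three observed genotypes")

    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / counts[g] for g in counts
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0.0:
        return 1.0  # every value is tied, so there is no evidence of association
    h = max(h / correction, 0.0)
    if n_groups == 2:
        return math.erfc(math.sqrt(h / 2.0))  # chi-square survival, 1 df
    return math.exp(-h / 2.0)  # chi-square survival, 2 df


def count_eqtls(traits, snps, cutoff):
    """Count trait-SNP pairs whose p-value is at or below the nominal cutoff."""
    return sum(
        1 for trait in traits for snp in snps
        if kruskal_wallis_p(trait, snp) <= cutoff
    )


def permutation_fdr(traits, snps, cutoff, n_perm=3, seed=0):
    """FDR = mean eQTL count under shuffled sample labels / observed eQTL count."""
    rng = random.Random(seed)
    observed = count_eqtls(traits, snps, cutoff)
    if observed == 0:
        return float("nan"), 0  # the ratio is undefined with no discoveries
    n_samples = len(snps[0])
    null_counts = []
    for _ in range(n_perm):
        perm = list(range(n_samples))
        rng.shuffle(perm)
        # Apply the same permutation to every SNP (an assumption of this sketch).
        shuffled = [[snp[i] for i in perm] for snp in snps]
        null_counts.append(count_eqtls(traits, shuffled, cutoff))
    return (sum(null_counts) / n_perm) / observed, observed
```

Here `traits` is a list of per-sample residual vectors and `snps` is a list of per-sample genotype vectors in the same sample order. With a genome-scale number of tests the loop would need a far more efficient implementation; the sketch only shows how the ratio is formed.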
FDR computation was performed separately for cis (<1 Mb probe to SNP distance) and trans associations resulting in nominal p-value cutoffs of 5.0 × 10^−5 and 1.0 × 10^−8 for cis and trans eQTLs, respectively. Targeted set association analysis. The 3,346 SNPs identified in the first round of analysis as associating with expression traits in cis at an FDR < 0.1 were picked for a second round of analysis. To assess the significance of the resulting set of expression traits detected as associated with this set of SNPs, sets of randomly selected SNPs of size 3,346 with MAF distributions identical to the original set were generated. All sets of SNPs were then analyzed using the same method described above for genome-wide associations. Identifying differentially expressed genes. To assess whether a gene in a given sample was differentially expressed, we used a previously described and validated error model for testing whether the mean log ratio of the intensity measures between the experiment and reference channels was significantly different from zero [42,43,48]. Based on this error model we obtained p-values for each of the individual gene expression measures in each sample as previously described [33]. We then computed the standard deviation of –log10 of the p-value for each gene expression measure over all samples profiled for a given tissue, and then rank ordered all of the genes profiled in each tissue based on this standard deviation value (rank ordered in descending order). Genes that fall at the top of this rank ordered list can be considered as the most differentially expressed or variable genes in the study. We have previously shown that this type of ordering approach captures well the most active genes in a set of samples [33]. 
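The ranking step described above (standard deviation of –log10 p across samples, sorted in descending order) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `rank_by_variability`, the dictionary input, the use of the sample standard deviation, and the `floor` guard against p-values of exactly zero are all hypothetical choices, and the error-model p-values themselves are taken as given.

```python
import math
import statistics


def rank_by_variability(pvalues_by_gene, floor=1e-300):
    """Order genes by the standard deviation of -log10(p) across samples.

    pvalues_by_gene maps a gene identifier to its per-sample error-model
    p-values (at least two samples per gene). The floor avoids log10(0).
    """
    scores = {}
    for gene, pvals in pvalues_by_gene.items():
        neg_log = [-math.log10(max(p, floor)) for p in pvals]
        scores[gene] = statistics.stdev(neg_log)  # sample SD is an assumption
    # Most variable genes first, matching the descending order in the text.
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank_by_variability({
    "geneA": [0.5, 0.5, 0.5],     # identical p-values: no variability
    "geneB": [1e-6, 0.9, 0.01],   # strongly varying significance
})
```

In this toy input `geneB` is placed ahead of `geneA`, because its –log10 p-values vary across samples while those of `geneA` do not.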
For demonstrating the number of genome-wide significant eQTLs and eSNPs as a function of differential gene expression, we binned the expression traits into quartiles (Q1-Q4) based on the rank-ordered gene list, with each bin containing 10,025 genes and the bins increasing in significance with respect to differential expression, from Q1 to Q4. Visualization of networks. Networks were visualized using the Target Gene Information (TGI) Network Analysis and Visualization (NAV) desktop application developed at Rosetta Inpharmatics. This tool enables rapid, real-time, graphical analysis of pathway network models built from a comprehensive and fully integrated set of public and proprietary interaction databases available through a back-end central database, described in detail in a separate report. Additionally, the TGI NAV tool supports experimentally generated systems biology data such as the statistical associations and causal relationships described here. TGI NAV enables integration and visualization of orthogonal data sets using network models as a framework and facilitates dissection of networks into smaller, functionally significant subnetworks amenable to biological interpretation. To construct the local networks for H2-Eb1, Erbb3, and Rps26, the whole-gene probabilistic causal networks were loaded into the database and the TGI NAV tool was used to extract all edges from this network involving the central gene of interest. In the case of the Erbb3 network, the local network was expanded by extracting all additional edges involving any genes directly connected to Erbb3. Note that while the underlying networks describe causal relationships between transcripts, TGI NAV was used to translate this network into the space of genes using an integrated mapping database that clusters transcripts into gene models utilizing their genomic coordinates. 
As a result, multiple causal relationships between gene pairs can be observed in cases where multiple transcripts for a single gene were profiled. Visualization properties of nodes (e.g., color) are specified in TGI NAV either for individual nodes, or in a data-driven manner by associating attributes, such as KEGG pathway membership, with groups of nodes and mapping visualization properties to these attributes.

Supporting Information

Figure S1. Atlas of Gene Expression for Rps26 and Erbb3

For all panels, the horizontal bar for each row represents the mean expression value and the horizontal line indicates the standard deviation. The red arrow off to the left highlights the pancreas tissue. (A) Expression levels of Rps26 in 60 murine tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances are given along the x-axis. (B) Expression levels of Erbb3 in 60 murine tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances are given along the x-axis. (C) Expression levels of Rps26 in 46 monkey tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances are given along the x-axis. (D) Expression levels of Erbb3 in 46 monkey tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances are given along the x-axis. (E) Expression levels of RPS26 in 50 human tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances as determined by each of six individual reporters on the microarray that target RPS26 are given along the x-axis. (F) Expression levels of Erbb3 in 50 human tissues and cell lines. The tissues and cell lines are given along the y-axis, and the mean relative transcript abundances are given along the x-axis. 
(441 KB PDF)

Table S1. Population Demographics of the HLC (81 KB XLS)

Table S2. Association Results for HLC Expression and Genotyping Data (1.64 MB XLS)

Table S3. Expression Traits Corresponding to Genes Associated with Human Diseases Are under Significant Genetic Control in the HLC (172 KB DOC)

Table S4. Genes Represented on the HLC Microarray Described in the Main Text (7.97 MB XLS)

Table S5. Genes Represented on the BXH/apoE Microarray Described in the Main Text (9.33 MB XLS)

Table S6. Genes Represented on the BXH/wt and BXC Microarray Described in the Main Text (6.58 MB XLS)

Accession Numbers

All microarray data associated with the HLC have been deposited into the Gene Expression Omnibus database under accession number GSE9588.

                Author and article information

                Contributors
                On behalf of the Procardis consortium
                Journal
                9216904
                2419
                Nat Genet
                Nature genetics
                1061-4036
                1546-1718
                25 December 2010
                17 January 2010
                February 2010
                11 January 2011
                Volume: 42
                Issue: 2
                Pages: 105-116
                Affiliations
                [1 ]Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts 02118, USA
                [2 ]National Heart, Lung, and Blood Institute’s Framingham Heart Study, Framingham, Massachusetts 01702, USA
                [3 ]MRC Epidemiology Unit, Institute of Metabolic Science, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK
                [4 ]Oxford Centre for Diabetes, Endocrinology and Metabolism, University of Oxford, Oxford OX3 7LJ, UK
                [5 ]Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
                [6 ]Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts 02142, USA
                [7 ]Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
                [8 ]Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
                [9 ]Twin Research & Genetic Epidemiology Department, King’s College London, St Thomas' Hospital Campus, Lambeth Palace Rd, London SE1 7EH, UK
                [10 ]Center for Statistical Genetics, Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan 48109, USA
                [11 ]Metabolic Disease Group, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
                [12 ]Cardiovascular Health Research Unit and Department of Medicine, University of Washington, Seattle, Washington, USA
                [13 ]CNRS-UMR8090, Pasteur Institute, Lille 2-Droit et Santé University, F-59000 Lille, France
                [14 ]Department of Medical Genetics, University of Lausanne, 1005 Lausanne, Switzerland
                [15 ]University Institute of Social and Preventative Medicine, Centre Hospitalier Universitaire Vaudois (CHUV) and University of Lausanne, 1005 Lausanne, Switzerland
                [16 ]Swiss Institute of Bioinformatics, Switzerland
                [17 ]Department of Epidemiology and Public Health, Imperial College of London, Faculty of Medicine, Norfolk Place, London W2 1PG, UK
                [18 ]Boston University Data Coordinating Center, Boston, Massachusetts 02118, USA
                [19 ]deCODE Genetics, 101 Reykjavik, Iceland
                [20 ]Department of Human Genetics, Leiden University Medical Centre, 2300 RC Leiden, The Netherlands
                [21 ]Institute of Epidemiology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, 85764 Neuherberg, Germany
                [22 ]Department of Epidemiology, Erasmus MC Rotterdam, 3000 CA, The Netherlands
                [23 ]Department of Biological Psychology, VU, Van der Boechorststraat 1, 1081 BT Amsterdam, The Netherlands
                [24 ]Centre for Population Health Sciences, University of Edinburgh, Edinburgh EH8 9AG, UK
                [25 ]MRC Human Genetics Unit, IGMM, Edinburgh EH4 2XU, UK
                [26 ]Division of Genetics, R&D, Glaxo SmithKline, King of Prussia, Pennsylvania 19406, USA
                [27 ]Department of Cardiovascular Medicine, University of Oxford, Oxford OX3 9DU, UK
                [28 ]Genetics of Complex Traits, Institute of Biomedical and Clinical Sciences, Peninsula College of Medicine and Dentistry, University of Exeter EX1 2LU, UK
                [29 ]Laboratory of Clinical Investigation, National Institute of Aging, Baltimore, Maryland 21250, USA
                [30 ]Unit for Child and Adolescent Health and Welfare, National Institute for Health and Welfare, Biocenter Oulu, University of Oulu, 90014 Oulu, Finland
                [31 ]Hagedorn Research Institute, 2820 Gentofte, Denmark
                [32 ]Department of Medicine & Therapeutics, Level 7, Ninewells Hospital & Medical School, Dundee DD1 9SY, UK
                [33 ]Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21287, USA
                [34 ]Department of Nutrition - Dietetics, Harokopio University, 17671 Athens, Greece
                [35 ]General Medicine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
                [36 ]Department of Epidemiology and Public Health, University College London, UK
                [37 ]Depts. of Nutrition and Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
                [38 ]MRC Centre for Causal Analyses in Translational Epidemiology, University of Bristol, Bristol BS8 2PR, UK
                [39 ]Fundación para la Investigación Biomédica del Hospital Clínico San Carlos, Madrid, Spain
                [40 ]Departments of Medicine and Human Genetics, McGill University, Montreal, Canada
                [41 ]Genome Quebec Innovation Centre, Montreal H3A 1A4, Canada
                [42 ]Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
                [43 ]Department of Public Health and Caring Sciences, Uppsala University, Uppsala, Sweden
                [44 ]Division of Statistical Genomics, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, USA
                [45 ]Division of Endocrinology, Diabetes and Nutrition, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
                [46 ]INSERM U859, Universite de Lille-Nord de France, F-59000 Lille, France
                [47 ]Genome Technology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
                [48 ]The Broad Institute, Cambridge, Massachusetts 02141, USA
                [49 ]Leiden Genome Technology Center, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
                [50 ]INSERM U780-IFR69, Paris Sud University, F-94807 Villejuif, France
                [51 ]The Heart Research Institute, Sydney, New South Wales, Australia
                [52 ]PathWest Laboratory of Western Australia, Department of Molecular Genetics, J Block, QEII Medical Centre, NEDLANDS WA 6009, Australia
                [53 ]School of Surgery and Pathology, University of Western Australia, Nedlands WA 6009, Australia
                [54 ]Department of Social Medicine, University of Bristol, Bristol BS8 2PR, UK
                [55 ]Landspitali University Hospital, 101 Reykjavik, Iceland
                [56 ]Icelandic Heart Association, 201 Kopavogur, Iceland
                [57 ]The Human Genetics Center and Institute of Molecular Medicine, University of Texas Health Science Center, Houston, Texas 77030, USA
                [58 ]Steno Diabetes Center, DK-2820 Gentofte, Copenhagen, Denmark
                [59 ]Faculty of Health Science, University of Aarhus, Aarhus DK-8000, Denmark
                [60 ]Department of Medicine, University of Leipzig, Liebigstr. 18, 04103 Leipzig, Germany
                [61 ]Endocrinology-Diabetology Unit, Corbeil-Essonnes Hospital, Essonnes, F-91108 France
                [62 ]Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, California, USA
                [63 ]Clinical Trial Service Unit and Epidemiological Studies Unit, University of Oxford, Oxford OX3 7LF, UK
                [64 ]Centre for Genetic Epidemiology and Biostatistics, University of Western Australia, Perth, Australia
                [65 ]Istituto di Neurogenetica e Neurofarmacologia (INN), Consiglio Nazionale delle Ricerche, c/o Cittadella Universitaria di Monserrato, Monserrato, Cagliari 09042, Italy
                [66 ]Western Australian Sleep Disorders Research Institute, Queen Elizabeth Medical Centre II, Perth, Australia
                [67 ]Department of Endocrinology, Diabetes and Nutrition, Charite-Universitaetsmedizin Berlin, Berlin, Germany
                [68 ]Department of Clinical Nutrition, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
                [69 ]Division of Endocrinology, Diabetes, and Hypertension, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, USA
                [70 ]Department of Human Genetics, Leiden University Medical Centre, 2300 RC Leiden, The Netherlands
                [71 ]Department of Cardiovascular Research, Istituto di Ricerche Farmacologiche 'Mario Negri', Milan, Italy
                [72 ]U557 Institut National de la Santé et de la Recherche Médicale, U1125 Institut National de la Recherche Agronomique, Université Paris 13, 74 rue Marcel Cachin, 93017 Bobigny Cedex, France
                [73 ]Department of Medicine III, Division Prevention and Care of Diabetes, University of Dresden, 01307 Dresden, Germany
                [74 ]Center for Human Nutrition, University of Texas Southwestern Medical Center, Dallas, Texas, USA
                [75 ]Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, S-751 85 Uppsala, Sweden
                [76 ]CHU de Poitiers, Endocrinologie Diabetologie, CIC INSERM 0802, INSERM U927, Université de Poitiers, UFR, Médecine Pharmacie, Poitiers, France
                [77 ]Department of Public Health & Clinical Medicine, Section for Nutritional Research, Umeå University, Umeå, Sweden
                [78 ]Department of Clinical Sciences, Obstetrics and Gynecology, University of Oulu, Box 5000, Fin-90014 University of Oulu, Finland
                [79 ]Centre National de Génotypage/IG/CEA, 2 rue Gaston Crémieux CP 5721, 91057 Evry Cedex, France
                [80 ]U872 Institut National de la Santé et de la Recherche Médicale, Faculté de Médecine Paris Descartes, 15 rue de l’Ecole de Médecine, 75270 Paris Cedex, France
                [81 ]Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
                [82 ]Institute of Genetic Medicine, European Academy Bozen/Bolzano (EURAC), Viale Druso 1, 39100 Bolzano, Italy, Affiliated Institute of the University Lübeck, Germany
                [83 ]Department of Pulmonary Physiology, Sir Charles Gairdner Hospital, Perth, Australia
                [84 ]Busselton Population Medical Research Foundation, Sir Charles Gairdner Hospital, Perth, Australia
                [85 ]Heart Institute of Western Australia, Sir Charles Gairdner Hospital, Nedlands WA 6009, Australia
                [86 ]School of Medicine and Pharmacology, University of Western Australia, Nedlands, WA 6009, Australia
                [87 ]Folkhalsan Research Centre, Helsinki, Finland
                [88 ]Malmska Municipal Health Care Center and Hospital, Jakobstad, Finland
                [89 ]Nuffield Department of Surgery, University of Oxford, Oxford OX3 9DU, UK
                [90 ]Research Centre for Prevention and Health, Glostrup University Hospital, Glostrup, Denmark
                [91 ]Faculty of Health Science, University of Copenhagen, Copenhagen, Denmark
                [92 ]National Institute for Health and Welfare, Unit of Population Studies, Turku, Finland
                [93 ]Institute of Health Sciences and Biocenter Oulu, Box 5000, Fin-90014 University of Oulu, Finland
                [94 ]Department of Public Health, Faculty of Medicine, P.O. Box 41 (Mannerheimintie 172), University of Helsinki, 00014 Helsinki, Finland
                [95 ]National Institute for Health and Welfare, Unit for Child and Adolescent Mental Health, Helsinki, Finland
                [96 ]Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland
                [97 ]Department of Internal Medicine and Biocenter Oulu, Oulu, Finland
                [98 ]Diabetes Genetics, Institute of Biomedical and Clinical Science, Peninsula College of Medicine and Dentistry, University of Exeter, Exeter EX2 5DW, UK
                [99 ]National Institute for Health and Welfare, Unit of Living Conditions, Health and Wellbeing, Helsinki, Finland
                [100 ]Interdisciplinary Centre for Clinical Research, University of Leipzig, Inselstr. 22, 04103 Leipzig, Germany
                [101 ]The Danish Twin Registry, Epidemiology, Institute of Public Health, University of Southern Denmark, J.B. Winsløws Vej 9B, 5000 Odense, Denmark
                [102 ]Department of Clinical Sciences, Diabetes and Endocrinology, Lund University, University Hospital Malmo, Malmo, Sweden
                [103 ]Gladstone Institute of Cardiovascular Disease, University of California, San Francisco, California, USA
                [104 ]Diabetes Research Center (Diabetes Unit), Massachusetts General Hospital, Boston, Massachusetts 02114, USA
                [105 ]Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA
                [106 ]Division of Cardiology, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
                [107 ]Oxford NIHR Biomedical Research Centre, Churchill Hospital, Oxford OX3 7LJ, UK
                [108 ]Department of Clinical Genetics, Erasmus MC Rotterdam, 3000 CA, The Netherlands
                [109 ]Biomedical Research Institute, University of Dundee, Ninewells Hospital & Medical School, Dundee DD1 9SY, UK
                [110 ]Department of Geriatric Medicine and Metabolic Disease, Second University of Naples, Naples, Italy
                [111 ]National Institute for Health and Welfare, Unit of Public Health Genomics, Helsinki, Finland
                [112 ]Department of Medical Genetics, University of Helsinki, Helsinki, Finland
                [113 ]Department of Medical Statistics, Epidemiology and Medical Informatics, Andrija Stampar School of Public Health, Medical School, University of Zagreb, Rockefellerova 4, 10000 Zagreb, Croatia
                [114 ]Department of Clinical Genetics, VUMC, Van der Boechorststraat 7, 1081 BT Amsterdam, The Netherlands
                [115 ]Department of Obstetrics and Gynaecology, Oulu University Hospital, Oulu, Finland
                [116 ]Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, Washington, USA
                [117 ]Group Health Center for Health Studies, Seattle, Washington, USA
                [118 ]Institute of Biometrics and Epidemiology, German Diabetes Centre, Leibniz Centre at Heinrich Heine University Düsseldorf, Düsseldorf, Germany
                [119 ]Department of Biostatistics, University of Washington, Seattle, Washington 98195, USA
                [120 ]Department of Internal Medicine, Erasmus MC Rotterdam, 3000 CA, The Netherlands
                [121 ]Department of Medicine/Metabolic Diseases, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
                [122 ]Department of Public Health & Clinical Medicine, Section for Family Medicine, Umeå University Hospital, Umeå, Sweden
                [123 ]School of Public Health, Department of General Practice, University of Aarhus, Aarhus DK-8000, Denmark
                [124 ]Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, Cambridge, UK
                [125 ]MRC Epidemiology Resource Centre, University of Southampton, Southampton General Hospital, Southampton SO16 6YD, UK
                [126 ]Department of Epidemiology, University of Texas, M.D. Anderson Cancer Center, Houston, Texas, 77030, USA
                [127 ]Leibniz-Institut für Arterioskleroseforschung an der Universität Münster, Münster, Germany
                [128 ]Atherosclerosis Research Unit, Department of Medicine, Karolinska Institutet, Stockholm, Sweden
                [129 ]Laboratory of Neurogenetics, National Institute on Aging, Bethesda, Maryland 20892, USA
                [130 ]Department of Epidemiology, University of Washington, Seattle, Washington 98195, USA
                [131 ]Seattle Epidemiologic Research and Information Center, Department of Veterans Affairs Office of Research and Development, Seattle, Washington, USA
                [132 ]Department of Medical Sciences, Uppsala University, Uppsala, Sweden
                [133 ]Medstar Research Institute, Baltimore, Maryland 21250, USA
                [134 ]Clinical Research Branch, National Institute on Aging, Baltimore, Maryland 21250, USA
                [135 ]Institut interrégional pour la santé (IRSA), F-37521 La Riche, France
                [136 ]Coordination Centre for Clinical Trials, University of Leipzig, Härtelstr. 16-18, 04103 Leipzig, Germany
                [137 ]Department of Medicine, Helsinki University Hospital, University of Helsinki, Helsinki, Finland
                [138 ]Department of Internal Medicine, Leiden University Medical Centre, 2300 RC Leiden, The Netherlands
                [139 ]Research Unit, Cardiovascular Genetics, Nancy University Henri Poincaré, Nancy, France
                [140 ]EMGO Institute/Department of Psychiatry, VU University Medical Center, Amsterdam, The Netherlands
                [141 ]Department of Internal Medicine, Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
                [142 ]Genomic Medicine, Imperial College London, Hammersmith Hospital, W12 0NN, London, UK
                [143 ]Epidemiology & Public Health, Queen's University Belfast, Belfast BT12 6BJ, UK
                [144 ]Medical Products Agency, Uppsala, Sweden
                [146 ]National Institute for Health and Welfare, Unit of Chronic Disease Epidemiology and Prevention, Helsinki, Finland
                [147 ]Departments of Nutrition and Epidemiology, Harvard School of Public Health, Boston, Massachusetts, USA
                [148 ]Channing Laboratory, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
                [149 ]Genetic Epidemiology & Clinical Research Group, Department of Public Health & Clinical Medicine, Section for Medicine, Umeå University Hospital, Umeå, Sweden
                [150 ]London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
                [151 ]Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, Maryland 21287, USA
                [152 ]The Welch Center for Prevention, Epidemiology, and Clinical Research, School of Medicine and Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21287, USA
                [153 ]Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55454, USA
                [154 ]Department of Endocrinology and Diabetes, Norfolk and Norwich University Hospital NHS Trust, Norwich, NR1 7UY, UK
                [155 ]Department of Medicine, University of Kuopio and Kuopio University Hospital, Kuopio 70210, Finland
                [156 ]Faculty of Health Science, University of Southern Denmark, Odense, Denmark
                [157 ]Institute of Biomedical Science, Faculty of Health Science, University of Copenhagen, Denmark
                [158 ]Department of Neurology, General Central Hospital, 39100 Bolzano, Italy
                [159 ]Department of Neurology, University of Lübeck, Ratzeburger Allee 160, 23538 Lübeck, Germany
                [160 ]Institute of Medical Informatics, Biometry and Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany
                [161 ]Klinikum Grosshadern, Munich, Germany
                [162 ]School of Medicine, University of Split, Soltanska 2, 21000 Split, Croatia
                [163 ]Gen-Info Ltd, Ruzmarinka 17, 10000 Zagreb, Croatia
                [164 ]Department of Physiology and Biophysics, Keck School of Medicine, University of Southern California, Los Angeles, California 90033, USA
                [165 ]Department of Medicine, Division of Endocrinology, Keck School of Medicine, University of Southern California, Los Angeles, California 90033, USA
                [166 ]Department of Genetics, University of North Carolina, Chapel Hill, North Carolina 27599, USA
                [167 ]National Institute for Health and Welfare, Unit of Diabetes Prevention, Helsinki, Finland
                [168 ]Departments of Medicine and Epidemiology, University of Washington, Seattle, Washington, USA
                [169 ]Longitudinal Studies Section, Clinical Research Branch, National Institute on Aging, NIH, Baltimore, Maryland, USA
                [170 ]Faculty of Medicine, University of Iceland, 101 Reykjavík, Iceland
                [171 ]Lab of Cardiovascular Sciences, National Institute on Aging, NIH, Baltimore, Maryland, USA
                [172 ]Department of Clinical Sciences/Clinical Chemistry, University of Oulu, Box 5000, Fin-90014 University of Oulu, Finland
                [173 ]National Institute for Health and Welfare, Aapistie 1, P.O. Box 310, Fin-90101 Oulu, Finland
                [174 ]Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, 90033, USA
                Author notes
                Corresponding authors:
                Michael Boehnke, Department of Biostatistics and Center for Statistical Genetics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA, Tel. +1 734 936 1001, Fax. +1 734 615 8322, boehnke@umich.edu
                Mark I. McCarthy, Oxford Centre for Diabetes, Endocrinology and Metabolism, Churchill Hospital, Old Road, Headington, Oxford OX3 7LJ, UK, Tel. +44 (0) 1865 857298, Fax. +44 (0) 1865 857299, mark.mccarthy@drl.ox.ac.uk
                Jose C. Florez, Diabetes Research Center (Diabetes Unit), and Center for Human Genetic Research, Simches Research Building, CPZN 5.250, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA, Tel. +1 617 643 3308, Fax. +1 617 643 6630, jcflorez@partners.org
                Inês Barroso, Metabolic Disease Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK, Tel. +44 (0) 1223 495341, Fax. +44 (0) 1223 494919, ib1@sanger.ac.uk
                [*] These authors contributed equally

                [145] See appendix for full list of authors

                Article
                Manuscript ID: nihpa259059
                DOI: 10.1038/ng.520
                PMC ID: 3018764
                PMID: 20081858

                Users may view, print, copy, download and text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

                Funding
                Funded by: National Institute of Diabetes and Digestive and Kidney Diseases : NIDDK
                Award ID: R01 DK078616-01A1
                Funded by: National Institute of Diabetes and Digestive and Kidney Diseases : NIDDK
                Award ID: P30 DK040561-14
                Categories
                Article

                Genetics
