Introduction

Diabetic kidney disease, or diabetic nephropathy (DN), is the leading cause of end-stage renal disease (ESRD) worldwide [1]. It affects approximately 30% of patients with long-standing type 1 and type 2 diabetes [2], [3], and confers added risks of cardiovascular disease and mortality. DN is a progressive disorder that is characterized by proteinuria (abnormal loss of protein from the blood compartment into the urine) and gradual loss of kidney function. Early in its course, the kidneys are hypertrophic, and glomerular filtration is increased. However, with progression over several years, proteinuria and decline in kidney function set in, and may result in fibrosis and terminal kidney failure, necessitating costly renal replacement therapies, such as dialysis and renal transplantation. While current treatments that decrease proteinuria will moderately abate DN progression, recent studies show that even with delivery of optimal care, high risks of cardiovascular disease, ESRD and mortality persist [4], [5]. Therefore, discovery of genetic factors that influence development and susceptibility to DN is a critical step towards the identification of novel pathophysiologic mechanisms that may be targeted for interventions to improve the adverse clinical outcomes in diabetic patients.

Whereas the degree of glycemia plays a pivotal role in DN, a subset of individuals with poorly controlled type 1 diabetes (T1D) do not develop DN. Furthermore, strong familial aggregation supports genetic susceptibility to DN. The sibling risk of DN has been estimated to be 2.3-fold [6]. While prior studies of individuals with T1D have reported on the possible existence of genetic associations for DN, results have been inconclusive.
In GENIE, we leveraged three existing collections for T1D nephropathy (All Ireland Warren 3 Genetics of Kidneys in Diabetes UK Collection [UK-ROI], Finnish Diabetic Nephropathy Study [FinnDiane], and Genetics of Kidneys in Diabetes US Study [GoKinD US]) comprising 6,691 individuals to perform the most comprehensive and well-powered DN susceptibility genome-wide association study (GWAS) and meta-analysis to date. Our aim was to identify genetic markers associated with DN by meta-analyzing independent GWAS, imputed to HapMap CEU II (Table 1, Figure 1). As a result, we here present two new loci associated with ESRD and a locus suggestively associated with DN.

Figure 1 (doi:10.1371/journal.pgen.1002921.g001) Flow chart summarizing study design. We applied a two-stage study design, where the top signals from the meta-analysis of three GENIE studies (UK-ROI, FinnDiane and GoKinD US) were followed up in phase two analysis, consisting of nine T1D cohorts. After combined meta-analysis, two signals reached genome-wide significance in the analysis of ESRD (P [...]

[...] 12,000 individuals the AFF3 signal remained genome-wide significant (P = 1.2×10−8), and 5) we have provided supportive functional evidence that suggests AFF3 may be a relevant contributor to renal disease. Although survival bias is a possibility in the analyses of ESRD, longitudinal analysis revealed the association of the AFF3 and chromosome 15q26 loci with renal end-points and not with death. Experimental models provide independent evidence of AFF3 involvement in renal fibrosis and support an association of this locus with a renal phenotype. Importantly, despite our large sample size, we did not achieve genome-wide statistical significance for DN using a combined proteinuria/ESRD phenotype, suggesting that this phenotype may have been too heterogeneous to detect significant associations with a sample of this size.
For example, lifelong glycemic control, a known risk factor for DN, is not well captured in most existing cohorts. Nevertheless, this study is the largest well-powered GWAS on DN to date. We demonstrated a suggestive signal of association at ERBB4 that is supported by experimental data showing haplotype-specific mRNA expression in DN biopsies. Our findings reinforce the need for additional studies of patients with T1D and a homogeneous renal phenotype, in whom additional GWAS, fine-mapping and sequencing to uncover rare variants could be performed. Integration of our findings with ongoing GWAS of DN in both type 1 and type 2 diabetes may also lead to discovery of additional genetic determinants of DN. The traditional phenotypic definition of DN for individuals with type 2 diabetes may be even more challenging for genetic studies given the heterogeneity of vascular complications and differential renal diagnoses. Several larger-scale GWAS have now been conducted for renal phenotypes [49]–[56]; however, in most cases the true disease-causing variant and its functional impact on specific phenotypes remain to be established. Encouraging reports include the association of uromodulin with CKD [57], MYH9/APOL1 with non-diabetic ESRD [58], [59], and PLA2R1 with membranous nephropathy, where anti-PLA2R antibodies appear to predict activity of the disease as well as response to therapy [60]. Our findings point to two transcriptional networks centered around AFF3 and ERBB4 that may be operational in the pathogenesis of kidney disease in diabetes.

Methods

Ethics statement

All human research was approved by the relevant institutional review boards, and conducted according to the Declaration of Helsinki.
Study populations

We implemented a two-stage analysis, in which a GWAS was performed using a set of three discovery cohorts in the GENIE consortium, and top signals for the DN and ESRD analyses were analyzed further in the second phase in a set of nine independent cohorts (described below) with 5,873 patients in total. The patient numbers in the individual studies are given in Table S11. Additional details are provided in the online material Text S1.

All Ireland, Warren 3, Genetics of Kidneys in Diabetes UK (UK-ROI) Collection [61]

Inclusion criteria included white individuals with T1D, diagnosed before 31 years of age, whose parents and grandparents were born in the UK and Ireland. The case group comprised 903 individuals with persistent proteinuria (>500 mg/24 h) developing more than 10 years after the diagnosis of diabetes, hypertension (>135/85 mmHg and/or treatment with antihypertensive medication), and retinopathy; ESRD (27.2%) was defined as individuals requiring renal replacement therapy or having received a kidney transplant. Absence of DN was defined as persistent normal urine albumin excretion rate (AER; 2 out of 3 urine albumin to creatinine ratio [ACR] measurements [...]

[...] 300 µg albumin/mg of urine creatinine). Cases were defined as people 18–54 years of age, with T1D for at least 10 years and DN, n = 903. The control group was recruited using the same inclusion criteria as in UK-ROI. Individuals were recruited at two study centers, George Washington University (GWU) and the Joslin Diabetes Centre (JDC), using differing methods of ascertainment and recruitment [64]. Analysis of the GoKinD US cohort was limited to individuals whose primary ethnicity was Caucasian.

Collections genotyped in Phase 2

DNA was sought from worldwide case-control collections of individuals with T1D and known renal status.
A total of 5,873 individuals from nine independent collections were genotyped or imputed for the top-ranked SNPs (n = 41 including 17 proxies), with the exception of the DCCT/EDIC cohort where GWAS data was imputed. All the patients included in the phase two analysis were adults of European descent and had T1D diagnosed before 35 years of age. Controls with normal AER had a T1D duration of at least 15 years, and cases with DN had a minimum T1D duration of 10 years. If a collection included patients with microalbuminuria, they were excluded from the primary analysis of DN, but included as controls in the analysis of ESRD versus non-ESRD. The main clinical characteristics of all the replication cohorts are shown in Table S2 and the cohorts are described in Text S1.

Phenotype definitions

The primary phenotype of interest was DN, defined as individuals aged over 18, with T1D for at least 10 years and diabetic kidney disease. DN includes ESRD or persistent macroalbuminuria as defined in the cohort descriptions above. Controls were defined as individuals with T1D for at least 15 years but without any clinical evidence of kidney disease. Individuals with microalbuminuria were excluded from the primary DN analysis in all cohorts. Disease status definitions were consistent across all the study cohorts. Details of clinical characteristics for each cohort are given in Table 1 and Table S2. We evaluated a second phenotype to gain further insights into the genetic basis of the most severe form of DN (leading to ESRD), and compared ESRD cases to all those without ESRD. This phenotype is referred to as the “ESRD” or “ESRD vs. non-ESRD” phenotype throughout the manuscript. We also considered individuals with ESRD compared to T1D controls with no clinical evidence of DN. Results for this comparison are given in the online supporting material (Tables S1, S6, S7, S9, S10), where this contrast is called “ESRD vs. normoalbuminuria” or “ESRD vs. normo”.
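The three case-control contrasts defined above can be written as a small labelling routine. The following Python sketch is illustrative only and is not the code used in the study: the `Participant` record, the `RenalStatus` categories and the function names are hypothetical, and two points are assumptions rather than statements from the text, namely that no extra diabetes-duration filter is applied to microalbuminuric individuals in the ESRD vs. non-ESRD contrast, and that the age and duration thresholds of the primary DN definition carry over to the two ESRD contrasts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RenalStatus(Enum):
    NORMO = "normoalbuminuria"
    MICRO = "microalbuminuria"
    MACRO = "persistent macroalbuminuria"
    ESRD = "ESRD"


@dataclass(frozen=True)
class Participant:
    # Hypothetical record; field names are illustrative, not from the study.
    age_years: float
    t1d_duration_years: float
    renal_status: RenalStatus


MIN_AGE_YEARS = 18            # "aged over 18"
MIN_CASE_DURATION_YEARS = 10  # cases: T1D for at least 10 years
MIN_CTRL_DURATION_YEARS = 15  # controls: T1D for at least 15 years


def dn_label(p: Participant) -> Optional[int]:
    """Primary DN contrast: 1 = case, 0 = control, None = excluded."""
    if p.age_years <= MIN_AGE_YEARS:
        return None
    if p.renal_status in (RenalStatus.MACRO, RenalStatus.ESRD):
        return 1 if p.t1d_duration_years >= MIN_CASE_DURATION_YEARS else None
    if p.renal_status is RenalStatus.NORMO:
        return 0 if p.t1d_duration_years >= MIN_CTRL_DURATION_YEARS else None
    return None  # microalbuminuria is excluded from the primary DN analysis


def esrd_vs_non_esrd_label(p: Participant) -> Optional[int]:
    """ESRD vs. all participants without ESRD."""
    if p.age_years <= MIN_AGE_YEARS:
        return None
    if p.renal_status is RenalStatus.MICRO:
        return 0  # assumption: no additional duration filter for this group
    if dn_label(p) is None:
        return None
    return 1 if p.renal_status is RenalStatus.ESRD else 0


def esrd_vs_normo_label(p: Participant) -> Optional[int]:
    """ESRD vs. controls with no clinical evidence of DN."""
    if dn_label(p) is None:
        return None
    if p.renal_status is RenalStatus.ESRD:
        return 1
    return 0 if p.renal_status is RenalStatus.NORMO else None
```

Under these assumptions, a participant with persistent macroalbuminuria who meets the age and duration criteria is a case in the primary DN analysis, a non-ESRD comparator in the ESRD vs. non-ESRD analysis, and is left out of the ESRD vs. normoalbuminuria analysis.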
Genotyping

DNA from individuals in the UK-ROI collection was genotyped using the Omni1-Quad array (Illumina, San Diego, CA, USA), while FinnDiane samples were genotyped on Illumina's BeadArray 610-Quad array. Samples in UK-ROI and FinnDiane were excluded if they had insufficient DNA quality or quantity, or poor genotype concordance with previous genotypes during the fingerprint evaluation stage. Existing genotype data for GoKinD US were downloaded from dbGaP (phs000018.v2.p1, retrieved June 2010), containing updated genotype data from the Affymetrix 500 K set (Affymetrix, Santa Clara, CA, USA).

Genotype quality control

Samples for UK-ROI and FinnDiane were excluded for insufficient DNA quality or quantity, or poor genotype concordance with previous genotypes during a fingerprint evaluation stage. In the UK-ROI sample, 1,830 unique case (n = 872) and control (n = 958) individuals were submitted for genotyping on the Omni1-Quad. For FinnDiane, 3,651 individuals (cases, n = 1,934; controls, n = 1,721) were submitted for genotyping on the 610-Quad. For all three discovery datasets (UK-ROI, FinnDiane, GoKinD US), uniform and extensive genotype quality control procedures were applied: SNPs were filtered for those with call rates greater than 90%, minor allele frequency (MAF) exceeding 1%, and concordance with Hardy Weinberg Equilibrium (HWE, P [...] 0.4), and admixture assessment using principal components (plotted with HapMap reference panel, Figure S4). Additional quality control measures included test of missing by haplotype (P [...] 10−8) and plate effects (P [...]

[...] 0.6) with top SNP are denoted with blue dots, and final meta-analysis P values (discovery+phase 2 results) as red triangles. Q-Q plots (panels B and D) evaluated inflation of the GWAS results and show the expected versus observed P values; the diagonal line is the line of identity. The inflation factor λ for the genomic control is indicated in the Q-Q plots. (TIF)
Figure S2 Box and whisker plots of normalized ERBB4 expression intensities in glomerulus (A,B) and tubulointerstitium (C,D) by genotype showing eQTL associations in tubulointerstitium. Both SNPs show significant eQTL associations in tubulointerstitial kidney biopsies of Pima Indians with type 2 diabetes and DN (P = 0.018 for rs1718640, P = 0.024 for rs17418814; linear regression using additive model). Association remained significant for rs17418640 when the subject with homozygous minor allele was excluded (P = 0.043). Associations with glomerular expression are not significant. Gene expression in kidneys was evaluated with Affy HGU-133A custom CDF probesets annotated to RefSeq transcripts NM_005235 and NM_001042599, and SNPs were genotyped with Affy 6.0 genotyping platform. Conditional analysis indicates rs17418814 is dependent on rs1718640 (P = 0.95 conditioned on rs1718640, versus rs1718640 P = 0.48 conditioned on rs17418814). Both SNPs lie within the same intron of the ERBB4 gene as rs7588550 that was suggestively associated with DN. (TIF)
Figure S3 Longitudinal analyses in FinnDiane for rs7583877 (AFF3) and rs12437854 (chromosome 15q26). Analyses assume an additive model of the SNP effects. The plotted survival curves have been truncated at the point at which fewer than five participants remained with the corresponding genotype. The genotype legend in each figure indicates the number of samples with the corresponding genotype, shown in parentheses. The P-value is indicated for the nominally significant associations (P<0.05). ns = not significant. The bottom part of each figure indicates the number of samples at risk at ten-year intervals. (TIF)
Figure S4 Rooted Principal Component Analysis of the discovery cohorts. The first two principal components (PC1 and PC2) are shown for (A) UK-ROI, (B) FinnDiane and (C) GoKinD US. Principal Component Analysis was calculated with EIGENSTRAT software including CEU, YRI and CBT from HapMap II as reference samples. (TIF)
Table S1 Top ranked SNPs selected for DN, ESRD vs. non ESRD, and ESRD vs. normoalbuminuria phenotypes. (DOC)
Table S2 Clinical characteristics and information on genotyping of the phase two cohorts. (XLS)
Table S3 Gene ontology analysis of all genes within ±1 Mbp of top GWAS signals: rs7583877/AFF3; rs12437854/15q26; rs7588550/ERBB4. (DOC)
Table S4 Gene expression in early DN versus living donor kidney biopsies. All genes within a 2 Mb window (1 Mb upstream and downstream) of the three main signals (rs7583877/AFF3, rs12437854/15q26, rs7588550/ERBB4) were studied. (DOC)
Table S5 Significantly enriched pathways (Genomatix Pathway System) for the ERBB4-correlated genes in early diabetic nephropathy. (DOC)
Table S6 Cross-sectional and longitudinal analyses in FinnDiane for rs7583877 (AFF3) and rs12437854 (chromosome 15q26). (DOC)
Table S7 Additional kidney phenotype analysis results for the three main loci. (DOC)
Table S8 P-value for association with DN related traits for the main signals after combined meta-analysis of DN and ESRD phenotypes. A1 is associated with increasing risk of ESRD/DN. (DOC)
Table S9 GENIE GWAS associations for SNPs that have been previously associated with T1D or chronic kidney disease. (DOC)
Table S10 Gene set enrichment analysis with MAGENTA. Gene sets with nominal P-value<0.01 for the three analyzed phenotypes. (DOC)
Table S11 Number of patients included in the study. (DOC)
Table S12 Quality control and filtering for the discovery GWAS data. (DOC)
Table S13 Physicians and nurses participating in the collection of the FinnDiane study subjects. (DOC)
Table S14 The DCCT/EDIC Study Research Group. (DOC)
Text S1 Supplementary Methods: Detailed explanation of employed methods. (DOC)