Genome‐Wide Association Analysis Reveals Genetic Heterogeneity of Sjögren's Syndrome According to Ancestry

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Objective

The Sjögren's International Collaborative Clinical Alliance (SICCA) is an international data registry and biorepository derived from a multisite observational study of participants in whom genotyping was performed on the Omni2.5M platform and who had undergone deep phenotyping using common protocol‐directed methods. The aim of this study was to examine the genetic etiology of Sjögren's syndrome (SS) across ancestry and disease subsets.

Methods

We performed genome‐wide association study analyses using SICCA subjects and external controls obtained from dbGaP data sets, one using all participants (1,405 cases, 1,622 SICCA controls, and 3,125 external controls), one using European participants (585, 966, and 580, respectively), and one using Asian participants (460, 224, and 901, respectively) with ancestry adjustments via principal components analyses. We also investigated whether subphenotype distributions differ by ethnicity, and whether this contributes to the heterogeneity of genetic associations.

Results

We observed significant associations in established regions of the major histocompatibility complex (MHC), IRF5, and STAT4 ( P = 3 × 10 ⁻⁴², P = 3 × 10 ⁻¹⁴, and P = 9 × 10 ⁻¹⁰, respectively), and several novel suggestive regions (those with 2 or more associations at P < 1 × 10 ⁻⁵). Two regions have been previously implicated in autoimmune disease: KLRG1 ( P = 6 × 10 ⁻⁷ [Asian cluster]) and SH2D2A ( P = 2 × 10 ⁻⁶ [all participants]). We observed striking differences between the associations in Europeans and Asians, with high heterogeneity especially in the MHC; representative single‐nucleotide polymorphisms from established and suggestive regions had highly significant differences in the allele frequencies in the study populations. We showed that SSA/SSB autoantibody production and the labial salivary gland focus score criteria were associated with the first worldwide principal component, indicative of higher non‐European ancestry ( P = 4 × 10 ⁻¹⁵ and P = 4 × 10 ⁻⁵, respectively), but that subphenotype differences did not explain most of the ancestry differences in genetic associations.

Conclusion

Genetic associations with SS differ markedly according to ancestry; however, this is not explained by differences in subphenotypes.

Related collections

Most cited references 19

Record: found
Abstract: found
Article: not found

Quality control and quality assurance in genotypic data for genome-wide association studies.

Peter Kraft, Claire Harris, Xiuwen Zheng … (2010)

Genome-wide scans of nucleotide variation in human subjects are providing an increasing number of replicated associations with complex disease traits. Most of the variants detected have small effects and, collectively, they account for a small fraction of the total genetic variance. Very large sample sizes are required to identify and validate findings. In this situation, even small sources of systematic or random error can cause spurious results or obscure real effects. The need for careful attention to data quality has been appreciated for some time in this field, and a number of strategies for quality control and quality assurance (QC/QA) have been developed. Here we extend these methods and describe a system of QC/QA for genotypic data in genome-wide association studies (GWAS). This system includes some new approaches that (1) combine analysis of allelic probe intensities and called genotypes to distinguish gender misidentification from sex chromosome aberrations, (2) detect autosomal chromosome aberrations that may affect genotype calling accuracy, (3) infer DNA sample quality from relatedness and allelic intensities, (4) use duplicate concordance to infer SNP quality, (5) detect genotyping artifacts from dependence of Hardy-Weinberg equilibrium test P-values on allelic frequency, and (6) demonstrate sensitivity of principal components analysis to SNP selection. The methods are illustrated with examples from the "Gene Environment Association Studies" (GENEVA) program. The results suggest several recommendations for QC/QA in the design and execution of GWAS. (c) 2010 Wiley-Liss, Inc.

0 comments Cited 168 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

A simplified quantitative method for assessing keratoconjunctivitis sicca from the Sjögren's Syndrome International Registry.

John P Whitcher, Caroline H. Shiboski, Stephen Shiboski … (2010)

To describe, apply, and test a new ocular grading system for assessing keratoconjunctivitis sicca (KCS) using lissamine green and fluorescein. Prospective, observational, multicenter cohort study. The National Institutes of Health-funded Sjögren's Syndrome International Registry (called Sjögren's International Collaborative Clinical Alliance [SICCA]) is developing standardized classification criteria for Sjögren syndrome (SS) and is creating a biospecimen bank for future research. Eight SICCA ophthalmologists developed a new quantitative ocular grading system (SICCA ocular staining score [OSS]), and we analyzed OSS distribution among the SICCA cohort and its association with other phenotypic characteristics of SS. The SICCA cohort includes participants ranging from possibly early SS to advanced disease. Procedures include sequenced unanesthetized Schirmer test, tear break-up time, ocular surface staining, and external eye examination at the slit lamp. Using statistical analyses and proportional Venn diagrams, we examined interrelationships between abnormal OSS (>or=3) and other characteristics of SS (labial salivary gland [LSG] biopsy with focal lymphocytic sialadenitis and focus score >1 positive anti-SS A antibodies, anti-SS B antibodies, or both). Among 1208 participants, we found strong associations between abnormal OSS, positive serologic results, and positive LSG focus scores (P < .0001). Analysis of the overlapping relationships of these 3 measures defined a large group of participants who had KCS without other components of SS, representing a clinical entity distinct from the KCS associated with SS. This new method for assessing KCS will become the means for diagnosing the ocular component of SS in future classification criteria. We find 2 forms of KCS whose causes may differ. (c) 2010 Elsevier Inc. All rights reserved.

0 comments Cited 133 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

Meta-Analysis of Genome-Wide Association Studies in Celiac Disease and Rheumatoid Arthritis Identifies Fourteen Non-HLA Shared Loci

Alexandra Zhernakova, Eli Stahl, Gosia Trynka … (2011)

Introduction Autoimmune disorders, including rheumatoid arthritis (RA) and celiac disease (CD), affect about 5% of the population and have a complex genetic background. Family-based epidemiology studies suggest that there is a shared genetic basis between the two autoimmune diseases [1]. Recent genome-wide association studies (GWAS) have confirmed HLA and identified at least 26 other non-HLA genetic loci with common alleles associated to each disease (Table S1 and S2) [2], [3]. The strongest genetic risk factor is the HLA locus [2], [3], where different alleles confer risk of the two diseases. Six other risk loci outside of the HLA locus are shared between CD and RA and include MMEL/TNFRSF14 [2], [4], REL [2], [5], [6], ICOS-CTLA4 [2], [3], [5], [7], IL2-IL21 [2], [3], [8], [9], [10], TNFAIP3 [2], [3], [6], [11], and TAGAP [2], [3], [8], (Chen et al, submitted) (Table 1 and Figure 1). These shared risk loci have emerged by simple cross-comparison across published studies, rather than a rigorous and systematic analysis of an integrated dataset. Because of the nature of these reports, it is unknown whether the other CD and RA risk alleles confer risk of both diseases. Moreover, it is unknown whether there are additional shared risk alleles that have not yet been discovered in any one disease. 10.1371/journal.pgen.1002004.g001 Figure 1 Established CD and RA SNPs and their association across diseases. (A) Known CD SNPs in RA. The figure represents OR and CI for the established CD SNPs (p 0.9) in both diseases; in other three loci – CTLA4, REL and TAGAP – the most associated SNPs in CD and RA are not in strong LD with each other (r2 50,000 case-control samples. As more samples and SNPs are genotyped between these diseases, additional risk alleles will be discovered. Third, we did not attempt to fine-map the 26 established risk loci for both autoimmune diseases to determine if a single allele is responsible for risk in both autoimmune diseases. And fourth, we made no attempt to search for low-frequency or rare variants that are shared between RA and CD. Implementation of newer sequencing technologies will be required to search for rare risk variants. In summary, our study adds four novel loci to established RA and CD risk loci (CD247, UBE2L3, DDX6, and UBASH3A). It also adds four loci previously established in one or the other disease to the list of shared CD-RA risk loci (SH2B3, 8q24.2, STAT4, and TRAF1-C5). With six previously established CD-RA risk loci, there are now 14 shared CD-RA risk loci, out of 50 established loci for either of the two autoimmune diseases. We emphasize that these are conservative estimates of shared risk loci between the two diseases, as our study may be underpowered to detect common alleles of modest effect size, and we have not considered genetic models in which different alleles within one locus contribute to risk of the two diseases. In addition to the HLA associations, these shared risk loci clearly point to the critical role of antigen presentation via MHC class II molecules to the T-cell receptor, and subsequent activation and differentiation of T-cells in shared disease pathogenesis. Materials and Methods Ethics statement Institutional review boards at each collection site approved the study, and all individuals gave their informed consent. Sample collection CD GWAS dataset CD case-control GWA study included 15,283 individuals (4533 cases, 10750 controls) from 5 populations: Finnish (FIN) (674 cases, 647 controls), Italian (IT) (541 cases, 497 controls), Dutch (NL, 876 cases, 803 controls), and two collections from UK population, UK1 (737 cases, 2596 controls) and UK3 (1922 cases, 1849 controls) (described in details in[2]).The genotyping of all cohorts except UK1 cases was done on Illumina platforms including 550K SNPs (either Illumina Hap550, or Illumina 610 or 670 Quad, or Illumina 1.2 M). The genotyping of UK1 cases (n = 737) was done on Illumina 317K arrays. The subset of SNPs successfully genotyped on Illumina 550 and Quadr platforms, but not on Hap300 platform (n = 196860) was further imputed in the UK1 dataset, using Plink and HapMap Phase 2 European CEU founders as a reference panel [35]. RA GWAS dataset The RA meta-analysis includes 5,539 autoantibody positive RA cases and 17,231 controls of European ancestry as described previously [3]. This study comprises six GWAS case-control collections, genotyped on various platforms. The imputation was conducted on GWAS genotype data for each GWAS collection separately, using the IMPUTE software [36] and haplotype-phased HapMap Phase 2 European CEU founders as a reference panel. In total, 2.56 million SNPs were imputed. Identity by state (IBS) analysis was run on controls from both CD and RA GWAS datasets. The overlapping controls genotyped in both CD and RA datasets were excluded from the RA analysis. Replication cohorts The replication cohorts included 2,169 CD cases and 2,255 controls, and 2,845 antibody-positive RA cases and 4,944 controls. The CD replication cohorts included three case-control collections from Ireland, Italy and Poland; all collections were geographically matched and are described previously [2]. The five RA replication collections included (1) CCP or RF positive Dutch cases from Groningen and Nijmegen, together with geographically matched controls (Replication cohort 1, R1); (2) CCP positive white individuals from North America (Replication cohort 2, R2; this collection is called i2b2); (3) North American RF positive cases and controls matched on gender, age, and grandparental country of origin from the Genomics Collaborative Initiative (GCI, Replication cohort 3, R3); (4) CCP or RF positive Dutch cases and controls from Leiden University Medical Center (LUMC; Replication cohort 4, R4); and (5) CCP positive cases drawn from North American clinics and controls from the New York Cancer Project (together this collection is called NARAC-II), matched on ancestry informative markers data (Replication cohort 5, R5). All cohorts except i2b2 were described in detail in [3], whereas i2b2 is described in [37]. Summary information on these samples is presented in Table S11. Genotyping Replication analysis of 15 SNPs was performed on the Sequenom iPlex platform in three centers – (1) Broad institute (all CD cases and controls, and RA replication cohorts R1 and R2); (2) Celera Diagnostics (Alameda California, USA; RA replication cohort R3 and R4); and (3) National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS, RA replication cohort R5). (See Table S11 for details). If the SNPs could not be designed into the iPLEX pool, then a proxy SNP was included. Information on the iPLEX design, proxies and cohorts genotyped in different centers is presented in Tables S11 and S12. We excluded SNPs in each replication collection if they were missing >10% genotype data, 0.1 in the HapMap2 reference panel. Pairwise LD tables were generated from the HapMap2 release 24 phased haplotype data distributed with the IMPUTE software; r2 values were calculated for all SNPs within 1 Mb of each other. For a given analysis, the most strongly associated SNPs (after removing 1 Mb around known associated SNPs) in one disease were retained. We also filtered for SNPs with P 0.1) with any of the 1000 genome SNPs located within Illumina probe sequences. Supporting Information Figure S1 Manhattan plots for GWAS analysis. Manhattan plot for GWAS in CD (A), RA (B), CD-RA with directional meta-analysis (C) and CD-RA with opposite allelic effect (D). Figure E is a combined picture of the four analyses. Yellow – CD; blue – RA; green – CD-RA directional; purple – CD-RA opposite allele. In figure E, the 14 shared CD-RA loci are annotated. The HLA locus (chr6: 20–40 MB) is excluded from the analysis. In RA, three SNPs in PTPN22 locus are associated at p<10-24 and therefore are excluded from the plot for the purpose of scale. The PTPN22 SNPs excluded from RA plot are: rs2476601 P(RA) = 3.6×10−68, rs2358994 P(RA) = 8.2×10−31, and rs1230661 P(RA) = 6.1×10−25. (PDF) Click here for additional data file. Figure S2 Cis eQTL genotype – expression correlation analyses in associated SNPs. Individual level gene expression data (residual variance after Transcriptional Components removed) from 1469 PAXgene samples. Spearman Rank Correlation coefficients and P values are shown for HT-12 and Ref. 8 data and for meta-analysis results. Right Y axis, average expression rank, is a measure of how strongly the tested probe is expressed amongst all probes in the dataset. Unannotated probes are manually plotted and localized to the following transcripts: Probe 2810202 – PHF19, Probe 6980470 – TMPRSS3, Probe 1230242 – UBE2L3. (PDF) Click here for additional data file. Table S1 Established CD SNPs and their association to RA. Candidate genes in the blocks are mentioned. Top P-value in CD is indicated as in the reference paper [2]. CD_P column indicates the p-value in CD-GWAS dataset (4,533 cases and 10,750 controls); RA_P column indicates the p-value in RA-GWAS dataset (5,539 cases and 17,231 controls). OR is given for the minor allele. (DOC) Click here for additional data file. Table S2 Established RA SNPs and their association to CD. Candidate genes in the blocks are mentioned. Top P-value in RA is indicated as in the reference papers [3], [4], [5], [10], [11], [39], [40], [41], [42], [43]. CD_P column indicates the p-value in CD-GWAS dataset; RA_P column indicates the p-value in RA-GWAS dataset; OR is given for the minor allele. The CD results for RA SNPs that were not genotyped in CD GWAS (not present on Illumina Hap550 genotyping array), are either imputed or estimated from the best genotyped proxy SNP (indicated in column “genotyped/imputed”). When proxies were used, the r2 with RA SNP is indicated in column ‘r2 for proxy’. For imputed SNPs the imputation score is annotated in column ‘Imp. score’. * - the perfect proxy rs13017599 was genotyped in the reference paper. + - the association in TNFRSF14 does not reach P<5×10−8, however this locus was included as RA-established locus as it was implicated in several independent studies [3], [39], [41]. (DOC) Click here for additional data file. Table S3 Distribution of CD associated SNPs (P<0.001) in RA dataset and RA associated SNPs (P<0.001) in CD datasets. 3a. Goodness of fit tests of no association, using Fisher's Rule for combining P-values. 3b Results of Kolmogorov-Smirnov and Wilcoxon rank sum tests for non-random distribution of associated SNPs across diseases. N – number of SNPs associated to CD and RA at p<0.001 after LD-pruning. Df – degrees of freedom. (DOC) Click here for additional data file. Table S4 SNPs associated to CD-RA with P<1×10-5 and CD and RA with P<0.01; directional analysis. The table includes all SNPs associated to CD-RA in directional GWAS meta-analysis with P(combined)<1×10-5 (column ‘CD-RA_P dir’) and P-value per diseases P<0.01 (columns ‘CD_P’ and ‘RA_P’). OR is given for the reference (minor) allele. (DOC) Click here for additional data file. Table S5 CD-RA meta-analysis results, directional analysis – one top SNP per loci. One most associated SNP per locus is indicated. OR is given for the reference (minor) allele. Cut-off for the P-values is the same as in Table S4. SNPs are sorted for the P-value in CDRA meta-analysis. Loci established in both diseases are indicated in bold. (DOC) Click here for additional data file. Table S6 SNPs associated to CD-RA with P<1×10-5 and CD and RA with P<0.01; analysis of opposite allelic effect. The table includes all SNPs associated to CD-RA in GWAS meta-analysis of opposite alleleic effect with P(combined) P<1×10-5 (column ‘CD-RA_P_opp’) and P-value per diseases P<0.01 (columns CD_P and RA_P). (DOC) Click here for additional data file. Table S7 CDRA meta-analysis results, opposite allelic effect – one top SNP per loci. One most associated SNP per locus is indicated. OR is given for the reference (minor) allele. Cut-off for the P-values is the same as in Table S6. SNPs are sorted for the P-value in CD-RA meta-analysis. Loci established in both diseases are indicated in bold. (DOC) Click here for additional data file. Table S8 GRAIL analysis of associated loci. The P-value for each candidate gene is based on the number of relationships to other associated genes listed in the third column. GRAIL is available at http://www.broadinstitute.org/mpg/grail/. (DOC) Click here for additional data file. Table S9 Shared CD-RA risk variants correlated with cis gene expression. ‘HT-12’ comprise 1240 individuals with blood gene expression assayed using Illumina Human HT-12v3 arrays, ‘Ref-8v2’ comprise 229 individuals with blood gene expression assayed using Illumina Human-Ref-8v2 arrays. ASpearman rank correlation of genotype and residual variance in transcript expression. Meta-analysis eQTL P value shown if both datasets had identical probes. See Figure S2 for detailed results and Materials and Methods for sample information and references. (DOC) Click here for additional data file. Table S10 Characteristics of 23 candidate genes in shared loci. (DOC) Click here for additional data file. Table S11 Information on replication cohorts. Case-control collections included to the replication step in CD (top panel) and RA (bottom panel). For each collection, we list the source of controls, geographic origin, autoantibody status of RA cases, numbers of cases and controls, genotyping platform and genotyping center, and the strategy used to correct for case-control population stratification. See Materials and Methods for additional details. NIAMS – National Institute of Arthritis and Musculoskeletal and Skin Diseases. (DOC) Click here for additional data file. Table S12 Information on proxies used in three iPLEX pools. R2 - r2 between GWAS and SNP pools used in replication step (HapMap CEU, as calculated in SNAP (http://www.broadinstitute.org/mpg/snap/)). A – iPLEX pool used in Broad institute (BI) was used to genotype all replication cohorts for celiac disease and replication cohorts R1 and R2 in rheumatoid arthritis, as indicated in Table S11. (DOC) Click here for additional data file.

0 comments Cited 114 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Contributors

Lindsey A. Criswell: Lindsey.Criswell@ucsf.edu

Journal

Journal ID (nlm-ta): Arthritis Rheumatol

Journal ID (doi): 10.1002/(ISSN)2326-5205

Journal ID (publisher-id): ART

Title: Arthritis & Rheumatology (Hoboken, N.j.)

Publisher: John Wiley and Sons Inc. (Hoboken )

ISSN (Print): 2326-5191

ISSN (Electronic): 2326-5205

Publication date (Electronic): 09 May 2017

Publication date (Print): June 2017

Volume: 69

Issue: 6 ( doiID: 10.1002/art.v69.6 )

Pages: 1294-1305

Affiliations

[ ¹ ] University of California San Francisco

[ ² ] University of Washington Seattle

[ ³ ] Johns Hopkins University Baltimore Maryland

[ ⁴ ] Guy's, King's, and St. Thomas' Dental Institute, King's College London London UK

[ ⁵ ] University of Buenos Aires Buenos Aires Argentina

[ ⁶ ] Rigshospitalet Copenhagen Denmark

[ ⁷ ] Aravind Eye Hospital Madurai India

[ ⁸ ] Kanazawa Medical University Ishikawa Japan

[ ⁹ ] University of Pennsylvania Philadelphia

[ ¹⁰ ] Peking Union Medical College Hospital Beijing China

Author notes

[*] [* ]Address correspondence to Lindsey A. Criswell, MD, MPH, DSc, 513 Parnassus Avenue S‐857, University of California, San Francisco, San Francisco, CA 94143. E‐mail: Lindsey.Criswell@ 123456ucsf.edu .

Article

Publisher ID: ART40040

DOI: 10.1002/art.40040

PMC ID: 5449251

PubMed ID: 28076899

SO-VID: aa1fea7e-e24b-49ae-9a20-fc36ab47df28

License:

This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.

History

Date received : 21 September 2016

Date accepted : 05 January 2017

Page count

Figures: 3, Tables: 3, Pages: 12, Words: 9586

Funding

Funded by: Center for Inherited Disease Research (CIDR)

Funded by: 1 X01 HG005274‐01

Funded by: National Institutes of Health

Award ID: HHSN268200782096C

Funded by: Gene Environment Association Studies Coordinating Center

Award ID: U01HG004446

Award ID: P01‐CA089392

Funded by: University of Wisconsin Transdisciplinary Tobacco Use Research Center

Award ID: P50‐DA019706

Award ID: P50‐A084724

Custom metadata

source-schema-version-number 2.0

component-id art40040

cover-date June 2017

details-of-publishers-convertor Converter:WILEY_ML3GV2_TO_NLMPMC version:version=5.5.3 mode:remove_FC converted:06.12.2018

Genome‐Wide Association Analysis Reveals Genetic Heterogeneity of Sjögren's Syndrome According to Ancestry

Read this article at

Abstract

Objective

Methods

Results

Conclusion

Related collections

Cures Within Reach: Autoimmune Lymphoproliferative Syndrome (ALPS)

Most cited references 19

Quality control and quality assurance in genotypic data for genome-wide association studies.

A simplified quantitative method for assessing keratoconjunctivitis sicca from the Sjögren's Syndrome International Registry.

Meta-Analysis of Genome-Wide Association Studies in Celiac Disease and Rheumatoid Arthritis Identifies Fourteen Non-HLA Shared Loci

Author and article information

Contributors

Journal

Affiliations

Author notes

Article

History

Page count

Funding

Categories

Custom metadata

Comments

Comment on this article

Similar content 452

Cited by 38

Most referenced authors 782