      Is Open Access

      All SNPs Are Not Created Equal: Genome-Wide Association Studies Reveal a Consistent Pattern of Enrichment among Functionally Annotated SNPs

      research-article

          Abstract

          Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate (TDR = 1−FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate across genic categories. Applying a well-established sFDR methodology we demonstrate the utility of stratification for improving power of GWAS in complex phenotypes, with increased rejection rates from 20% in height to 300% in schizophrenia with traditional FDR and sFDR both fixed at 0.05. Our analyses demonstrate an inherent stratification among GWAS SNPs with important conceptual implications that can be leveraged by statistical methods to improve the discovery of loci.
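The stratification idea in the abstract can be illustrated with a minimal sketch. It applies the Benjamini-Hochberg step-up procedure separately within each annotation stratum and compares the result with a single pooled analysis at the same FDR level of 0.05. This is not the authors' pipeline: it ignores linkage disequilibrium weighting, and the function names (`bh_reject`, `stratified_bh`), the stratum labels, and the toy p-values are all assumptions made up for illustration.

```python
import numpy as np

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up: boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    if m == 0:
        return np.zeros(0, dtype=bool)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank passing its threshold
        reject[order[:k + 1]] = True
    return reject

def stratified_bh(pvals, strata, alpha=0.05):
    """Run BH separately within each stratum at the same FDR level."""
    p = np.asarray(pvals, dtype=float)
    s = np.asarray(strata)
    reject = np.zeros(p.size, dtype=bool)
    for label in np.unique(s):
        idx = np.nonzero(s == label)[0]
        reject[idx] = bh_reject(p[idx], alpha)
    return reject

# Toy data (invented): an enriched stratum and a depleted one.
pvals = [0.001, 0.01, 0.02, 0.03,
         0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
strata = ["utr5"] * 4 + ["intergenic"] * 8

pooled = bh_reject(pvals)
stratified = stratified_bh(pvals, strata)
print("pooled rejections:", int(pooled.sum()))
print("stratified rejections:", int(stratified.sum()))
```

On this toy input the pooled analysis is diluted by the many null-like intergenic p-values, while the stratified analysis rejects more hypotheses in the enriched stratum. This mirrors, in simplified form, the increase in rejection rates the abstract reports.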

          Author Summary

          Modern genome-wide association studies (GWAS) have failed to identify large portions of the genetic basis of common, complex traits. Recent work suggested this could be because many genetic variants, each with individually small effects, compose their genetic architecture, limiting the power of GWAS. Moreover, these variants appear more abundantly in and near genes. Using genome annotations, summary statistics from several of the largest GWAS, and established statistical methods for quantifying distributions of test statistics, we show a consistency across studies. Namely, we show that, across all assessed traits, the test statistics resulting from SNPs that are related to the 5′ UTR of genes show the largest abundance of associations, while SNPs related to exons and the 3′ UTR are also enriched. SNPs related to introns are only moderately enriched, and intergenic SNPs show a depletion of associations relative to the average SNP. This enrichment corresponds directly to increased replication across independent samples and can be incorporated a priori into statistical methods to improve discovery and prediction. Our results contribute to ongoing debates about the functional nature of the genetic architecture of complex traits and point to avenues for leveraging existing GWAS data for discovery in future GWA and sequencing studies.
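One simple way to make "enrichment relative to the average SNP" concrete is a fold-enrichment ratio: the fraction of SNPs in a category with p-values below a threshold, divided by the same fraction across all SNPs. The sketch below is a simplified proxy, not the method used in the article; the function name `fold_enrichment`, the threshold, and the toy data are hypothetical.

```python
from collections import defaultdict

def fold_enrichment(pvals, categories, threshold=1e-3):
    """Per-category hit rate divided by the overall hit rate.

    A value above 1 indicates enrichment, below 1 indicates depletion.
    """
    if len(pvals) != len(categories) or not pvals:
        raise ValueError("pvals and categories must be non-empty and equal length")
    total_hits = sum(p <= threshold for p in pvals)
    if total_hits == 0:
        raise ValueError("no SNP passes the threshold; ratio is undefined")
    overall_rate = total_hits / len(pvals)

    counts = defaultdict(lambda: [0, 0])  # category -> [hits, total]
    for p, cat in zip(pvals, categories):
        counts[cat][1] += 1
        if p <= threshold:
            counts[cat][0] += 1
    return {cat: (hits / n) / overall_rate for cat, (hits, n) in counts.items()}

# Toy data (invented for illustration only).
toy_pvals = ([1e-5, 1e-4, 0.3, 0.6]                          # utr5: 2 of 4 hits
             + [1e-4, 0.2, 0.5, 0.9]                         # intron: 1 of 4 hits
             + [1e-4, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])    # intergenic: 1 of 8 hits
toy_cats = ["utr5"] * 4 + ["intron"] * 4 + ["intergenic"] * 8

enrichment = fold_enrichment(toy_pvals, toy_cats)
print(enrichment)
```

With these made-up numbers the 5′ UTR category comes out enriched, introns sit at the average, and intergenic SNPs are depleted. The ordering matches the qualitative pattern described in the summary, but the values carry no empirical meaning.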

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Genetics (PLoS Genet)
                Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 25 April 2013
                Volume 9, Issue 4, e1003449
                Affiliations
                [1] Cognitive Sciences Graduate Program, University of California San Diego, La Jolla, California, United States of America
                [2] Center for Human Development, University of California San Diego, La Jolla, California, United States of America
                [3] Multimodal Imaging Laboratory, University of California San Diego, La Jolla, California, United States of America
                [4] Department of Psychiatry, University of California San Diego, La Jolla, California, United States of America
                [5] Scripps Health, La Jolla, California, United States of America
                [6] The Scripps Translational Science Institute, The Scripps Research Institute, La Jolla, California, United States of America
                [7] Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, California, United States of America
                [8] Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
                [9] MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, United Kingdom
                [10] Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
                [11] Institute of Clinical Medicine, University of Oslo, Oslo, Norway
                [12] Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
                [13] Department of Cognitive Sciences, University of California San Diego, La Jolla, California, United States of America
                [14] Department of Radiology, University of California San Diego, La Jolla, California, United States of America
                [15] Department of Neurosciences, University of California San Diego, La Jolla, California, United States of America
                Georgia Institute of Technology, United States of America
                Author notes

                ¶ Memberships of the consortia are provided in the Acknowledgments.

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: AJS AMD WKT NJS OAA. Performed the experiments: AMD WKT AJS. Analyzed the data: AMD WKT AJS. Contributed reagents/materials/analysis tools: PP AT JCR JRK HF PFS MCO. Wrote the paper: AJS OAA WKT AMD.

                * Collaborator is also a member of The Schizophrenia Psychiatric Genomics Consortium.

                Article
                Manuscript ID: PGENETICS-D-12-02185
                DOI: 10.1371/journal.pgen.1003449
                PMCID: PMC3636284
                PMID: 23637621
                3f0fff9b-5101-4aa6-bf5f-fd89d77e73df
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 28 August 2012
                Accepted: 28 February 2013
                Page count
                Pages: 1
                Funding
                AJS was supported by NIH grants RC2DA029475 and R01HD061414 and by the Robert J. Glushko and Pamela Samuelson Graduate Fellowship. WKT was supported by NIA grant SR01AG022381-09. PP, AT, and NJS were supported in part by NIH/NCRR grant number UL1 RR025774. OAA was supported by the Research Council of Norway (183782/V50) and the South East Norway Health Authority (2010-074). AMD was supported by NIH grants RC2DA029475, R01EB000790, R01AG031224, R01AG022381, P50MH081755, P50NS022343, and U54NS056883. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Genome Analysis Tools
                Genome Scans
                Genome-Wide Association Studies
                Functional Genomics
                Genetics
                Human Genetics
                Genetic Association Studies
                Genome-Wide Association Studies
                Genomics
                Genome Analysis Tools
                Mathematics
                Statistics
                Biostatistics
                Statistical Methods

                Genetics
