Fine Mapping of Five Loci Associated with Low-Density Lipoprotein Cholesterol Detects Variants That Double the Explained Heritability

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Complex trait genome-wide association studies (GWAS) provide an efficient strategy for evaluating large numbers of common variants in large numbers of individuals and for identifying trait-associated variants. Nevertheless, GWAS often leave much of the trait heritability unexplained. We hypothesized that some of this unexplained heritability might be due to common and rare variants that reside in GWAS identified loci but lack appropriate proxies in modern genotyping arrays. To assess this hypothesis, we re-examined 7 genes ( APOE, APOC1, APOC2, SORT1, LDLR, APOB, and PCSK9) in 5 loci associated with low-density lipoprotein cholesterol (LDL-C) in multiple GWAS. For each gene, we first catalogued genetic variation by re-sequencing 256 Sardinian individuals with extreme LDL-C values. Next, we genotyped variants identified by us and by the 1000 Genomes Project (totaling 3,277 SNPs) in 5,524 volunteers. We found that in one locus ( PCSK9) the GWAS signal could be explained by a previously described low-frequency variant and that in three loci ( PCSK9, APOE, and LDLR) there were additional variants independently associated with LDL-C, including a novel and rare LDLR variant that seems specific to Sardinians. Overall, this more detailed assessment of SNP variation in these loci increased estimates of the heritability of LDL-C accounted for by these genes from 3.1% to 6.5%. All association signals and the heritability estimates were successfully confirmed in a sample of ∼10,000 Finnish and Norwegian individuals. Our results thus suggest that focusing on variants accessible via GWAS can lead to clear underestimates of the trait heritability explained by a set of loci. Further, our results suggest that, as prelude to large-scale sequencing efforts, targeted re-sequencing efforts paired with large-scale genotyping will increase estimates of complex trait heritability explained by known loci.

Author Summary

Despite the striking success of genome-wide association studies in identifying genetic loci associated with common complex traits and diseases, much of the heritable risk for these traits and diseases remains unexplained. A higher resolution investigation of the genome through sequencing studies is expected to clarify the sources of this missing heritability. As a preview of what we might learn in these more detailed assessments of genetic variation, we used sequencing to identify potentially interesting variants in seven genes associated with low-density lipoprotein cholesterol (LDL-C) in 256 Sardinian individuals with extreme LDL-C levels, followed by large scale genotyping in 5,524 individuals, to examine newly discovered and previously described variants. We found that a combination of common and rare variants in these loci contributes to variation in LDL-C levels, and also that the initial estimate of the heritability explained by these loci doubled. Importantly, our results include a Sardinian-specific rare variant, highlighting the need for sequencing studies in isolated populations. Our results provide insights about what extensive whole-genome sequencing efforts are likely to reveal for the understanding of the genetic architecture of complex traits.

Related collections

Most cited references 24

Record: found
Abstract: found
Article: not found

Newly identified loci that influence lipid concentrations and risk of coronary artery disease.

Cristen J Willer, Serena Sanna, Anne U. Jackson … (2008)

To identify genetic variants influencing plasma lipid concentrations, we first used genotype imputation and meta-analysis to combine three genome-wide scans totaling 8,816 individuals and comprising 6,068 individuals specific to our study (1,874 individuals from the FUSION study of type 2 diabetes and 4,184 individuals from the SardiNIA study of aging-associated variables) and 2,758 individuals from the Diabetes Genetics Initiative, reported in a companion study in this issue. We subsequently examined promising signals in 11,569 additional individuals. Overall, we identify strongly associated variants in eleven loci previously implicated in lipid metabolism (ABCA1, the APOA5-APOA4-APOC3-APOA1 and APOE-APOC clusters, APOB, CETP, GCKR, LDLR, LPL, LIPC, LIPG and PCSK9) and also in several newly identified loci (near MVK-MMAB and GALNT2, with variants primarily associated with high-density lipoprotein (HDL) cholesterol; near SORT1, with variants primarily associated with low-density lipoprotein (LDL) cholesterol; near TRIB1, MLXIPL and ANGPTL3, with variants primarily associated with triglycerides; and a locus encompassing several genes near NCAN, with variants strongly associated with both triglycerides and LDL cholesterol). Notably, the 11 independent variants associated with increased LDL cholesterol concentrations in our study also showed increased frequency in a sample of coronary artery disease cases versus controls.

0 comments Cited 480 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

Genotype imputation.

Yun Li, Cristen Willer, Serena Sanna … (2009)

Genotype imputation is now an essential tool in the analysis of genome-wide association scans. This technique allows geneticists to accurately evaluate the evidence for association at genetic markers that are not directly genotyped. Genotype imputation is particularly useful for combining results across studies that rely on different genotyping platforms but also increases the power of individual scans. Here, we review the history and theoretical underpinnings of the technique. To illustrate performance of the approach, we summarize results from several gene mapping studies. Finally, we preview the role of genotype imputation in an era when whole genome resequencing is becoming increasingly common.

0 comments Cited 377 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: found

Is Open Access

Rare Variants Create Synthetic Genome-Wide Associations

Samuel P. Dickson, Kai Wang, Ian D. Krantz … (2010)

Introduction Efforts to fine map the causal variants responsible for genome-wide association studies (GWAS) signals have been largely predicated on the common disease common variant theory, postulating a common variant as the culprit for observed associations. This has led to extensive resequencing efforts that have been largely unsuccessful [1]–[5]. Here, we explore the possibility that part of the reason for this may be that the disease class causing an observed association may consist of multiple low-frequency variants across large regions of the genome—a phenomenon we call synthetic association. For convenience, these less common variants will be referred to here as “rare,” but we emphasize that we use this term loosely, only to refer to variants less common than those routinely studied in GWAS. The basic idea of how synthetic associations emerge in this model is illustrated in Figure 1, which shows how rare variants, by chance, can occur disproportionately in some parts of a gene genealogy. Any variant “higher up in the genealogy” that partitions those parts of the genealogy containing more disease variants than average will be identified as disease-associated. It is well appreciated that a noncausal variant will show association with a causal variant if the two are in strong linkage disequilibrium (LD). We use the previously introduced term synthetic association [6], however, to describe how such indirect association can occur between a common variant and at least one and possibly many rarer causal variants. Using the term synthetic as opposed to indirect emphasizes that the properties of the association signal are very different when the responsible variant or variants are much less frequent than the marker that carries the signal, as we detail below. 10.1371/journal.pbio.1000294.g001 Figure 1 Example genealogies showing causal variants and the strongest association for a common variant. (A) A genealogy with 10,000 original haplotypes was generated with 3,000 cases and 3,000 controls, genotype relative risk (γ) = 4, and nine causal variants. The branches containing the strongest synthetic association are indicated in blue. The branches containing the rare causal variants are in red. (B) A second genealogy was generated using the same parameters. These genealogies demonstrate two scenarios with genome-wide significant synthetic associations: the first (upper genealogy) had a high risk allele frequency (RAF = 0.49), and the second (lower genealogy) had a low RAF (0.08). To assess the tendency of rare disease-causing variants to create synthetic signals of association that are credited to single polymorphisms that are much more common in the population than the causal variants, we have simulated 10,000 haplotypes based on a coalescent model in a region either with or without recombination (Materials and Methods). We assumed that gene variants that influence disease have an allele frequency between 0.005 and 0.02, which is generally below the range of reliable detection (either by inclusion or indirect representation) using the genome-wide association platforms currently in use. We assumed a baseline probability of disease of φ for individuals with none of the rare genetic risk factors. The presence of at least one rare risk allele at the locus increased the probability of disease from φ to γ. We considered two values of φ (0.01, 0.1) and chose values of the penetrance γ such that the genotypic relative risk (GRR) of the rare causal variants varied incrementally between 2 and 6, where GRR is the ratio γ/φ. These values were chosen to explore the space around a GRR of 4, a threshold above which consistent linkage signals would be expected [7]. We simulated scenarios with one, three, five, seven, and nine rare causal variants. Results Across the conditions we have studied, not only is it possible to achieve genome-wide significance for common variants when one or more rare variants are the only contributors to disease, it is often the likely outcome (Figure 2). Overall, 30% of the simulations were able to detect an association with a common SNP at genome-wide significance (p 5%, Hardy-Weinberg equilibrium p-value >1×10−6, SNP call rate >95%), using the PLINK software [40]. For the sickle cell anemia GWAS, we compared 194 cases and 7,407 controls of inferred African ancestry via multidimensional scaling, with a genomic control inflation factor of 1.01. For hearing loss, we performed a GWAS on 418 cases and 6,892 control subjects, all of whom were of genetically inferred European ancestry via multidimensional scaling, with a genomic control inflation factor of 1.02.

0 comments Cited 290 times – based on 0 reviews      Review now

Bookmark

All references

Author and article information

Contributors

: Role: Editor

Journal

Journal ID (nlm-ta): PLoS Genet

Journal ID (publisher-id): plos

Journal ID (pmc): plosgen

Title: PLoS Genetics

Publisher: Public Library of Science (San Francisco, USA )

ISSN (Print): 1553-7390

ISSN (Electronic): 1553-7404

Publication date Collection: July 2011

Publication date (Print): July 2011

Publication date (Electronic): 28 July 2011

Volume: 7

Issue: 7

Electronic Location Identifier: e1002198

Affiliations

[1 ]Istituto di Ricerca Genetica e Biomedica, Consiglio Nazionale delle Ricerche (CNR), Monserrato, Italy

[2 ]Department of Biostatistics, Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America

[3 ]Dipartimento di Scienze Biomediche, Università di Sassari, Sassari, Italy

[4 ]Shardna Life Sciences, Pula, Italy

[5 ]Gene Expression and Genomics Unit, Research Resources Branch, National Institute on Aging, Baltimore, Maryland, United States of America

[6 ]Department of Community Medicine, Faculty of Health Sciences, University of Tromso, Tromso, Norway

[7 ]Department of Medicine, University of Eastern Finland, Kuopio, Finland

[8 ]Department of Public Health, Faculty of Medicine, Norwegian University of Science and Technology, Trondheim, Norway

[9 ]Diabetes Prevention Unit, National Institute for Health and Welfare, Helsinki, Finland

[10 ]Hjelt Institute, Department of Public Health, University of Helsinki, Helsinki, Finland

[11 ]South Ostrobothnia Central Hospital, Seinajoki, Finland

[12 ]Institute of Biomedicine/Physiology, University of Eastern Finland, Kuopio, Finland

[13 ]Kuopio Research Institute of Exercise Medicine, Kuopio, Finland

[14 ]Laboratory of Genetics, National Institute on Aging, Baltimore, Maryland, United States of America

Georgia Institute of Technology, United States of America

Author notes

* E-mail: goncalo@ 123456umich.edu (GRA); serena.sanna@ 123456inn.cnr.it (SS)

Conceived and designed the experiments: SS GRA DS. Performed the experiments: AM WHW RN GU. Analyzed the data: SS BL CS AUJ. Contributed reagents/materials/analysis tools: MU DS GRA. Wrote the paper: SS GRA. Contributed to genotyping chip design: BL HMK GRA. Collected phenotypes: MGP. Provided replication samples: GM AS FS MAP IN ML KH JT MB TAL RR. Provided critical revisions: BL AM FC MU DS RN MB.

¶ These authors also contributed equally to this work.

Article

Publisher ID: PGENETICS-D-11-00557

DOI: 10.1371/journal.pgen.1002198

PMC ID: 3145627

PubMed ID: 21829380

SO-VID: 7a3b957a-b34e-4247-8199-b8c208762e6e

Copyright © Sanna et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received : 18 March 2011

Date accepted : 7 June 2011

Page count

Pages: 10

Comments

Comment on this article

scite_

Cited by 43

See all cited by

Fine Mapping of Five Loci Associated with Low-Density Lipoprotein Cholesterol Detects Variants That Double the Explained Heritability

Read this article at

Abstract

Author Summary

Related collections

Genome Engineering using CRISPR

Most cited references 24

Newly identified loci that influence lipid concentrations and risk of coronary artery disease.

Genotype imputation.

Rare Variants Create Synthetic Genome-Wide Associations

Author and article information

Contributors

Journal

Affiliations

Author notes

Article

History

Page count

Categories

Comments

Comment on this article

Similar content 15

Cited by 43