      Testing for an Unusual Distribution of Rare Variants


          Abstract

          Technological advances make it possible to use high-throughput sequencing as a primary discovery tool of medical genetics, specifically for assaying rare variation. Still, this approach faces the analytic challenge that the influence of very rare variants can only be evaluated effectively as a group. A further complication is that any given rare variant could have no effect, could increase risk, or could be protective. We propose here the C-alpha test statistic as a novel approach for testing for the presence of this mixture of effects across a set of rare variants. Unlike existing burden tests, C-alpha, by testing the variance rather than the mean, maintains consistent power when the target set contains both risk and protective variants. Through simulations and analysis of case/control data, we demonstrate good power relative to existing methods that assess the burden of rare variants in individuals.

          Author Summary

          Developments in sequencing technology now enable us to assay all genetic variation, much of which is extremely rare. We propose to test the distribution of rare variants we observe in cases versus controls. To do so, we present a novel application of the C-alpha statistic to test these rare variants. C-alpha aims to determine whether the set of variants observed in cases and controls is a mixture, such that some of the variants confer risk or protection or are phenotypically neutral. Risk variants are expected to be more common in cases; protective variants more common in controls. C-alpha is sensitive to this imbalance, regardless of its origin (risk, protective, or both), but is ideally suited for a mixture of protective and risk variants. Variation in APOB nicely illustrates a mixture, in that certain rare variants increase triglyceride levels while others decrease them. The hallmark feature of C-alpha is that it uses the distribution of variation observed in cases and controls to detect the presence of a mixture, thus implicating genes or pathways as risk factors for disease.
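
The idea of testing the variance rather than the mean can be illustrated with a minimal sketch. The formula is not given in this section, so the version below is an assumption: for each variant with n total copies, of which y are in cases, it compares the squared deviation (y - n*p0)^2 with its binomial expectation n*p0*(1 - p0), where p0 is the expected proportion of copies in cases under no association. The function name `c_alpha_z`, its arguments, and the normal approximation for the p-value are illustrative choices, and refinements such as special handling of singleton variants are omitted.

```python
import math


def binom_pmf(u, n, p):
    """Probability of u successes in n Bernoulli(p) trials."""
    return math.comb(n, u) * p**u * (1 - p) ** (n - u)


def c_alpha_z(case_counts, total_counts, p0):
    """Overdispersion (variance) test across a set of rare variants.

    case_counts[i]  : copies of variant i observed in cases (assumed input)
    total_counts[i] : copies of variant i in cases plus controls
    p0              : expected proportion of copies in cases under the null

    Returns (t, z, p_value), with a one-sided p-value from a normal
    approximation. This is a hedged sketch, not a reference implementation.
    """
    t = 0.0
    var = 0.0
    for y, n in zip(case_counts, total_counts):
        null_var = n * p0 * (1 - p0)
        # Excess of the squared deviation over its binomial expectation.
        t += (y - n * p0) ** 2 - null_var
        # Null variance of that excess, summed over all possible counts u.
        var += sum(
            ((u - n * p0) ** 2 - null_var) ** 2 * binom_pmf(u, n, p0)
            for u in range(n + 1)
        )
    if var == 0.0:
        raise ValueError("null variance is zero; the variant set is uninformative")
    z = t / math.sqrt(var)
    # Only excess variance (large positive z) counts as evidence of a mixture.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return t, z, p_value


# Two variants seen mostly in cases and two seen mostly in controls.
t, z, p = c_alpha_z([8, 0, 7, 1], [8, 8, 8, 8], p0=0.5)
print(t, z, p)
```

In this toy input, cases and controls each carry 16 of the 32 variant copies, so a comparison of total burden sees no difference, while the variance-based statistic is large because individual variants are strongly skewed in opposite directions.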

                Author and article information

                Contributors
                Role: Editor
                Journal
                Title: PLoS Genetics (PLoS Genet)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-7390, 1553-7404
                Published: 3 March 2011
                Volume: 7
                Issue: 3
                Article: e1001322
                Affiliations
                [1] The Center for Human Genetic Research, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [2] The Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
                [3] Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
                [4] Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [5] Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
                [6] Department of Clinical Sciences Malmö, Diabetes and Cardiovascular Diseases, Genetic Epidemiology CRC, University Hospital Malmö, Malmö, Sweden
                [7] Cardiovascular Research Center, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [8] Department of Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
                [9] Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, United States of America
                [10] Department of Statistics, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
                Baylor College of Medicine, United States of America
                Author notes

                Conceived and designed the experiments: BMN MAR BFV DA BD SMP KR MJD. Performed the experiments: BMN MAR KR MJD. Analyzed the data: BMN MAR KR MJD. Contributed reagents/materials/analysis tools: BMN MAR MOM SK SMP KR MJD. Wrote the paper: BMN MAR KR MJD. Commented on the manuscript and aided in the development of the method: BFV BD. Commented on the manuscript and aided in the development of the methodological idea: DA. Contributed APOB data: MOM SK.

                ¶ These authors also contributed equally to this work.

                Article
                Manuscript ID: 10-PLGE-RA-NV-2824R3
                DOI: 10.1371/journal.pgen.1001322
                PMCID: PMC3048375
                PMID: 21408211
                Neale et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 19 March 2010
                Accepted: 31 January 2011
                Page count
                Pages: 8
                Categories
                Research Article
                Computational Biology/Genomics
                Genetics and Genomics/Complex Traits
                Genetics and Genomics/Genetics of Disease
                Genetics and Genomics/Medical Genetics
                Mathematics/Statistics

                Genetics