      A comparative analysis of family-based and population-based association tests using whole genome sequence data

      BMC Proceedings
      BioMed Central
      Genetic Analysis Workshop 18
      13-17 October 2012


          Abstract

          The revolution in next-generation sequencing has made it feasible to obtain high-quality sequence variants, both common and rare, across the entire genome. Because researchers now face the analytical challenge of handling massive amounts of genetic variant information from sequencing studies, numerous methods have been developed to assess the impact of both common and rare variants on disease traits. In this report, whole genome sequencing data from Genetic Analysis Workshop 18 were used to compare the power of several methods, under both family-based and population-based designs, to detect association between blood pressure and variants in the MAP4 gene region and on chromosome 3. To prioritize variants across the genome for testing, variants were first functionally assessed using prediction algorithms and expression quantitative trait loci (eQTL) data. Four set-based tests in the family-based association test (FBAT) framework (FBAT-v, FBAT-lmm, FBAT-m, and FBAT-l) were used to analyze 20 pedigrees, and 2 variance component tests, the sequence kernel association test (SKAT) and genome-wide complex trait analysis (GCTA), were applied to 142 unrelated individuals in the sample. Both set-based and variance-component-based tests had high power and an adequate type I error rate. Among the FBATs, FBAT-l performed best, indicating its potential for use in rare-variant analysis. The updated FBAT package is available at: http://www.hsph.harvard.edu/fbat/.
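To make the idea of a variance component test on unrelated individuals concrete, the following is a minimal sketch of a SKAT-style statistic, Q = sum_j w_j^2 (sum_i g_ij (y_i - mean(y)))^2, with weights that favor rarer variants. It is an illustration only and is not the authors' pipeline or the SKAT software: it assumes an intercept-only null model, uses an unnormalized Beta(1, 25) weight, and swaps in a permutation p-value for the analytic null distribution that SKAT uses. The function names, the simulated data, and the effect sizes are all hypothetical; the sample size of 142 simply mirrors the number of unrelated individuals mentioned in the abstract.

```python
import random


def beta_weight(maf, a=1.0, b=25.0):
    """Unnormalized Beta(a, b) density at maf; larger for rarer variants."""
    return maf ** (a - 1.0) * (1.0 - maf) ** (b - 1.0)


def skat_style_q(genotypes, phenotype):
    """Weighted sum of squared per-variant scores (intercept-only null model).

    genotypes: list of rows, one per individual, each a list of 0/1/2 counts.
    phenotype: list of quantitative trait values, one per individual.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        maf = sum(col) / (2.0 * n)  # frequency of the counted allele
        score = sum(g * r for g, r in zip(col, resid))
        q += (beta_weight(maf) * score) ** 2
    return q


def permutation_p_value(genotypes, phenotype, n_perm=500, seed=0):
    """Permutation p-value (assumption: stands in for SKAT's analytic p-value)."""
    rng = random.Random(seed)
    observed = skat_style_q(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_style_q(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def simulate(n=142, m=10, effect=0.0, seed=0):
    """Hypothetical data: m rare variants, trait = noise + effect * allele count."""
    rng = random.Random(seed)
    mafs = [rng.uniform(0.005, 0.05) for _ in range(m)]
    genotypes = [
        [(rng.random() < f) + (rng.random() < f) for f in mafs]
        for _ in range(n)
    ]
    phenotype = [rng.gauss(0.0, 1.0) + effect * sum(row) for row in genotypes]
    return genotypes, phenotype


if __name__ == "__main__":
    # effect=0.0 mimics the null (type I error); effect=1.0 mimics a true signal.
    for effect in (0.0, 1.0):
        g, y = simulate(effect=effect, seed=42)
        print("effect", effect, "p-value", permutation_p_value(g, y))
```

Repeating the simulation over many seeds and counting how often the p-value falls below a chosen threshold gives an empirical type I error rate (with `effect=0.0`) or an empirical power (with a nonzero effect), which is the kind of comparison the abstract describes. Because the weights enter only as a common scale factor, leaving the Beta density unnormalized does not change the permutation p-value.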


                Author and article information

                Journal: BMC Proceedings (BMC Proc)
                Publisher: BioMed Central
                ISSN: 1753-6561
                Published: 17 June 2014
                Volume 8, Suppl 1, article S33
                Affiliations
                [1] Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA
                [2] Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
                [3] Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
                [4] Division of Epidemiology and Biostatistics, College of Public Health, University of Arizona, Tucson, AZ 85724, USA
                Article
                Publisher ID: 1753-6561-8-S1-S33
                DOI: 10.1186/1753-6561-8-S1-S33
                PMCID: PMC4143682
                PMID: 25519381
                Copyright © 2014 Zhou et al.; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                Genetic Analysis Workshop 18
                Stevenson, WA, USA
                13-17 October 2012
                Categories: Proceedings; Medicine
