7
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      BGWAS: Bayesian variable selection in linear mixed models with nonlocal priors for genome-wide association studies

      research-article
      , ,
      BMC Bioinformatics
      BioMed Central
      GWAS, Bayesian, Model selection

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Background

          Genome-wide association studies (GWAS) seek to identify single nucleotide polymorphisms (SNPs) that cause observed phenotypes. However, with highly correlated SNPs, correlated observations, and the number of SNPs being two orders of magnitude larger than the number of observations, GWAS procedures often suffer from high false positive rates.

          Results

          We propose BGWAS, a novel Bayesian variable selection method based on nonlocal priors for linear mixed models specifically tailored for genome-wide association studies. Our proposed method BGWAS uses a novel nonlocal prior for linear mixed models (LMMs). BGWAS has two steps: screening and model selection. The screening step scans through all the SNPs fitting one LMM for each SNP and then uses Bayesian false discovery control to select a set of candidate SNPs. After that, a model selection step searches through the space of LMMs that may have any number of SNPs from the candidate set. A simulation study shows that, when compared to popular GWAS procedures, BGWAS greatly reduces false positives while maintaining the same ability to detect true positive SNPs. We show the utility and flexibility of BGWAS with two case studies: a case study on salt stress in plants, and a case study on alcohol use disorder.

          Conclusions

          BGWAS maintains and in some cases increases the recall of true SNPs while drastically lowering the number of false positives compared to popular SMA procedures.

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s12859-023-05316-x.

          Related collections

          Most cited references28

          • Record: found
          • Abstract: found
          • Article: not found

          Genome-wide Efficient Mixed Model Analysis for Association Studies

          Linear mixed models have attracted considerable recent attention as a powerful and effective tool for accounting for population stratification and relatedness in genetic association tests. However, existing methods for exact computation of standard test statistics are computationally impractical for even moderate-sized genome-wide association studies. To deal with this several approximate methods have been proposed. Here, we present an efficient exact method that makes these approximations unnecessary in many settings. This method is roughly n times faster than the widely-used exact method EMMA, where n is the sample size, making exact genome-wide association analysis computationally practical for large numbers of individuals.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            A unified mixed-model method for association mapping that accounts for multiple levels of relatedness.

            As population structure can result in spurious associations, it has constrained the use of association studies in human and plant genetics. Association mapping, however, holds great promise if true signals of functional association can be separated from the vast number of false signals generated by population structure. We have developed a unified mixed-model approach to account for multiple levels of relatedness simultaneously as detected by random genetic markers. We applied this new approach to two samples: a family-based sample of 14 human families, for quantitative gene expression dissection, and a sample of 277 diverse maize inbred lines with complex familial relationships and population structure, for quantitative trait dissection. Our method demonstrates improved control of both type I and type II error rates over other methods. As this new method crosses the boundary between family-based and structured association samples, it provides a powerful complement to currently available methods for association mapping.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Efficient Bayesian mixed model analysis increases association power in large cohorts

              Linear mixed models are a powerful statistical tool for identifying genetic associations and avoiding confounding. However, existing methods are computationally intractable in large cohorts, and may not optimize power. All existing methods require time cost O(MN2) (where N = #samples and M = #SNPs) and implicitly assume an infinitesimal genetic architecture in which effect sizes are normally distributed, which can limit power. Here, we present a far more efficient mixed model association method, BOLT-LMM, which requires only a small number of O(MN)-time iterations and increases power by modeling more realistic, non-infinitesimal genetic architectures via a Bayesian mixture prior on marker effect sizes. We applied BOLT-LMM to nine quantitative traits in 23,294 samples from the Women’s Genome Health Study (WGHS) and observed significant increases in power, consistent with simulations. Theory and simulations show that the boost in power increases with cohort size, making BOLT-LMM appealing for GWAS in large cohorts.
                Bookmark

                Author and article information

                Contributors
                jwilliams@vt.edu
                xshuangshuang@vt.edu
                marf@vt.edu
                Journal
                BMC Bioinformatics
                BMC Bioinformatics
                BMC Bioinformatics
                BioMed Central (London )
                1471-2105
                11 May 2023
                11 May 2023
                2023
                : 24
                : 194
                Affiliations
                GRID grid.438526.e, ISNI 0000 0001 0694 4940, Department of Statistics, , Virginia Tech, ; Blacksburg, 24061 USA
                Article
                5316
                10.1186/s12859-023-05316-x
                10176706
                37170185
                d2445572-5b5b-4bc7-b956-e9e2cd8efc71
                © The Author(s) 2023

                Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                : 23 December 2022
                : 30 April 2023
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/100000001, National Science Foundation;
                Award ID: DMS 1853549
                Categories
                Research
                Custom metadata
                © BioMed Central Ltd., part of Springer Nature 2023

                Bioinformatics & Computational biology
                gwas,bayesian,model selection
                Bioinformatics & Computational biology
                gwas, bayesian, model selection

                Comments

                Comment on this article