      The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection

      Genetics
      Genetics Society of America


          Abstract

          <p class="first" id="d551100e175">Large-scale “omics” data are increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, analyzing high-dimensional molecular data poses considerable challenges, including the large number of potential molecular predictors, the limited number of samples, and the small effect of each predictor. We propose new Bayesian hierarchical generalized linear models (GLMs), called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for the coefficients that can induce weak shrinkage on large coefficients and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchical GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates the strengths of two popular methods, <i>i.e.</i>, the penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors and expression data for 4919 genes, and the ovarian cancer data set from TCGA with 362 tumors and expression data for 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package, BhGLM ( <a data-untrusted="" href="http://www.ssg.uab.edu/bhglm/" id="d551100e180" target="xrefwindow">http://www.ssg.uab.edu/bhglm/</a>). </p>
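
          The shrinkage behavior described in the abstract can be illustrated with a small sketch. It is not taken from the article or from BhGLM. It assumes the prior is a two-component mixture of zero-mean double-exponential densities, a narrow "spike" with scale s0 and a wide "slab" with scale s1 > s0, mixed with weight theta. The names s0, s1, theta, and all function names are hypothetical and chosen only for illustration. Under that assumption, the expected inverse scale given the current coefficient value acts as a per-coefficient L1 penalty weight, which is the kind of quantity an EM step could pass to a coordinate descent lasso solver.

```python
import math


def de_density(beta, scale):
    """Double-exponential (Laplace) density with mean 0 and the given scale."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def inclusion_probability(beta, theta, s0, s1):
    """Probability that beta belongs to the slab (scale s1) rather than
    the spike (scale s0), given mixing weight theta. Assumed form."""
    slab = theta * de_density(beta, s1)
    spike = (1.0 - theta) * de_density(beta, s0)
    return slab / (slab + spike)


def effective_penalty(beta, theta, s0, s1):
    """Expected inverse scale, used here as the L1 penalty weight.
    It lies between 1/s1 (weak shrinkage) and 1/s0 (strong shrinkage)."""
    p = inclusion_probability(beta, theta, s0, s1)
    return (1.0 - p) / s0 + p / s1


if __name__ == "__main__":
    # Illustrative values only; not from the article.
    theta, s0, s1 = 0.5, 0.04, 1.0
    for beta in (0.0, 0.05, 0.5, 2.0):
        p = inclusion_probability(beta, theta, s0, s1)
        w = effective_penalty(beta, theta, s0, s1)
        print(f"beta={beta:4.2f}  p_slab={p:.4f}  penalty={w:.4f}")
```

          With these illustrative settings, coefficients near zero receive a penalty weight close to 1/s0, while large coefficients receive a weight close to 1/s1, which matches the qualitative behavior the abstract attributes to the prior.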


                Author and article information

                Journal: Genetics
                Publisher: Genetics Society of America
                ISSN: 0016-6731, 1943-2631
                Dates: January 03 2017; January 2017; October 31 2016
                Volume: 205
                Issue: 1
                Pages: 77-88
                Type: Article
                DOI: 10.1534/genetics.116.192195
                PMC ID: 5223525
                PMID: 27799277
                © 2016