
      Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait

      Research article, BMC Bioinformatics, BioMed Central
      Keywords: gene-gene interaction, Lasso, overlapping group, survival prediction


          Abstract

          Background

          The development of a disease is a complex process that may result from the joint effects of multiple genes. In this article, we propose the overlapping group screening (OGS) approach for identifying active genes and gene-gene interactions by incorporating prior pathway information. OGS is designed to address two challenges in genome-wide data analysis: the number of genes and gene-gene interactions far exceeds the sample size, and pathways generally overlap with one another. We further apply OGS to predicting patient survival from gene expression data.
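To make the two challenges concrete, here is a minimal Python sketch of how pathway-defined groups can overlap and how quickly candidate interactions accumulate. It is an illustration only and not the authors' OGS procedure: the pathway names, gene names, and the choice to enumerate only within-pathway gene pairs are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy pathways; names and memberships are illustrative only.
pathways = {
    "pathway_A": {"g1", "g2", "g3"},
    "pathway_B": {"g3", "g4"},
}


def build_groups(pathways):
    """Return, per pathway, its genes and its within-pathway gene pairs.

    Restricting pairs to genes in the same pathway is an assumption of
    this sketch, not something stated in the abstract.
    """
    groups = {}
    for name, genes in pathways.items():
        ordered = sorted(genes)
        groups[name] = {
            "main": ordered,
            "pairs": list(combinations(ordered, 2)),
        }
    return groups


def membership(pathways):
    """Map each gene to the sorted list of pathways that contain it."""
    out = {}
    for name, genes in pathways.items():
        for gene in genes:
            out.setdefault(gene, []).append(name)
    return {gene: sorted(names) for gene, names in out.items()}


groups = build_groups(pathways)

# Genes that belong to more than one pathway make the groups overlap.
overlap = {g: p for g, p in membership(pathways).items() if len(p) > 1}

# With p genes there are p * (p - 1) / 2 possible pairwise interactions.
n_genes = len(set().union(*pathways.values()))
n_all_pairs = n_genes * (n_genes - 1) // 2
```

In this toy input, gene `g3` sits in both pathways, so any group-wise method has to handle a feature that is shared between groups. The pair count grows quadratically with the number of genes, which is why it outruns the sample size so quickly in genome-wide data.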

          Results

          Simulation studies show that OGS performs well in identifying the true main and interaction effects, and that OGS with the Lasso penalty predicts survival more accurately than the ordinary Lasso. In real data analyses, we identify several significant genes and/or epistatic interactions associated with clinical survival outcomes in diffuse large B-cell lymphoma (DLBCL) and non-small-cell lung cancer (NSCLC), using prior pathway information from the KEGG pathway and GO biological process databases, respectively.

          Conclusions

          The OGS approach is useful for selecting important genes and epistatic interactions in an ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than that of existing methods. The approach applies to various types of outcome data (quantitative, qualitative, and censored event time data) and to the corresponding regression models (e.g. linear, logistic, and Cox regression).
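For the censored event time case, a Cox model with main effects and pairwise interactions under a Lasso penalty is commonly written as below. The notation is an assumption chosen for illustration and is not taken from the article: $x_j$ is the expression of gene $j$, $\beta_j$ and $\gamma_{jk}$ are main and interaction coefficients, $\ell$ is the Cox log partial likelihood, and $\lambda \ge 0$ is the tuning parameter.

```latex
h(t \mid x) = h_0(t)\,
  \exp\Big( \sum_{j=1}^{p} \beta_j x_j
          + \sum_{1 \le j < k \le p} \gamma_{jk}\, x_j x_k \Big),
\qquad
(\hat\beta, \hat\gamma) = \arg\max_{\beta,\gamma}
  \Big\{ \ell(\beta, \gamma)
       - \lambda \Big( \sum_{j=1}^{p} |\beta_j|
                     + \sum_{1 \le j < k \le p} |\gamma_{jk}| \Big) \Big\}
```

With $p$ genes, this model has $p + p(p-1)/2$ coefficients. That count is the reason a screening step before the penalized fit is attractive when $p$ is far larger than the sample size.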

          Electronic supplementary material

          The online version of this article (10.1186/s12859-018-2372-2) contains supplementary material, which is available to authorized users.



                Author and article information

                Contributors
                jhwang@stat.sinica.edu.tw
                yhchen@stat.sinica.edu.tw
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                ISSN: 1471-2105
                Published: 21 September 2018
                Volume: 19
                Article number: 335
                Affiliations
                ISNI 0000 0001 2287 1366, GRID grid.28665.3f, Institute of Statistical Science, Academia Sinica, Nankang, Taipei, Taiwan
                Author information
                http://orcid.org/0000-0003-4038-9439
                Article
                2372
                10.1186/s12859-018-2372-2
                6150983
                30241463
                4befcd4d-4598-4025-aabb-1803dc57115d
                © The Author(s). 2018

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                Received: 29 March 2018
                Accepted: 12 September 2018
                Funding
                Funded by: FundRef http://dx.doi.org/10.13039/501100004663, Ministry of Science and Technology, Taiwan;
                Award ID: MOST 104-2118-M-001-006-MY3
                Categories
                Methodology Article
                Bioinformatics & Computational biology