      DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data

      research-article


          Abstract

          Multi-omics data are good resources for prognosis and survival prediction; however, these are difficult to integrate computationally. We introduce DeepProg, a novel ensemble framework of deep-learning and machine-learning approaches that robustly predicts patient survival subtypes using multi-omics data. It identifies two optimal survival subtypes in most cancers and yields significantly better risk-stratification than other multi-omics integration methods. DeepProg is highly predictive, exemplified by two liver cancer (C-index 0.73–0.80) and five breast cancer datasets (C-index 0.68–0.73). Pan-cancer analysis associates common genomic signatures in poor survival subtypes with extracellular matrix modeling, immune deregulation, and mitosis processes. DeepProg is freely available at https://github.com/lanagarmire/DeepProg
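The C-index values quoted in the abstract measure how well predicted risk orders patients by survival time. The block below is a minimal sketch of Harrell's concordance index for right-censored data, written only to illustrate the metric; it is not DeepProg's implementation. The function name `concordance_index`, the toy data, and the choice to skip pairs with tied observed times are all assumptions made for this example.

```python
from itertools import combinations
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (O(n^2) sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (e.g. death) was observed, 0 if censored
    risk_scores -- predicted risk; higher should mean shorter survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        if times[a] == times[b]:
            continue  # assumption: tied times are skipped for simplicity
        if not events[a]:
            continue  # shorter time is censored, so the pair cannot be ordered
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk went with the earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: five patients, two of them censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # 6 of 8 comparable pairs -> 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported ranges of 0.73–0.80 and 0.68–0.73 should be read.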

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s13073-021-00930-x.


                Author and article information

                Contributors
                lgarmire@med.umich.edu
                Journal
                Genome Medicine (Genome Med)
                Publisher: BioMed Central (London)
                ISSN: 1756-994X
                Published: 14 July 2021
                Volume: 13
                Article number: 112
                Affiliations
                [1] Current address: Computational Sciences, The Jackson Laboratory, 10 Discovery Drive, Farmington, Connecticut 06032, USA (GRID grid.249880.f; ISNI 0000 0004 0374 0039)
                [2] University of Hawaii Cancer Center, Honolulu, HI 96813, USA (GRID grid.410445.0; ISNI 0000 0001 2188 0957)
                [3] Current address: Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48105, USA (GRID grid.214458.e; ISNI 0000 0000 8683 7370)
                [4] Current address: Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1 Gustave L. Levy Pl, New York, NY 10029, USA (GRID grid.59734.3c; ISNI 0000 0001 0670 2351)
                [5] Current address: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA (GRID grid.25879.31; ISNI 0000 0004 1936 8972)
                Author information
                ORCID: http://orcid.org/0000-0002-4680-9484
                Article
                Publisher ID: 930
                DOI: 10.1186/s13073-021-00930-x
                PMCID: PMC8281595
                PMID: 34261540
                © The Author(s) 2021

                Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 11 January 2021
                Accepted: 25 June 2021
                Funding
                National Institutes of Health (FundRef http://dx.doi.org/10.13039/100000002)
                Award ID: HD084633
                U.S. National Library of Medicine (FundRef http://dx.doi.org/10.13039/100000092)
                Award IDs: LM012907, LM012373
                National Institute of Environmental Health Sciences (FundRef http://dx.doi.org/10.13039/100000066)
                Award ID: K01ES025434
                National Institute of General Medical Sciences (FundRef http://dx.doi.org/10.13039/100000057)
                Award ID: GM103457
                Categories
                Method
                Molecular medicine
                Keywords: survival, prognosis, multi-omics, cancer, ensemble learning, deep learning, machine learning
