
      NEMO: cancer subtyping by integration of partial multi-omic data

      research-article, Bioinformatics, Oxford University Press


          Abstract

          Motivation

          Cancer subtypes have usually been defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer and suggest more precise treatment for patients.

          Results

          We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization.
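
          The abstract does not spell out the computation, so the following is only a minimal sketch of one way a neighborhood-based similarity can be combined across omics when some patients lack some omics, without imputation. It assumes a plain Gaussian kernel, a neighborhood size `k`, and averaging over the omics in which both patients were measured; the function names (`gaussian_similarity`, `relative_similarity`, `integrate_omics`) and the toy data are hypothetical and are not taken from the NEMO code base. The final clustering step (for example, spectral clustering on the integrated matrix) is omitted.

          ```python
          import math


          def gaussian_similarity(x, y, sigma=1.0):
              """Gaussian kernel on squared Euclidean distance (an assumed kernel choice)."""
              d2 = sum((a - b) ** 2 for a, b in zip(x, y))
              return math.exp(-d2 / (2 * sigma ** 2))


          def relative_similarity(profiles, k):
              """Similarity of each patient pair relative to each patient's k nearest neighbors.

              profiles maps patient id -> feature vector for ONE omic; only patients
              measured in that omic appear in it.
              """
              patients = sorted(profiles)
              sim = {
                  p: {q: gaussian_similarity(profiles[p], profiles[q])
                      for q in patients if q != p}
                  for p in patients
              }
              # k most similar patients of p within this omic
              neighbors = {
                  p: set(sorted(sim[p], key=sim[p].get, reverse=True)[:k])
                  for p in patients
              }
              norm = {p: sum(sim[p][q] for q in neighbors[p]) for p in patients}
              rel = {}
              for p in patients:
                  for q in patients:
                      if p == q:
                          continue
                      value = 0.0
                      if q in neighbors[p] and norm[p] > 0:
                          value += sim[p][q] / norm[p]
                      if p in neighbors[q] and norm[q] > 0:
                          value += sim[q][p] / norm[q]
                      rel[(p, q)] = value
              return rel


          def integrate_omics(omics, k):
              """Average relative similarity over the omics in which BOTH patients were measured."""
              per_omic = [relative_similarity(profiles, k) for profiles in omics]
              patients = sorted(set().union(*omics))
              integrated = {}
              for p in patients:
                  for q in patients:
                      if p == q:
                          continue
                      values = [rel[(p, q)] for rel in per_omic if (p, q) in rel]
                      if values:  # pairs never measured together stay undefined
                          integrated[(p, q)] = sum(values) / len(values)
              return integrated


          # Toy partial data: patient D lacks methylation, patient E lacks expression.
          expression = {"A": [0.0, 0.1], "B": [0.1, 0.0], "C": [2.0, 2.1], "D": [2.1, 2.0]}
          methylation = {"A": [0.0], "B": [0.1], "C": [1.0], "E": [1.1]}
          integrated = integrate_omics([expression, methylation], k=1)
          ```

          In this toy setting, patients D and E share no omic, so their pair has no entry in `integrated`; every pair that was measured together in at least one omic gets a value without any imputation step.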

          Availability and implementation

          Code for NEMO and for reproducing all NEMO results in this paper is available on GitHub: https://github.com/Shamir-Lab/NEMO.

          Supplementary information

          Supplementary data are available at Bioinformatics online.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                ISSN: 1367-4803, 1367-4811
                15 September 2019
                30 January 2019
                30 January 2019
                Volume: 35
                Issue: 18
                Pages: 3348-3356
                Affiliations
                Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
                Author notes
                To whom correspondence should be addressed. E-mail: rshamir@tau.ac.il
                Article
                Publisher ID: btz058
                DOI: 10.1093/bioinformatics/btz058
                PMCID: PMC6748715
                PMID: 30698637
                38c2194c-59ae-41c0-abc5-176bc883d073
                © The Author(s) 2019. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 30 September 2018
                Revised: 23 December 2018
                Accepted: 25 January 2019
                Page count
                Pages: 9
                Funding
                Funded by: United States - Israel Binational Science Foundation 10.13039/501100001742
                Award ID: 2016694
                Funded by: BSF 10.13039/100001005
                Funded by: United States National Science Foundation
                Funded by: NSF 10.13039/100000001
                Funded by: Naomi Prawer Kadar Foundation
                Funded by: Bella Walter Memorial Fund of the Israel Cancer Association
                Funded by: Edmond J. Safra Center for Bioinformatics at Tel-Aviv University
                Categories
                Original Papers
                Gene Expression

                Bioinformatics & Computational biology
