
      Exploring patterns enriched in a dataset with contrastive principal component analysis

      research-article


          Abstract

          Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used.

          Abstract

          Dimensionality reduction and visualization methods lack a principled way of comparing multiple datasets. Here, Abid et al. introduce contrastive PCA, which identifies low-dimensional structures enriched in one dataset compared to another and enables visualization of dataset-specific patterns.
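          The core idea described in the abstracts can be illustrated with a minimal sketch. It assumes the common formulation of cPCA as an eigendecomposition of the target covariance minus a scaled background covariance. The function name `contrastive_pca`, the parameter names, and the synthetic data are illustrative assumptions, not the authors' released implementation, which may differ in its details (for example, in how the contrast strength is chosen).

```python
import numpy as np

def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Sketch of cPCA: top eigenvectors of cov(target) - alpha * cov(background).

    alpha is the (assumed) contrast strength; alpha = 0 reduces to ordinary PCA
    on the target data.
    """
    # Center each dataset on its own mean.
    target_centered = target - target.mean(axis=0)
    background_centered = background - background.mean(axis=0)

    # Sample covariance matrices (features x features).
    cov_target = target_centered.T @ target_centered / (len(target) - 1)
    cov_background = (
        background_centered.T @ background_centered / (len(background) - 1)
    )

    # Directions with high target variance and low background variance
    # correspond to the largest eigenvalues of the contrast matrix.
    contrast = cov_target - alpha * cov_background
    eigvals, eigvecs = np.linalg.eigh(contrast)  # symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    directions = eigvecs[:, order]

    return target_centered @ directions, directions

# Synthetic example (hypothetical data): both datasets share a high-variance
# nuisance axis (feature 0); only the target has two subgroups along feature 1.
rng = np.random.default_rng(0)
n = 500
scales = np.array([10.0, 1.0, 1.0])
background = rng.normal(size=(n, 3)) * scales
target = rng.normal(size=(n, 3)) * scales
labels = np.repeat([0, 1], n // 2)
target[:, 1] += np.where(labels == 1, 3.0, -3.0)

_, pca_dirs = contrastive_pca(target, background, alpha=0.0, n_components=1)
_, cpca_dirs = contrastive_pca(target, background, alpha=2.0, n_components=1)
```

          In this toy setting, the leading direction with `alpha=0.0` is dominated by the shared nuisance axis, while a positive `alpha` shifts the leading direction toward the feature that separates the target subgroups. The value of `alpha` here was picked by hand for the example.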


                Author and article information

                Contributors
                jamesz@stanford.edu
                Journal
                Nat Commun
                Nature Communications
                Nature Publishing Group UK (London)
                ISSN: 2041-1723
                30 May 2018
                Volume: 9
                Article number: 2134
                Affiliations
                [1] Department of Electrical Engineering, Stanford University, 450 Serra Mall, Stanford, CA 94305, USA (ISNI 0000000419368956, GRID grid.168010.e)
                [2] Department of Biomedical Data Science, Stanford University, 450 Serra Mall, Stanford, CA 94305, USA (ISNI 0000000419368956, GRID grid.168010.e)
                [3] Chan-Zuckerberg Biohub, 499 Illinois St., San Francisco, CA 94158, USA
                Article
                4608
                DOI: 10.1038/s41467-018-04608-8
                PMCID: PMC5976774
                PMID: 29849030
                © The Author(s) 2018

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 5 December 2017
                Accepted: 25 April 2018
                Categories
                Article