
      SHARP: hyperfast and accurate processing of single-cell RNA-seq data via ensemble random projection

      research-article

          Abstract

          To process large-scale single-cell RNA-sequencing (scRNA-seq) data effectively without excessive distortion during dimension reduction, we present SHARP, an ensemble random projection-based algorithm that is scalable to clustering 10 million cells. Comprehensive benchmarking tests on 17 public scRNA-seq data sets show that SHARP outperforms existing methods in terms of speed and accuracy. Particularly, for large-size data sets (more than 40,000 cells), SHARP runs faster than other competitors while maintaining high clustering accuracy and robustness. To the best of our knowledge, SHARP is the only R-based tool that is scalable to clustering scRNA-seq data with 10 million cells.
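          The abstract names the technique (ensemble random projection) without giving details, so the following is only a minimal sketch of the general idea and not the SHARP implementation, which is an R tool. It assumes a sparse random projection matrix with entries +s, 0, -s drawn with probabilities 1/6, 2/3, 1/6, where s = sqrt(3 / n_components), and it averages pairwise squared distances over several independent projections so that the distortion of any single projection is damped. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def random_projection_matrix(n_features, n_components, rng):
    """Sparse random matrix: entries +s, 0, -s with prob. 1/6, 2/3, 1/6 (assumed scheme)."""
    scale = math.sqrt(3.0 / n_components)
    values = (scale, 0.0, 0.0, 0.0, 0.0, -scale)
    return [[rng.choice(values) for _ in range(n_components)]
            for _ in range(n_features)]


def project(cells, matrix):
    """Multiply each cell (a list of gene values) by the projection matrix."""
    n_components = len(matrix[0])
    return [
        [sum(x * matrix[g][k] for g, x in enumerate(cell))
         for k in range(n_components)]
        for cell in cells
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ensemble_squared_distances(cells, n_components, n_projections, seed=0):
    """Average pairwise squared distances over several independent projections."""
    rng = random.Random(seed)
    n_cells = len(cells)
    n_features = len(cells[0])
    average = [[0.0] * n_cells for _ in range(n_cells)]
    for _ in range(n_projections):
        matrix = random_projection_matrix(n_features, n_components, rng)
        reduced = project(cells, matrix)
        for i in range(n_cells):
            for j in range(i + 1, n_cells):
                d = squared_distance(reduced[i], reduced[j]) / n_projections
                average[i][j] += d
                average[j][i] += d
    return average


def toy_cells(n_per_group=3, n_genes=200, shift=5.0, seed=1):
    """Hypothetical toy data: two groups of cells whose mean expression differs by `shift`."""
    rng = random.Random(seed)
    group_a = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)]
               for _ in range(n_per_group)]
    group_b = [[rng.gauss(shift, 1.0) for _ in range(n_genes)]
               for _ in range(n_per_group)]
    return group_a + group_b


if __name__ == "__main__":
    cells = toy_cells()
    dist = ensemble_squared_distances(cells, n_components=20, n_projections=5)
    print("within group A:", dist[0][1])
    print("between groups:", dist[0][3])
```

          In this sketch the averaged distance matrix could be handed to any clustering routine. How SHARP itself combines the results of its individual projections is not stated in the abstract.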

                Author and article information

                Journal
                Genome Research (Genome Res)
                Cold Spring Harbor Laboratory Press
                ISSN: 1088-9051, 1549-5469
                February 2020
                Volume: 30
                Issue: 2
                Pages: 205-213
                Affiliations
                [1] Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;
                [2] Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA;
                [3] Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA;
                [4] Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen North, Denmark;
                [5] Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen North, Denmark
                Author notes
                Corresponding author: kyoung.won@bric.ku.dk
                Author information
                http://orcid.org/0000-0003-0661-2684
                Article
                9509184
                10.1101/gr.254557.119
                7050522
                31992615
                c33e3914-0b40-4e87-a361-1a9870c15f77
                © 2020 Wan et al.; Published by Cold Spring Harbor Laboratory Press

                This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.

                History
                10 July 2019
                23 January 2020
                Page count
                Pages: 9
                Funding
                Funded by: National Institutes of Health, open-funder-registry 10.13039/100000002;
                Funded by: National Institute of Diabetes and Digestive and Kidney Diseases, open-funder-registry 10.13039/100000062;
                Award ID: R01 DK106027
                Funded by: Novo Nordisk Foundation, open-funder-registry 10.13039/501100009708;
                Award ID: NNF17CC0027852
                Categories
                Method
