Inviting an author to review:
Find an author and click ‘Invite to review selected article’ near their name.
Search for authorsSearch for similar articles
95
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: not found

      Fast, sensitive, and accurate integration of single cell data with Harmony

      research-article

      Read this article at

      ScienceOpenPublisherPMC
      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The emerging diversity of single cell RNAseq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies. Here, real biological differences are interspersed with technical differences. We present Harmony, an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms. We show that Harmony requires dramatically fewer computational resources. It is the only currently available algorithm that makes the integration of ~10 6 cells feasible on a personal computer. We apply Harmony to PBMCs from datasets with large experimental differences, 5 studies of pancreatic islet cells, mouse embryogenesis datasets, and cross-modality spatial integration.

          Related collections

          Most cited references21

          • Record: found
          • Abstract: found
          • Article: not found

          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor

            Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              A Single-Cell Transcriptomic Map of the Human and Mouse Pancreas Reveals Inter- and Intra-cell Population Structure.

              Although the function of the mammalian pancreas hinges on complex interactions of distinct cell types, gene expression profiles have primarily been described with bulk mixtures. Here we implemented a droplet-based, single-cell RNA-seq method to determine the transcriptomes of over 12,000 individual pancreatic cells from four human donors and two mouse strains. Cells could be divided into 15 clusters that matched previously characterized cell types: all endocrine cell types, including rare epsilon-cells; exocrine cell types; vascular cells; Schwann cells; quiescent and activated stellate cells; and four types of immune cells. We detected subpopulations of ductal cells with distinct expression profiles and validated their existence with immuno-histochemistry stains. Moreover, among human beta- cells, we detected heterogeneity in the regulation of genes relating to functional maturation and levels of ER stress. Finally, we deconvolved bulk gene expression samples using the single-cell data to detect disease-associated differential expression. Our dataset provides a resource for the discovery of novel cell type-specific transcription factors, signaling receptors, and medically relevant genes.
                Bookmark

                Author and article information

                Journal
                101215604
                32338
                Nat Methods
                Nat. Methods
                Nature methods
                1548-7091
                1548-7105
                7 September 2019
                18 November 2019
                December 2019
                18 May 2020
                : 16
                : 12
                : 1289-1296
                Affiliations
                [1 ]Center for Data Sciences, Brigham and Women's Hospital, Massachusetts, USA.
                [2 ]Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston.
                [3 ]Department of Biomedical Informatics, Harvard Medical School, Massachusetts, USA.
                [4 ]Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
                [5 ]Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, USA.
                [6 ]Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK.
                Author notes
                [* ]Correspondence to: Soumya Raychaudhuri, 77 Avenue Louis Pasteur, Harvard New Research Building, Suite 250D, Boston, MA 02446, USA. soumya@ 123456broadinstitute.org ; 617-525-4484 (tel); 617-525-4488 (fax)

                Author Contributions

                SR and IK conceived the research. IK led computational work under the guidance of SR, assisted by NM, PL, JF, and KS. All authors participated in interpretation and writing the manuscript.

                Article
                NIHMS1539299
                10.1038/s41592-019-0619-0
                6884693
                31740819
                e722cc28-b251-4a76-8112-343bc413c85c

                Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

                History
                Categories
                Article

                Life sciences
                Life sciences

                Comments

                Comment on this article