
      scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning


          Abstract

          Annotation of cell-types is a critical step in the analysis of single-cell RNA sequencing (scRNA-seq) data that allows the study of heterogeneity across multiple cell populations. Currently, this is most commonly done using unsupervised clustering algorithms, which project single-cell expression data into a lower dimensional space and then cluster cells based on their distances from each other. However, as these methods do not use reference datasets, they can only achieve a rough classification of cell-types, and it is difficult to improve the recognition accuracy further. To effectively solve this issue, we propose a novel supervised annotation method, scDeepInsight. The scDeepInsight method is capable of performing manifold assignments. It is competent in executing data integration through batch normalization, performing supervised training on the reference dataset, doing outlier detection and annotating cell-types on query datasets. Moreover, it can help identify active genes or marker genes related to cell-types. The training of the scDeepInsight model is performed in a unique way. Tabular scRNA-seq data are first converted to corresponding images through the DeepInsight methodology. DeepInsight can create a trainable image transformer to convert non-image RNA data to images by comprehensively comparing interrelationships among multiple genes. Subsequently, the converted images are fed into convolutional neural networks such as EfficientNet-b3. This enables automatic feature extraction to identify the cell-types of scRNA-seq samples. We benchmarked scDeepInsight with six other mainstream cell annotation methods. The average accuracy rate of scDeepInsight reached 87.5%, which is more than 7% higher compared with the state-of-the-art methods.
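The tabular-to-image step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a plain PCA (computed with NumPy's SVD) as the 2-D gene embedding, assumes that genes landing on the same pixel are averaged, and uses hypothetical function names (`fit_pixel_map`, `to_images`). The CNN classification step is omitted.

```python
import numpy as np


def fit_pixel_map(X, size=8):
    """Assign each gene (column of X) to a pixel on a size x size grid.

    X has shape (n_cells, n_genes) and needs at least two cells. Genes are
    embedded in 2-D by PCA on the gene-by-cell matrix, so genes with similar
    expression profiles across cells land near each other.
    """
    G = X.T  # genes as rows, cells as features
    G = G - G.mean(axis=0, keepdims=True)  # center each feature
    U, S, _ = np.linalg.svd(G, full_matrices=False)
    coords = U[:, :2] * S[:2]  # first two principal component scores
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    # Rescale to [0, size - 1] and round to the nearest pixel index.
    return np.floor((coords - lo) / span * (size - 1) + 0.5).astype(int)


def to_images(X, pix, size=8):
    """Render each cell (row of X) as a size x size image.

    Pixels shared by several genes hold the mean of those genes' values;
    pixels with no gene stay at zero.
    """
    sums = np.zeros((X.shape[0], size, size))
    counts = np.zeros((size, size))
    for g, (r, c) in enumerate(pix):
        sums[:, r, c] += X[:, g]
        counts[r, c] += 1
    return sums / np.maximum(counts, 1)


# Toy example: 5 cells, 20 genes, random expression values.
rng = np.random.default_rng(0)
X = rng.random((5, 20))
pix = fit_pixel_map(X, size=8)
images = to_images(X, pix, size=8)
print(images.shape)
```

The pixel map is fitted once on the reference data and then reused for query cells, which mirrors the idea of a trainable image transformer; the resulting images would then be passed to a convolutional network for classification.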


                Author and article information

                Contributors
                Journal
                Briefings in Bioinformatics (Brief Bioinform)
                Publisher: Oxford University Press
                ISSN: 1467-5463, 1477-4054
                Issue date: September 2023
                Published: 31 July 2023
                Volume: 24
                Issue: 5
                Article number: bbad266
                Affiliations
                Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Japan
                Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Japan
                Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Japan
                Institute for Integrated and Intelligent Systems, Griffith University, Australia
                Author notes
                Corresponding authors.
                Alok Sharma, Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan; Institute for Integrated and Intelligent Systems, Griffith University, QLD-4111, Australia; Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo 113-0033, Japan. E-mail: alok.fj@gmail.com
                Tatsuhiko Tsunoda, Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo 113-0033, Japan; Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8562, Japan; Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan. E-mail: tsunoda@bs.s.u-tokyo.ac.jp

                Alok Sharma and Tatsuhiko Tsunoda are co-last authors.

                Author information
                https://orcid.org/0000-0001-7095-7332
                https://orcid.org/0000-0002-7668-3501
                https://orcid.org/0000-0002-5439-7918
                Article
                Publisher ID: bbad266
                DOI: 10.1093/bib/bbad266
                PMCID: PMC10516353
                PMID: 37523217
                Record ID: 1338ba50-5a62-4c16-a17d-558bb2f635da
                © The Author(s) 2023. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 9 March 2023
                Revised: 12 June 2023
                Accepted: 4 July 2023
                Page count
                Pages: 12
                Funding
                Funded by: Japan Society for the Promotion of Science, DOI 10.13039/501100001691;
                Award ID: JP20H03240
                Funded by: Japan Science and Technology Agency, DOI 10.13039/501100002241;
                Award ID: JPMJCR2231
                Categories
                Problem Solving Protocol
                AcademicSubjects/SCI01060

                Subject: Bioinformatics & Computational biology
                Keywords: single-cell RNA sequencing, deep learning, cell annotation, transformers
