
      Dimension Reduction and Clustering Models for Single-Cell RNA Sequencing Data: A Comparative Study

      research-article


          Abstract

          With recent advances in single-cell RNA sequencing (scRNA-seq), enormous transcriptome datasets have been generated. These datasets have furthered our understanding of cellular heterogeneity and its underlying mechanisms in homogeneous populations. Clustering scRNA-seq data can group cells belonging to the same cell type based on patterns embedded in gene expression. However, scRNA-seq data are high-dimensional, noisy, and sparse, owing to the limitations of existing scRNA-seq technologies. Traditional clustering methods are neither effective nor efficient for high-dimensional, sparse matrix computations, so several dimension reduction methods have been introduced. To establish a reliable and standard research routine, we conducted a comprehensive review and evaluation of four classical dimension reduction methods and five clustering models. Four experiments were progressively performed on two large scRNA-seq datasets using 20 models. Results showed that the feature selection method benefited the analysis of high-dimensional, sparse scRNA-seq data. Feature-extraction methods also improved clustering performance, although not in every case. Independent component analysis (ICA) performed well in small compressed feature spaces, whereas principal component analysis (PCA) was more stable than all the other feature-extraction methods. In addition, ICA was not ideal for fuzzy C-means clustering in scRNA-seq data analysis. K-means clustering achieved good results when combined with feature-extraction methods.
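          The kind of pipeline the abstract evaluates (feature selection, then feature extraction, then clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the synthetic count matrix, the variance-based selection criterion, the choice of 40 genes and 2 components, and all function names below are assumptions made for illustration only. PCA and K-means stand in for one of the 20 method combinations.

```python
import numpy as np

def make_toy_counts(n_cells=60, n_genes=200, dropout=0.5, seed=0):
    """Synthetic sparse counts for two hypothetical cell types (illustration only)."""
    rng = np.random.default_rng(seed)
    cell_type = np.repeat([0, 1], n_cells // 2)
    rates = np.ones((n_cells, n_genes))
    rates[cell_type == 0, :20] = 10.0    # marker genes for type 0
    rates[cell_type == 1, 20:40] = 10.0  # marker genes for type 1
    counts = rng.poisson(rates)
    counts[rng.random(counts.shape) < dropout] = 0  # simulate dropout zeros
    return counts, cell_type

def select_top_variance_genes(X, n_keep):
    """Feature selection: keep the n_keep columns (genes) with the highest variance."""
    idx = np.argsort(X.var(axis=0))[::-1][:n_keep]
    return X[:, np.sort(idx)]

def pca(X, n_components):
    """Feature extraction: project centered data onto the top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one integer label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

counts, truth = make_toy_counts()
X = np.log1p(counts)                       # common log transform for count data
X_sel = select_top_variance_genes(X, 40)   # feature selection
Z = pca(X_sel, 2)                          # feature extraction
labels = kmeans(Z, 2)                      # clustering
```

          Swapping `pca` for another extraction method, or `kmeans` for another clustering model, gives a different combination of the kind the study compares. Cluster labels are arbitrary, so any comparison against `truth` should use a permutation-invariant score such as the adjusted Rand index.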


                Author and article information

                Journal
                Title: International Journal of Molecular Sciences (Int J Mol Sci)
                Publisher: MDPI
                ISSN: 1422-0067
                Published: 22 March 2020
                Volume: 21
                Issue: 6
                Article number: 2181
                Affiliations
                [1] Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; liusf@jlu.edu.cn (S.L.); haozhang17@mails.jlu.edu.cn (H.Z.); guanrenchu@jlu.edu.cn (R.G.); ffzhou@jlu.edu.cn (F.Z.); ycliang@jlu.edu.cn (Y.L.)
                [2] Zhuhai Sub Laboratory of Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Zhuhai College of Jilin University, Zhuhai 519041, China
                [3] Joint Bioinformatics Program, University of Arkansas Little Rock George Washington Donaghey College of Engineering & IT and University of Arkansas for Medical Sciences, Little Rock, AR 72204, USA; dxli@ualr.edu
                Author notes
                [*] Correspondence: fengxy@jlu.edu.cn; Tel.: 86-13944088266
                Author information
                https://orcid.org/0000-0002-7162-7826
                https://orcid.org/0000-0002-8108-6007
                https://orcid.org/0000-0003-3954-1333
                Article
                Publisher ID: ijms-21-02181
                DOI: 10.3390/ijms21062181
                PMCID: PMC7139673
                PMID: 32235704
                © 2020 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 20 January 2020
                Accepted: 20 March 2020
                Categories
                Article

                Molecular biology
                single-cell RNA sequencing, dimensionality reduction, clustering algorithm
