Single-cell analyses reveal early thymic progenitors and pre-B cells in zebrafish


Abstract

Single-cell RNA sequencing of zebrafish hematopoiesis provides new insights into zebrafish B- and T-cell development and cellular diversity within the primary lymphoid organs (kidney marrow and thymus). This work advances the current understanding of zebrafish immunology and will facilitate future genetic and developmental studies.

Abstract

The zebrafish has proven to be a valuable model organism for studying hematopoiesis, but relatively little is known about zebrafish immune cell development and functional diversity. Elucidating key aspects of zebrafish lymphocyte development and exploring the breadth of effector functions would provide valuable insight into the evolution of adaptive immunity. We performed single-cell RNA sequencing on ∼70,000 cells from the zebrafish marrow and thymus to establish a gene expression map of zebrafish immune cell development. We uncovered rich cellular diversity in the juvenile and adult zebrafish thymus, elucidated B- and T-cell developmental trajectories, and transcriptionally characterized subsets of hematopoietic stem and progenitor cells and early thymic progenitors. Our analysis permitted the identification of two dendritic-like cell populations and provided evidence in support of the existence of a pre-B cell state. Our results provide critical insights into the landscape of zebrafish immunology and offer a foundation for cellular and genetic studies.


Most cited references 206


Comprehensive Integration of Single-Cell Data

Tim Stuart, Andrew Butler, Paul Hoffman … (2019)

Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.


Metascape provides a biologist-oriented resource for the analysis of systems-level datasets

Yingyao Zhou, Bin Zhou, Lars Pache … (2019)

A critical component in the interpretation of systems-level studies is the inference of enriched biological pathways and protein complexes contained within OMICs datasets. Successful analysis requires the integration of a broad set of current biological databases and the application of a robust analytical pipeline to produce readily interpretable results. Metascape is a web-based portal designed to provide a comprehensive gene list annotation and analysis resource for experimental biologists. In terms of design features, Metascape combines functional enrichment, interactome analysis, gene annotation, and membership search to leverage over 40 independent knowledgebases within one integrated portal. Additionally, it facilitates comparative analyses of datasets across multiple independent and orthogonal experiments. Metascape provides a significantly simplified user experience through a one-click Express Analysis interface to generate interpretable outputs. Taken together, Metascape is an effective and efficient tool for experimental biologists to comprehensively analyze and interpret OMICs-based studies in the big data era.


Pseudo-temporal ordering of individual cells reveals dynamics and regulators of cell fate decisions

Cole Trapnell, Davide Cacchiarelli, Jonna L Grimsby … (2014)

Single-cell expression profiling by RNA-Seq promises to exploit cell-to-cell variation in gene expression to reveal regulatory circuitry governing cell differentiation and other biological processes. Here, we describe Monocle, a novel unsupervised algorithm for ordering cells by progress through differentiation that dramatically increases temporal resolution of expression measurements in a model of skeletal muscle differentiation. This reordering unmasks switch-like changes in expression of key regulatory factors, reveals sequentially organized waves of gene regulation, and exposes novel regulators of cell differentiation. A loss-of-function screen revealed that many of these inhibitors act through regulatory elements also used by pro-myogenic factors to activate downstream genes. This study demonstrates that single-cell expression analysis by Monocle can uncover novel regulatory interactions governing differentiation.

Cell differentiation is governed by a vast and complex gene regulatory program. During differentiation, each cell makes fate decisions independently by integrating a wide array of signals from other cells, executing a complex choreography of gene regulatory changes. Recently, several studies carried out at single-cell resolution have revealed high cell-to-cell variation in most genes during differentiation 1–5 , even among key developmental regulators. Although high variability complicates analysis of such experiments 6 , it might define biological progression between cellular states, revealing regulatory modules of genes that co-vary in expression across individual cells 7 . Prior studies have used approaches from computational geometry 8,9 and supervised machine learning 10 to order bulk cell populations from time-series microarray experiments by progress through a biological process. Applying this concept to order individual cells could expose fine-grained gene expression dynamics as they differentiate.
We have developed Monocle, an algorithm that harnesses single cell variation to sort cells in “pseudo time” according to progress through differentiation. Applying Monocle to the classic model of myogenesis unveiled dynamics at unprecedented resolution and exposed novel regulatory factors. Skeletal myoblasts undergo a well-characterized sequence of morphological and transcriptional changes during differentiation 11 . Global expression and epigenetic profiles have reinforced the view that a small cohort of transcription factors (e.g. MYOD, MYOG, MRF4, and MYF5) orchestrates these changes 12 . However, efforts to expand this set of factors and map the broader myogenic regulatory network have been hampered by the temporal resolution of global expression measurements, with thousands of genes following a limited number of coarse kinetic trends 13 . Single-cell measurements of markers of myogenesis have made clear that cells do not progress through differentiation in synchrony. A population of cells captured at the same time may thus cover a range of distinct intermediate differentiation states. Drawing conclusions from a group of individuals based on the properties of their average is a hazardous practice because the average can mask important trends among the individuals, resulting in phenomena such as Simpson's paradox 14 . Experimental synchronization or stringent isolation of myogenic precursors is often challenging and dramatically alters differentiation kinetics. We hypothesized that capturing complete expression profiles of individual cells might avoid these problems and dramatically increase temporal resolution in global transcriptome dynamics. In essence, a single-cell RNA-Seq experiment might constitute a time-series, with each cell representing a distinct time point along a continuum. To test this hypothesis we investigated the single cell transcriptome dynamics during myogenesis. 
We expanded primary human myoblasts under high mitogen conditions (GM), and then induced differentiation by switching to low-mitogen media (DM). We then captured 50–100 cells at each of four time points following serum switch using the Fluidigm C1 microfluidic system. RNA from each cell was isolated and used to construct mRNA-Seq libraries, which were then sequenced to a depth of ~4 million reads per library, resulting in a complete gene expression profile for each cell (Fig 1a, S1). Averaging expression profiles of cells collected at the same time correlated well with the corresponding bulk RNA-Seq libraries, and moderately expressed genes were detectable (≥ 1 FPKM) in a majority of individual cells (Fig 1b, S2, S3). However, markers of mature myocytes were present at all time points following serum switch, and many other genes showed similar temporal heterogeneity (Fig 1c). We speculated that the high variability in cell-to-cell gene expression levels was due to unsynchronized differentiation, with myoblasts, intermediate myocytes, and mature myotubes residing in the same well concurrently. Indeed, large, multinucleated MYH2+ cells were abundant after 72 hours in DM, but these cells were present at lower frequency even at 24 hours (Fig 1c). We reasoned that informatically ordering the cells by their progress through differentiation, rather than by the time they were collected, would distinguish genes activated early in differentiation from those activated later. To this end, we developed a novel unsupervised algorithm, Monocle, which re-ordered the cells to maximize the transcriptional similarity between successive pairs (Fig 2a). The algorithm first represents the expression profile of each cell as a point in a high-dimensional Euclidean space, with one dimension for each gene. Second, it reduces the dimensionality of this space using Independent Component Analysis 15 .
Third, Monocle constructs a minimum spanning tree (MST) on the cells, an approach now commonly used in other single-cell settings, such as flow or mass cytometry 16,17 . Fourth, the algorithm finds the longest path through the MST, corresponding to the longest sequence of transcriptionally similar cells. Finally, Monocle uses this sequence to produce a “trajectory” of an individual cell's progress through differentiation. Progress along a differentiation trajectory is measured in “pseudo-time”: the total transcriptional change a cell undergoes as it differentiates. This strategy is derived from a prior algorithm for temporally ordering microarray samples 8 , but extends it to allow for multiple cell fates stemming from a single progenitor cell type. As cells progress, they may diverge along two or more separate paths. After Monocle finds the longest sequence of similar cells, it examines cells not along this path to find alternative trajectories through the MST. These sub-trajectories are ordered and connected to the main trajectory, and each cell is annotated with both a trajectory and a pseudo-time value. Monocle thus orders cells by progress through differentiation and can reconstruct branched biological processes, which might arise when a precursor cell makes cell fate decisions that govern the generation of multiple subsequent lineages. Importantly, Monocle is unsupervised and needs no prior knowledge of specific genes that distinguish cell fates, and is thus suitable for studying a wide array of dynamic biological processes. Monocle decomposed myoblast differentiation into a two-phase trajectory and isolated a branch of non-differentiating cells (Fig 2b). The first phase of the trajectory was primarily composed of cells collected under high-mitogen conditions and which expressed markers of actively proliferating cells such as CDK1, while the second mainly consisted of cells collected at 24, 48, or 72 hours following serum switch. 
Cells in the second phase were positive for markers of muscle differentiation such as MYOG (Fig S4). A tightly grouped third population of cells branched from the trajectory near the transition between phases. These cells lacked myogenic markers but expressed PDGFRA and SPHK1, suggesting that they are contaminating interstitial mesenchymal cells and did not arise from the myoblasts. Such cells were recently shown to stimulate muscle differentiation 18 . Monocle's estimates of the frequency and proliferative status of these cells were consistent with estimates derived from immunofluorescent stains against ANPEP/CD13 and nuclear phosphorylated H3-Ser10 (Fig S4). Monocle thus enabled analysis of the myoblast differentiation trajectory without subtracting these cells by immunopurification, maintaining in vitro differentiation kinetics that resemble physiological cell crosstalk occurring in the in vivo niche. To find genes that were dynamically regulated as the cells progressed through differentiation, we modeled each gene's expression as a nonlinear function of pseudo-time. A total of 1,061 genes were dynamically regulated during differentiation (FDR = 0.8) to their relative order within the full data set. The algorithm retained the ability to detect dynamically regulated genes with high precision (>= 95%) over all designs and with increasing recall as more of the cells were included. (Fig S7) We next grouped genes with similar trends in expression, reasoning that such groups might share common biological functions and regulators. Clustering of genes according to direction and timing revealed six distinct trends (Fig 3). Genes downregulated early or upregulated late in pseudo-time were highly enriched for biological processes central to myogenesis, including cell-cycle exit and activation of muscle-specific structural proteins. 
However, the other clusters included many genes with broad roles in development, including mediators of cell-cell signaling, RNA export and translational control, and remodeling of cell morphology (Fig S8). A timeseries analysis of myoblast differentiation with bulk RNA-Seq identified up and down-regulated genes, but did not identify the transient clusters or distinguish the early from late regulation visible with pseudo-temporally ordered single cells (Fig S9). Furthermore, dynamic range of expression was compressed for most genes, confirming that failure to account for variability in progress through differentiation leads directly to the effects associated with Simpson's paradox. Pseudo-temporal cell ordering thus decomposes the coarse kinetic trends produced by conventional RNA-Seq into distinct, sequential waves of transcriptional reconfiguration. To identify factors driving myoblast differentiation, we performed a cis-regulatory analysis on genes in each pseudo-temporal cluster. Cis regulatory elements were first identified based on DNaseI hypersensitive sites in HSMM cells and HSMM-derived myotubes 20 , classified according to function according to histone marks 21 , and finally annotated with conserved transcription factor binding sites. While downregulated genes were enriched at near significant levels with binding sites for genes that play roles in proliferation (e.g. MAX, E2F, and NMYC), nearly all significantly enriched motifs fell near upregulated genes. These genes were highly enriched for regulatory elements containing binding motifs for 175 transcription factors, including numerous well-known regulators of myogenesis, such as MYOD, MYOG, PBX1, MEIS1, and the MEF2 family (Fig S10). Some, but not all, of these factors were revealed by a regulatory element analysis performed using bulk RNA-Seq data, underscoring the power of increased (pseudo) temporal resolution of single-cell analysis (Fig S11). 
A similar analysis of microRNA target sites identified miR-1, miR-206, miR-133, and numerous others as regulators of genes activated during myogenesis (Fig S12). Of these, only miR-1/206 target sites were significantly enriched among genes found to be transiently upregulated and then sharply downregulated. This may suggest that miR-1 and miR-206, which are expressed at an intermediate stage of myoblast differentiation, may act to strongly suppress a subset of genes activated earlier. Many of the transcription factors implicated by our cis regulatory analysis to govern differentiation had no previously appreciated role in muscle development. To test potential roles of these factors we performed an RNAi mediated loss of function screen for 11 candidates. Briefly, we virally expressed proliferating myoblasts with one of 44 distinct shRNAs targeting either one of these factors or a mock (non-targeting) control, followed by serum-induced differentiation for five days. We then measured the frequency and size of myosin heavy chain 2 (MYH2)-positive cells with a high-throughput immunofluorescence pipeline. Of the targets we tested, MZF1, ZIC1, XBP1, and USF1 showed significantly altered differentiation kinetics (Fig 4a,b, Fig S13) when depleted with two or more independent hairpins (FDR < 5%). Knockdown of XBP1, USF1, ZIC1, and MZF1 enhanced myotube formation, with larger myotubes containing a higher fraction of total nuclei than mock shRNA controls. Depletion of CUX1, ARID5B, POU2F1, and AHR also increased differentiation efficiency, albeit less significantly. Importantly, whole-well nuclei counts were similar between knockdowns and mock controls, indicating that enhanced differentiation was not simply a result of higher initial cell counts or increased proliferation. With the exception of ZIC1, forced overexpression did not substantially alter differentiation kinetics (data not shown). 
Notably, several of these factors have binding motifs that are highly enriched in promoters and enhancers that also have motifs for known muscle regulators (Fig 4c). For example, USF1 motifs are enriched in enhancers that also have MYOD motifs. Together, these results confirm that the transcription factors identified as possible regulators in fact play a role in myoblast differentiation, and demonstrate the power of Monocle for identifying key differentiation genes. Here, we report that individual myoblasts progress through differentiation in an unsynchronized manner, but that they can be reordered according to progress through differentiation. This pseudo-time ordering pinpoints key events in differentiation that are masked both by conventional bulk cell expression profiling, and by single-cell expression profiles ordered by time collected. The reordering resolves sequentially activated transcriptional sub-programs that are regulated by common factors. The temporal resolution offered by hundreds of ordered cells might enable future efforts to computationally infer novel gene-regulatory modules. For example, the enrichment of transiently upregulated genes for common microRNA target sites raises the question of whether those microRNAs are expressed later, curtailing what would have been higher levels of expression. Sequencing-based measurements of small RNAs and mRNAs from the same cell will provide answers to such systems-level questions. Moreover, single-cell analysis distinguishes cells of interest from contaminating cell types such as interstitial mesenchymal cells without experimental isolation that might disrupt cell-cell interactions important in the in vivo niche. We identified eight previously unappreciated transcription factors that dramatically influence the course of myoblast differentiation, thus proving the principle of pseudo-temporal analysis and expanding the catalog of regulators in this well-studied system. 
Several of the eight factors reported here may normally repress differentiation by competing with pro-myogenic factors for these regulatory elements. Alternatively, these inhibitors may co-occupy regulatory elements with pro-myogenic factors, preventing transactivation of their targets (Fig. 4d). Previous studies in other contexts provide mechanistic data supporting both of these models. USF1 inhibits MyoD autoactivation in Xenopus by competing with MyoD at its promoter through an alternative E-box 22 . Our results suggest that USF1 may repress a broad array of targets via E-box competition. CUX1 represses targets in several developmental contexts through binding site competition 23 . XBP1 was recently reported to inhibit myoblast differentiation in mice, potentially through the mechanisms proposed here 24 . Further experiments in these HSMM cells and myoblasts from other anatomic depots will be needed to confirm the mechanism of these factors. While the positive regulators of myogenesis have been well characterized, only a handful of inhibitors have been identified. The eight inhibitors reported here may shed light on how the balance of proliferation and differentiation is maintained during development and regeneration. Ordering the expression profiles of individual cells by biological progress is thus a powerful new tool for studying cell differentiation, and could in principle be used to map regulatory networks that govern a much wider array of biological processes.
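
The ordering steps described in the Monocle text above (build a minimum spanning tree on the cells, find the longest path through it, and measure progress along that path as pseudo-time) can be illustrated with a small sketch. This is not Monocle's implementation. It assumes the cells have already been reduced to low-dimensional coordinates, so the Independent Component Analysis step is omitted, and it does not assign off-path cells to branches. The function names (minimum_spanning_tree, tree_distances, longest_path_pseudotime) are hypothetical and chosen only for this illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over the cells.

    Returns an adjacency dict: cell index -> list of (neighbor, edge_length).
    """
    n = len(points)
    adjacency = {i: [] for i in range(n)}
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_parent = [None] * n
    best_dist[0] = 0.0
    for _ in range(n):
        # Pick the cell closest to the tree built so far.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] is not None:
            adjacency[u].append((best_parent[u], best_dist[u]))
            adjacency[best_parent[u]].append((u, best_dist[u]))
        for v in range(n):
            if not in_tree[v]:
                d = euclidean(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return adjacency


def tree_distances(adjacency, start):
    """Walk the tree from `start`; return (distance, parent) dicts."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        u = stack.pop()
        for v, w in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                parent[v] = u
                stack.append(v)
    return dist, parent


def longest_path_pseudotime(points):
    """Order cells along the longest MST path.

    Returns (path, pseudotime): the cell indices along the path, and the
    cumulative distance from the first cell on the path for each of them.
    Which end counts as the start is arbitrary in this sketch.
    """
    adjacency = minimum_spanning_tree(points)
    # Two sweeps find the two ends of the longest path in a tree.
    dist, _ = tree_distances(adjacency, 0)
    end_a = max(dist, key=dist.get)
    dist_a, parent = tree_distances(adjacency, end_a)
    end_b = max(dist_a, key=dist_a.get)
    path = []
    node = end_b
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()  # now runs from end_a to end_b
    pseudotime = {cell: dist_a[cell] for cell in path}
    return path, pseudotime


if __name__ == "__main__":
    # Five hypothetical cells along one axis, listed out of order.
    cells = [(3.0, 0.0), (0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
    order, times = longest_path_pseudotime(cells)
    print(order)
    print([times[c] for c in order])
```

In practice the choice of which end of the path represents the earliest state has to come from outside information, such as cells collected under proliferating conditions; the sketch leaves that choice to the caller.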


Author and article information

Contributors

Sara A. Rubin:

ORCID: https://orcid.org/0000-0002-3484-5461

Roles: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Software, Validation, Visualization, Writing - original draft, Writing - review & editing

Chloé S. Baron:

ORCID: https://orcid.org/0000-0001-8351-4768

Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing - original draft, Writing - review & editing

Cecilia Pessoa Rodrigues:

ORCID: https://orcid.org/0000-0001-8617-6565

Roles: Formal analysis, Visualization, Writing - review & editing

Madeleine Duran:

ORCID: https://orcid.org/0000-0003-1648-2369

Roles: Formal analysis, Software, Visualization, Writing - review & editing

Alexandra F. Corbin:

ORCID: https://orcid.org/0000-0002-1759-885X

Roles: Investigation, Visualization, Writing - review & editing

Song P. Yang:

ORCID: https://orcid.org/0000-0002-3064-3141

Roles: Data curation, Visualization

Cole Trapnell:

ORCID: https://orcid.org/0000-0002-8105-4347

Role: Software

Leonard I. Zon:

ORCID: https://orcid.org/0000-0003-0860-926X

Roles: Conceptualization, Funding acquisition, Project administration, Supervision, Writing - review & editing

Journal

Journal ID (nlm-ta): J Exp Med

Journal ID (iso-abbrev): J Exp Med

Journal ID (publisher-id): jem

Title: The Journal of Experimental Medicine

Publisher: Rockefeller University Press

ISSN (Print): 0022-1007

ISSN (Electronic): 1540-9538

Publication date (Collection): 05 September 2022

Publication date (Electronic): 08 August 2022

Volume: 219

Issue: 9

Electronic Location Identifier: e20220038

Affiliations

[1] Stem Cell Program and Division of Hematology/Oncology, Boston Children’s Hospital and Dana-Farber Cancer Institute, Boston, MA

[2] Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA

[3] Stem Cell and Regenerative Biology Department, Harvard University, Cambridge, MA

[4] Department of Genome Sciences, University of Washington, Seattle, WA

[5] Howard Hughes Medical Institute, Boston Children’s Hospital, Boston, MA

Author notes

Correspondence to Leonard I. Zon: zon@enders.tch.harvard.edu

Disclosures: L.I. Zon reported personal fees from Fate Therapeutics, CAMP4 Therapeutics, Amagma Therapeutics, Scholar Rock, Branch Biosciences, Celularity, and Cellarity outside the submitted work. No other disclosures were reported.


Article

Publisher ID: jem.20220038

DOI: 10.1084/jem.20220038

PMC ID: 9365674

PubMed ID: 35938989

SO-VID: 214ff548-6aa0-40d5-a59f-a867114414d5

License:

This article is distributed under the terms of an Attribution–Noncommercial–Share Alike–No Mirror Sites license for the first six months after the publication date (see http://www.rupress.org/terms/). After six months it is available under a Creative Commons License (Attribution–Noncommercial–Share Alike 4.0 International license, as described at https://creativecommons.org/licenses/by-nc-sa/4.0/).

History

Date received: 06 January 2022

Date revision received: 11 June 2022

Date accepted: 06 July 2022

Funding

Funded by: National Heart, Lung, and Blood Institute, DOI http://dx.doi.org/10.13039/100000050;

Award ID: F30HL152628

Award ID: T32HL007574

Award ID: R01HL144780

Funded by: Human Frontier Science Program, DOI http://dx.doi.org/10.13039/100004412;
