Of the 330 × 10^9 cells that are estimated to turn over daily in the human body, nearly 90% correspond to blood cells. Hematopoiesis, the process of blood formation that is necessary to compensate for this regular loss of blood cells, is regulated by a complex set of external signaling cues and intrinsic fate determinants. It is also the paradigm of mammalian tissue homeostasis, sustained by hematopoietic stem and progenitor cells (HSPCs) over the lifetime of an organism. Transcription factors (TFs) have a central role in regulating hematopoiesis, particularly by impacting HSPC self-renewal and differentiation through the orchestration of coordinated expression of target genes as lineage commitment initiates and progresses. Single-cell genomics has provided unprecedented insights into the regulatory heterogeneity in individual cells, and multiple methods have been developed to predict upstream regulators of the transcriptional programs of specific cell states along the course of differentiation. The source of the transcriptional and cell fate heterogeneity in HSPCs described by these methods is multifactorial, with contributions of intrinsic transcriptional stochasticity, external signals, and technical sources.
In this review, we analyze recent advances in our understanding of the transcriptional determinants of cell fate. Given the ubiquitous use of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) to probe hematopoiesis, we highlight important sensitivity considerations on the use of these methods to identify functionally relevant subpopulations and their regulators. Insufficient transcriptome coverage exacerbates the issues due to sparse sampling of genes with low expression yet critical functions, such as TFs. This can lead to incorrect assessment of the specificity of markers or to underdetection of true determinants of cell fate. We underscore the impact of intrinsic stochastic transcriptional noise in regulators of hematopoiesis. Finally, we review several reports of novel regulators that characterize functionally relevant hematopoietic subpopulations, such as long-term reconstituting hematopoietic stem cells (LT-HSCs), while highlighting some of their limitations. Although scRNA-seq continues to reveal unparalleled insights into heterogeneity in hematopoiesis, the transcriptome may not always reflect important functional differences within subpopulations. This further underscores the critical need to marry single-cell genomics with functional assays to gain valuable biological insights.
TRANSCRIPTIONAL AND FATE HETEROGENEITY
The molecular basis of cell fate determination has been a long-standing question in the study of hematopoiesis. Recent advances in scRNA-seq have allowed massively parallel profiling of the transcriptome of tens and even hundreds of thousands of single cells [6–9]. Applied to the study of hematopoiesis, scRNA-seq has been used to obtain representations of cellular state and the factors underlying it. In particular, it has challenged the traditional view of hematopoiesis as a collection of discrete cell states with particular transition points (Fig. 1A). This notion, derived from the characterization of cell populations using a limited set of surface markers, has evolved into a continuous model of states along hematopoietic differentiation (Fig. 1B). Such gene expression heterogeneity is present across all canonical hematopoietic cell types, including HSPCs and lineage-restricted progenitors [10,12].
Does transcriptional heterogeneity in stem or progenitor populations also reflect differences in cell fate choice and functional potential? Weinreb et al. barcoded murine hematopoietic progenitors using heritable expressed lentiviral constructs, allowed cells to divide, and profiled clones using scRNA-seq immediately after barcoding or after differentiation in vitro or in vivo [13▪▪]. This allowed the authors to associate each clone with its fate, to identify differentially expressed genes shared by progenitors with a similar fate, and to estimate the predictability of terminal fate choice from scRNA-seq gene expression measurements in progenitor populations. Notably, the most informative gene sets for fate prediction were differentially expressed genes in progenitors with a shared fate (38% and 20% more informative than a random size-matched gene set for in vitro and in vivo experiments, respectively). In contrast, TFs were only 10% and 3% more informative than the random gene set in vitro and in vivo, respectively. It is likely that additional factors not captured by scRNA-seq, such as chromatin potential or protein expression, also contribute to fate determination. This notion is further reinforced by the fact that HSPC heterogeneity is not merely stochastic, since it appears to be propagated in serial transplantation experiments.
However, it is also possible that the low expression of TFs, together with the limited sensitivity of shallow-sequencing scRNA-seq (i.e. the median number of unique transcripts detected per cell in each sample ranged between 1720 and 6429), contributed to this result [13▪▪]. Despite its ability to detect large-scale changes in transcriptional programs, low-depth scRNA-seq might fail to distinguish genes with low expression that are functionally required in a cell from those that are not expressed. Indeed, a comparison of four scRNA-seq datasets profiling HSPCs showed that the most abundant transcripts were the main determinants of single-cell cluster assignment [16▪▪]. In contrast, critical hematopoietic genes measured by scRNA-seq, such as Gata1, Cebpa, Runx1, Myb, Zfpm1, Meis1, Mpo, and Gypa, had a negligible influence. However, at a functional level, many of these key TFs and other factors have been shown to be critical determinants of cell fate.
The main contributor of zero counts in scRNA-seq data is the sequencing depth per cell, which directly contributes to the total number of unique transcripts detected per cell. An empirically determined threshold on the minimal number of unique transcripts detected per cell attempts to mitigate this issue [15,18]. As a consequence, the resulting per gene count distributions more closely resemble those from single-molecule RNA fluorescence in situ hybridization (smRNA FISH), considered to be the gold standard to recapitulate ground truth RNA distributions (Fig. 2A). Moreover, recent advances in single-cell genomics have achieved significantly increased molecular recovery without compromising cell number throughput, allowing for the profiling of hundreds of thousands of unique transcripts in a cell with deep sequencing coverage [19–21]. These considerations of sequencing depth and transcript coverage are critical to faithfully probe the complete transcriptional landscape of hematopoiesis, particularly given the now-routine use of scRNA-seq to profile subpopulations within the hematopoietic hierarchy. Without adequate functional assays following transcriptomic studies, technical noise creates the risk of misassignment of the specificity of certain markers to isolate subpopulations of interest.
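The sensitivity argument above can be made concrete with a minimal sketch. It assumes, purely for illustration, that each transcript molecule in a cell is captured independently with the same probability; the function name `detection_probability`, the copy numbers, and the capture efficiencies are hypothetical and are not taken from any of the cited datasets.

```python
def detection_probability(copies_per_cell, capture_efficiency):
    """Probability of observing at least one molecule of a transcript.

    Assumes each of the `copies_per_cell` molecules is captured
    independently with probability `capture_efficiency` (binomial sampling).
    """
    return 1.0 - (1.0 - capture_efficiency) ** copies_per_cell


if __name__ == "__main__":
    # Hypothetical copy numbers: a lowly expressed TF vs. an abundant transcript.
    for label, copies in [("low-expression TF", 5), ("abundant transcript", 500)]:
        for efficiency in (0.05, 0.30):
            p = detection_probability(copies, efficiency)
            print(f"{label:20s} copies={copies:3d} efficiency={efficiency:.2f} "
                  f"P(detected)={p:.3f}")
```

Under these illustrative numbers, a transcript present at five copies is detected in roughly 23% of cells at 5% capture efficiency and in roughly 83% of cells at 30%, whereas the abundant transcript is detected essentially always. A functionally required TF can therefore be indistinguishable from an unexpressed gene in most cells of a shallow dataset.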
INTRINSIC TRANSCRIPTIONAL NOISE IN THE REGULATORS OF HEMATOPOIESIS
As a result of the stochastic activation and inactivation of promoters, the transcription of genes is thought to occur in ‘bursts’, with frequent episodes of monoallelic expression [22,23]. Beyond the aforementioned technical factors, this phenomenon makes mammalian gene expression inherently noisy and further increases the challenge to define transcriptionally homogeneous populations, even when cells may truly be in the identical cell state.
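Promoter switching of this kind is commonly formalized as a two-state ("telegraph") model, which is not described in the studies cited here and is used below only as an illustration. The sketch simulates one allele with the Gillespie algorithm; the function name `simulate_telegraph` and all rate constants are hypothetical choices, not measured values.

```python
import random


def simulate_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state (telegraph) promoter.

    Returns the time-weighted mean and Fano factor (variance / mean)
    of the mRNA copy number over the simulated interval.
    """
    rng = random.Random(seed)
    t, promoter_on, mrna = 0.0, False, 0
    sum_w = sum_x = sum_x2 = 0.0  # time-weighted accumulators
    while True:
        # Propensities of the possible reactions in the current state.
        a_switch = k_off if promoter_on else k_on
        a_syn = k_syn if promoter_on else 0.0
        a_deg = k_deg * mrna
        a_total = a_switch + a_syn + a_deg
        # Waiting time to the next reaction is exponentially distributed.
        dt = rng.expovariate(a_total)
        last = t + dt >= t_end
        if last:
            dt = t_end - t
        sum_w += dt
        sum_x += dt * mrna
        sum_x2 += dt * mrna * mrna
        if last:
            break
        t += dt
        # Choose which reaction fires, proportionally to its propensity.
        r = rng.random() * a_total
        if r < a_switch:
            promoter_on = not promoter_on
        elif r < a_switch + a_syn:
            mrna += 1
        elif mrna > 0:
            mrna -= 1
    mean = sum_x / sum_w
    return mean, (sum_x2 / sum_w - mean * mean) / mean


if __name__ == "__main__":
    # Hypothetical rates: rare, short, intense bursts vs. a near-constitutive promoter.
    print("bursty      ", simulate_telegraph(0.1, 1.0, 50.0, 1.0, 5000.0, seed=1))
    print("constitutive", simulate_telegraph(100.0, 0.01, 5.0, 1.0, 5000.0, seed=1))
```

Both parameter sets give a similar mean copy number, but the bursty promoter yields a Fano factor far above 1, whereas the near-constitutive promoter stays close to the Poisson value of 1. Cells in the same regulatory state can thus show very different snapshot counts for a bursty gene.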
Wheat et al. leveraged smRNA FISH on murine HSPCs to shed light on how robust hematopoietic cell fates arise within such intrinsically noisy biological systems [16▪▪]. To this end, they undertook characterization of the stochastic transcriptional noise present for three central regulators of hematopoiesis: Pu.1/Spi1, Gata1, and Gata2. The PU.1 (SPI1) TF is a negative regulator of erythroid differentiation and is required for terminal myeloid differentiation. Conversely, GATA1 and GATA2 play crucial roles in erythroid differentiation, and their mutations have been linked to several hematological disorders [26–28]. Together, these TFs have been conceptualized as giving rise to a minimal regulatory network, given their opposing effects on erythroid and granulocyte/monocyte cell fate commitment [29,30].
Despite their antagonistic roles, smRNA FISH measurements showed that stochastic transcriptional bursting in HSPCs and common myeloid progenitors often resulted in co-expression of Pu.1 and the Gata TFs (Fig. 2B) [16▪▪]. Time-lapse microscopy tracking of individual HSPC clones showed that their progeny tended to be in related transcriptional states, suggesting a certain degree of transcriptional priming. However, the dynamics of the system were best explained by a model in which stochastic and reversible transitions occurred between states defined by the expression of the aforementioned TFs (i.e. Gata1/2hi, Gata2hi, Pu.1hi states, and a low expression state for the three TFs). This suggested that stochastic processes dominated by intrinsic noise could underlie the seemingly deterministic behaviors of hematopoiesis. As such, intrinsic noise derived from transcriptional bursting would facilitate the transcriptional plasticity required for balancing differentiation and self-renewal in stem cells. Although this work reported an approach with high molecular sensitivity, defining cell states with only three TFs could not fully predict the past or future states of a cell. Future studies that characterize stochastic transcriptional noise in thousands of genes with single-cell resolution might uncover how such noise operates more globally during hematopoiesis. The insights uncovered by these studies also warrant caution against the definition of discrete subpopulations based on the expression of a handful of genes (or often a single one) in scRNA-seq, especially given the stochastic transitions between related states.
FUNCTIONAL CHARACTERIZATION OF SUBPOPULATIONS DEFINED USING SINGLE-CELL TRANSCRIPTOMICS
Continuous and stochastic models of regulation based on expression dynamics are also influencing new approaches to isolate functionally relevant HSPC populations. Figure 3 illustrates the general workflow from identification of candidate subpopulations using single-cell genomics, to flow cytometric isolation, and functional validation. As such, devising relevant functional validation studies and understanding their limitations is a critical step in this process, since it determines the contexts to which conclusions from markers or regulators can be extended.
LT-HSCs have been operationally defined based on their ability to give rise to multiple lineages for more than 16 weeks upon primary transplantation and at least in a subsequent secondary round of transplantation. In recent years, panels of surface antigens have been proposed to isolate human LT-HSCs from ex vivo cultures, such as EPCR (CD201) and ITGA3 (CD49c) [32,33].
To simplify such complex antibody panels, Lehnertz et al. devised a reporter of hepatic leukemia factor (HLF) expression to isolate functional human LT-HSCs [34▪]. The HLF TF is strongly and characteristically expressed in immunophenotypically defined LT-HSCs [34▪]. Furthermore, HSPCs from Hlf-knockout mice exhibit a reduced ability to reconstitute hematopoiesis upon serial transplantation [35,36]. Moreover, HLF is among the six TFs that were used to reprogram committed mouse progenitors into induced HSCs. To this end, a fluorescent reporter gene was edited in the endogenous HLF locus using CRISPR genome editing in human cord blood HSPCs [34▪]. Reporter-positive cells expressed characteristic surface antigens of LT-HSCs. Upon transplantation into immunodeficient mice, reporter-positive cells indeed showed high reconstitution capacity and multipotency. This is the first report of transgenic labeling of human HSCs, demonstrating the potential of using the transcriptome to isolate functionally meaningful subpopulations.
Single-cell transcriptomics and endogenous reporters were also leveraged to identify the expression of the Tcf15 TF as characteristic of mouse LT-HSCs [38▪]. Rodriguez-Fraticelli et al. used single-cell transcriptomics and lentivirally barcoded hematopoietic progenitors during long-term bone marrow reconstitution. This allowed the authors to characterize the transcriptional signatures of low-output HSC clones (i.e. clones with a low ratio of committed progenitors relative to HSCs). Tcf15 was one of the TFs that characterized low-output HSCs compared to their high-output counterparts. Of note, the transcriptional signature of low-output HSCs in this study resembled that of published immunophenotypically defined murine LT-HSCs. One of the hits of an in vivo CRISPR screen of genes that affected HSC output was Tcf15, whose knockout resulted in the loss of immunophenotypic LT-HSCs and impaired long-term engraftment potential in secondary transplantation. In turn, overexpression of Tcf15 caused a 20.8-fold increase in the frequency of LT-HSCs, and reporter-positive cells showed higher reconstitution activity than negative cells upon transplantation. Taken together, this work supports a critical role of Tcf15 in the long-term repopulation potential of murine LT-HSCs.
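The notion of clone output used above (the ratio of committed progenitors to HSCs within a barcoded clone) can be sketched as follows. This is only an illustration of the ratio as described here; the function name `clone_output`, the cell-type labels, and the toy barcodes are hypothetical, and the original study may have normalized or thresholded output differently.

```python
from collections import Counter


def clone_output(cells):
    """Ratio of committed progenitors to HSCs for each barcoded clone.

    `cells` is an iterable of (barcode, cell_type) pairs, where cell_type
    is "HSC" or any other label (treated as a committed progenitor).
    Clones without any detected HSC are omitted, since the ratio is undefined.
    """
    hsc = Counter()
    progenitor = Counter()
    for barcode, cell_type in cells:
        if cell_type == "HSC":
            hsc[barcode] += 1
        else:
            progenitor[barcode] += 1
    return {barcode: progenitor[barcode] / n_hsc for barcode, n_hsc in hsc.items()}


if __name__ == "__main__":
    # Toy data: clone A is low-output, clone B is high-output.
    toy = ([("A", "HSC")] * 4 + [("A", "progenitor")] * 2
           + [("B", "HSC")] * 1 + [("B", "progenitor")] * 6)
    print(clone_output(toy))
```

Transcriptional signatures of low-output and high-output HSCs can then be compared by grouping the HSCs of each clone according to this ratio.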
Using single-cell transcriptomics, flow cytometry, and functional validation studies, Amann-Zalcenstein et al. identified a subpopulation within mouse lymphoid-primed multipotent progenitors (LMPPs) that harbors almost fully restricted lymphoid potential [39▪]. As a whole, LMPPs have been shown to be heterogeneous in their fate bias, comprising progenitors biased toward lymphoid, myeloid, and dendritic cell fates. This heterogeneity was also present transcriptionally, with lymphoid-like and myeloid-like clusters. The Dach1 TF was differentially expressed in the myeloid-associated cluster, positively correlated with myeloid and stem-like genes, and negatively correlated with lymphoid genes. An endogenous Dach1 reporter showed high expression in LT-HSCs, heterogeneity in its expression within the LMPP compartment, and negative expression within the more committed common lymphoid progenitors. This prompted the hypothesis that reporter-negative LMPPs from Dach1 reporter mice identified a transcriptionally distinct subpopulation with restricted lymphoid potential, termed lymphoid-primed progenitors (LPPs). Indeed, LPPs produced more T and B cells per clone and had minimal myeloid potential in vitro and in vivo compared to Dach1 reporter-positive cells. Notably, this population was not identifiable using standard HSPC markers or previously reported markers of lymphoid priming [39▪].
Taken together, these studies exemplify the power of coupling single-cell transcriptomics with functional validation studies for the study of regulation in hematopoiesis. At the same time, they also highlight limitations of current approaches. For instance, studies that require transplantation assays to assess clonogenic capacity assume that the regulation of hematopoiesis is similar in native and posttransplantation states. However, major differences exist between the two (i.e. normal hematopoiesis displays low individual HSC contribution, whereas posttransplantation hematopoiesis is dominated by a few HSC clones). Moreover, reporter expression may not always faithfully mimic endogenous expression, especially if posttranscriptional regulation of the endogenous transcript has a strong influence in determining protein levels, as we discuss in the final section.
FUTURE PERSPECTIVES: CHARTING HETEROGENEITY IN HEMATOPOIESIS ACROSS THE CENTRAL DOGMA
Single-cell transcriptomics has allowed unprecedented insights into the transcriptional heterogeneity in hematopoiesis, uncovering previously unnoticed functionally distinct subpopulations and regulators. Some of these markers hold the unique translational potential to prospectively isolate clinically relevant progenitors for ex vivo applications. However, in the future, it will be crucial to tease out the degree to which stochasticity is transferred across the central dogma in hematopoiesis, from RNA to protein. This is critical because reporter expression may not always faithfully mimic endogenous expression of the gene under study. Moreover, the delay created by nuclear export has been postulated as a buffer for transcriptional noise in RNA transcripts, which, together with the longer half-lives of proteins compared to their transcript counterparts, contributes to smaller variance in protein abundance across single cells [42,43]. Joint RNA and protein analysis in single cells holds the promise to uncover transcriptional regulation using correlations between TFs and putative target genes at the RNA level, and posttranscriptional regulation in the joint distributions of RNA and protein of a given gene. A recent example is the finding that BCL11A, a master regulator of fetal hemoglobin switching, is regulated at the level of messenger RNA translation in hematopoietic development by the RNA-binding protein LIN28B [44,45]. Moreover, posttranscriptional regulatory mechanisms may vary between distinct stages of development, adding another layer of complexity. As such, recent advances allow single-cell measurements of protein abundance, joint measurements of intracellular TFs and single-cell transcriptomes, joint chromatin and single-cell RNA profiling, joint chromatin and protein measurements, and joint profiling of histone modifications and single-cell transcriptomes [14,18,47–51].
Although these techniques will provide novel insights into additional dimensions of cell state, they will need to go hand in hand with relevant functional validation assays to unequivocally define clinically and biologically relevant cell states and their regulators.
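The buffering effect of long protein half-lives mentioned above can be illustrated with the standard two-stage (mRNA to protein) model of gene expression, which is a textbook simplification and not a model taken from the cited studies. The sketch assumes constitutive (Poisson) mRNA production and ignores nuclear export and bursting; the function name `protein_noise_cv2` and the example numbers are hypothetical.

```python
def protein_noise_cv2(mean_mrna, mean_protein, mrna_half_life, protein_half_life):
    """Squared coefficient of variation (CV^2) of protein abundance.

    Two-stage model with constitutive transcription:
        CV^2 = 1 / <protein> + (1 / <mRNA>) * t_m / (t_m + t_p)
    where t_m and t_p are the mRNA and protein lifetimes. Half-lives are
    proportional to lifetimes, so they can be used directly in the ratio.
    """
    time_averaging = mrna_half_life / (mrna_half_life + protein_half_life)
    return 1.0 / mean_protein + time_averaging / mean_mrna


if __name__ == "__main__":
    # Hypothetical gene: 10 mRNAs and 1000 proteins per cell on average.
    mrna_cv2 = 1.0 / 10  # Poisson mRNA noise for comparison
    for protein_half_life in (2.0, 20.0, 200.0):  # hours, illustrative only
        cv2 = protein_noise_cv2(10, 1000, 2.0, protein_half_life)
        print(f"protein half-life {protein_half_life:6.1f} h: "
              f"protein CV^2 = {cv2:.4f} (mRNA CV^2 = {mrna_cv2:.4f})")
```

In this simplified setting, the longer the protein half-life relative to that of its transcript, the more the protein averages over mRNA fluctuations. This is one reason why RNA-level heterogeneity need not translate into protein-level heterogeneity of the same magnitude.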
We thank members of the Sankaran laboratory for valuable discussions. Illustrations for all figures were created with Biorender.com.
Financial support and sponsorship
JDMR is supported by La Caixa Foundation and the Real Colegio Complutense at Harvard. The Sankaran laboratory is supported by the New York Stem Cell Foundation, a gift from the Lodish Family to Boston Children's Hospital, and National Institutes of Health Grants R01 DK103794, R01 HL146500, and R56 DK125234. VGS is a New York Stem Cell Foundation-Robertson Investigator.
Conflicts of interest
There are no conflicts of interest.
REFERENCES AND RECOMMENDED READING
Papers of particular interest, published within the annual period of review, have been highlighted as:
▪ of special interest
▪▪ of outstanding interest
1. Sender R, Milo R. The distribution of cellular turnover in the human body. Nat Med 2021; 27:45–48.
2. Duddu S, Chakrabarti R, Ghosh A, Shukla PC. Hematopoietic stem cell transcription factors in cardiovascular pathology. Front Genet 2020; 11:588602.
3. Shema E, Bernstein BE, Buenrostro JD. Single-cell and single-molecule epigenomics to uncover genome regulation at unprecedented resolution. Nat Genet 2019; 51:19–25.
4. Laurenti E, Göttgens B. From haematopoietic stem cells to complex differentiation landscapes. Nature 2018; 553:418–426.
5. Wagers AJ, Christensen JL, Weissman IL. Cell fate determination from stem cells. Gene Ther 2002; 9:606–612.
6. Macosko EZ, Basu A, Satija R, et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 2015; 161:1202–1214.
7. Hagemann-Jensen M, Ziegenhain C, Chen P, et al. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat Biotechnol 2020; 38:708–714.
8. Cao J, Spielmann M, Qiu X, et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 2019; 566:496–502.
9. Datlinger P, Rendeiro AF, Boenke T, et al. Ultra-high throughput single-cell RNA sequencing by combinatorial fluidic indexing. bioRxiv 2019; 2019.12.17.879304.
10. Pellin D, Loperfido M, Baricordi C, et al. A comprehensive single cell transcriptional landscape of human hematopoietic progenitors. Nat Commun 2019; 10:1–15.
11. Liggett LA, Sankaran VG. Unraveling hematopoiesis through the lens of genomics. Cell 2020; 182:1384–1400.
12. Rodriguez-Fraticelli AE, Camargo F. Systems analysis of hematopoiesis using single-cell lineage tracing. Curr Opin Hematol 2021; 28:18–27.
13▪▪. Weinreb C, Rodriguez-Fraticelli A, Camargo F, Klein AM. Lineage tracing on transcriptional landscapes links state to fate during differentiation. Science 2020; 367:eaaw3381.
14. Ma S, Zhang B, LaFave LM, et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 2020; 183:1103–1116.e20.
15. Torre E, Dueck H, Shaffer S, et al. Rare cell detection by single-cell RNA sequencing as guided by single-molecule RNA FISH. Cell Syst 2018; 6:171–179.e5.
16▪▪. Wheat JC, Sella Y, Willcockson M, et al. Single-molecule imaging of transcription dynamics in somatic stem cells. Nature 2020; 583:431–436.
17. Choi K, Chen Y, Skelly DA, Churchill GA. Bayesian model selection reveals biological origins of zero inflation in single-cell transcriptomics. Genome Biol 2020; 21:183.
18. Specht H, Emmott E, Petelski AA, et al. Single-cell proteomic and transcriptomic analysis of macrophage heterogeneity using SCoPE2. Genome Biol 2021; 22:50.
19. Hughes TK, Wadsworth MH, Gierahn TM, et al. Second-strand synthesis-based massively parallel scRNA-Seq reveals cellular states and molecular features of human inflammatory skin pathologies. Immunity 2020; 53:878–894.e7.
20. Mulqueen RM, Pokholok D, O’Connell BL, et al. High-content single-cell combinatorial indexing. bioRxiv 2021; 2021.01.11.425995.
21. Qiu Q, Hu P, Qiu X, et al. Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat Methods 2020; 17:991–1001.
22. Ochiai H, Hayashi T, Umeda M, et al. Genome-wide kinetic properties of transcriptional bursting in mouse embryonic stem cells. Sci Adv 2020; 6:6699–6716.
23. Larsson AJM, Johnsson P, Hagemann-Jensen M, et al. Genomic encoding of transcriptional burst kinetics. Nature 2019; 565:251–254.
24. Lenstra TL, Rodriguez J, Chen H, Larson DR. Transcription dynamics in living cells. Annu Rev Biophys 2016; 45:25–47.
25. Kastner P, Chan S. PU.1: A crucial and versatile player in hematopoiesis and leukemia. Int J Biochem Cell Biol 2008; 40:22–27.
26. Sankaran VG, Ghazvinian R, Do R, et al. Exome sequencing identifies GATA1 mutations resulting in Diamond-Blackfan anemia. J Clin Investig 2012; 122:2439–2443.
27. Crispino JD, Horwitz MS. GATA factor mutations in hematologic disease. Blood 2017; 129:2103–2110.
28. Abdulhay N, Fiorini C, Verboon J, Ludwig L, et al. Impaired human hematopoiesis due to a cryptic intronic GATA1 splicing mutation. J Exp Med 2019; 216:1050–1060.
29. Burda P, Laslo P, Stopka T. The role of PU.1 and GATA-1 transcription factors during normal and leukemogenic hematopoiesis. Leukemia 2010; 24:1249–1257.
30. Moriguchi T, Yamamoto M. A regulatory network governing Gata1 and Gata2 gene transcription orchestrates erythroid lineage differentiation. Int J Hematol 2014; 100:417–424.
31. Shah S, Takei Y, Zhou W, et al. Dynamics and spatial genomics of the nascent transcriptome by intron seqFISH. Cell 2018; 174:363–376.e16.
32. Tomellini E, Fares I, Lehnertz B, et al. Integrin-α3 is a functional marker of ex vivo expanded human long-term hematopoietic stem cells. Cell Rep 2019; 28:1063–1073.e5.
33. Fares I, Chagraoui J, Lehnertz B, et al. EPCR expression marks UM171-expanded CD34+ cord blood stem cells. Blood 2017; 129:3344–3351.
34▪. Lehnertz B, MacRae T, Chagraoui J, et al. HLF expression defines the human haematopoietic stem cell state. bioRxiv 2020; 2020.06.29.177709.
35. Wahlestedt M, Ladopoulos V, Hidalgo I, et al. Critical modulation of hematopoietic lineage fate by hepatic leukemia factor. Cell Rep 2017; 21:2251–2263.
36. Komorowska K, Doyle A, Wahlestedt M, et al. Hepatic leukemia factor maintains quiescence of hematopoietic stem cells and protects the stem cell pool during regeneration. Cell Rep 2017; 21:3514–3523.
37. Riddell J, Gazit R, Garrison BS, et al. Reprogramming committed murine blood cells to induced hematopoietic stem cells with defined factors. Cell 2014; 157:549–564.
38▪. Rodriguez-Fraticelli AE, Weinreb C, Wang SW, et al. Single-cell lineage tracing unveils a role for TCF15 in haematopoiesis. Nature 2020; 583:585–589.
39▪. Amann-Zalcenstein D, Tian L, Schreuder J, et al. A new lymphoid-primed progenitor marked by Dach1 downregulation identified with single cell multiomics. Nat Immunol 2020; 21:1574–1584.
40. Naik SH, Perié L, Swart E, et al. Diverse and heritable lineage imprinting of early haematopoietic progenitors. Nature 2013; 496:229–232.
41. Busch K, Rodewald HR. Unperturbed vs. posttransplantation hematopoiesis: both in vivo but different. Curr Opin Hematol 2016; 23:295–303.
42. Stoeger T, Battich N, Pelkmans L. Passive noise filtering by cellular compartmentalization. Cell 2016; 164:1151–1161.
43. Suter DM, Molina N, Gatfield D, et al. Mammalian genes are transcribed with widely different bursting kinetics. Science 2011; 332:472–474.
44. Basak A, Munschauer M, Lareau CA, et al. Control of human hemoglobin switching by LIN28B-mediated regulation of BCL11A translation. Nat Genet 2020; 52:138–145.
45. Sankaran VG, Menne TF, Xu J, et al. Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 2008; 322:1839–1842.
46. Magee J, Signer R. Developmental stage-specific changes in protein synthesis differentially sensitize hematopoietic stem cells and erythroid progenitors to impaired ribosome biogenesis. Stem Cell Rep 2021; 16:20–28.
47. Chung H, Parkhurst CN, Magee EM, et al. Simultaneous single cell measurements of intranuclear proteins and gene expression. bioRxiv 2021; 2021.01.18.427139.
48. Trevino AE, Müller F, Andersen J, et al. Chromatin and gene-regulatory dynamics of the developing human cerebral cortex at single-cell resolution. bioRxiv 2020; 2020.12.29.424636.
49. Stoeckius M, Hafemeister C, Stephenson W, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 2017; 14:865–868.
50. Mimitou EP, Lareau CA, Chen KY, et al. Scalable, multimodal profiling of chromatin accessibility and protein levels in single cells. bioRxiv 2020; 2020.09.08.286914.
51. Zhu C, Zhang Y, Li YE, et al. Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat Methods 2021; 18:283–292.