Abstract
Dynamic 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications to DNA regulate gene expression in a cell-type-specific manner and are associated with various biological processes, but the two modalities have not yet been measured simultaneously from the same genome at the single-cell level. Here we present SIMPLE-seq, a scalable, base resolution method for joint analysis of 5mC and 5hmC from thousands of single cells. Based on orthogonal labeling and recording of ‘C-to-T’ mutational signals from 5mC and 5hmC sites, SIMPLE-seq detects these two modifications from the same molecules in single cells and enables unbiased DNA methylation dynamics analysis of heterogeneous biological samples. We applied this method to mouse embryonic stem cells, human peripheral blood mononuclear cells and mouse brain to give joint epigenome maps at single-cell and single-molecule resolution. Integrated analysis of these two cytosine modifications reveals distinct epigenetic patterns associated with divergent regulatory programs in different cell types as well as cell states.
Data availability
Raw sequencing and processed data generated in this study are available from the Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo/) with accession number GSE197740 (ref. 113). Other external datasets were downloaded from the following sources. GEO, with the following accession numbers: WGBS and TAPS of mESC (GSE112520; ref. 55), RNA sequencing of mESC (2i and serum) (GSE23943; ref. 66), TAB-seq of mESC (GSE36173; ref. 46), Joint-snhmC-seq of mouse brain (GSE236798; ref. 82) and Paired-Tag of mouse brain (GSE152020; ref. 83). ArrayExpress, with the following accession number: scRNA-seq of mESC (E-MTAB-2600; ref. 114). 10x Genomics website: scRNA-seq of PBMCs (https://www.10xgenomics.com). ENCODE, with the following accession numbers: DNase-seq of E14 mESC (ENCSR000CMW), H3K4me1 ChIP-seq of E14 mESC (ENCSR000CGN), H3K27ac ChIP-seq of E14 mESC (ENCSR000CGQ), H3K4me1 ChIP-seq of immune cells (ENCSR777RWW (CD4+ T cells), ENCSR631BPS (CD8+ T cells), ENCSR214VUB (B cells), ENCSR963TKB (NK cells) and ENCSR400VWA (monocytes)), H3K4me3 ChIP-seq of immune cells (ENCSR263WLD (CD4+ T cells), ENCSR231FDF (CD8+ T cells), ENCSR269OVV (B cells), ENCSR570AUC (NK cells) and ENCSR796FCS (monocytes)), H3K27ac ChIP-seq of immune cells (ENCSR546SDM (CD4+ T cells), ENCSR835OJV (CD8+ T cells), ENCSR191ZQT (B cells), ENCSR391EQV (NK cells) and ENCSR012PII (monocytes)), H3K27me3 ChIP-seq of immune cells (ENCSR043SBG (CD4+ T cells), ENCSR797GOJ (CD8+ T cells), ENCSR522EGW (B cells), ENCSR939JZW (NK cells) and ENCSR080XUB (monocytes)), H3K9me3 ChIP-seq of immune cells (ENCSR453GNY (CD4+ T cells), ENCSR905SHH (CD8+ T cells), ENCSR295PSK (B cells), ENCSR021FSY (NK cells) and ENCSR236JVK (monocytes)), H3K36me3 ChIP-seq of immune cells (ENCSR828WZG (CD4+ T cells), ENCSR694CDP (CD8+ T cells), ENCSR789RGI (B cells), ENCSR519SOC (NK cells) and ENCSR244XWL (monocytes)) and ChromHMM states of NK cells (ENCSR972ZND).
Code availability
Custom scripts used for analyzing SIMPLE-seq datasets are available from GitHub (https://github.com/cxzhu/SIMPLE-seq) (ref. 115).
References
Kelsey, G., Stegle, O. & Reik, W. Single-cell epigenomics: recording the past and predicting the future. Science 358, 69–75 (2017).
Zhu, H., Wang, G. & Qian, J. Transcription factors as readers and effectors of DNA methylation. Nat. Rev. Genet. 17, 551–565 (2016).
Bhutani, N., Burns, D. M. & Blau, H. M. DNA demethylation dynamics. Cell 146, 866–872 (2011).
Wu, X. & Zhang, Y. TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet. 18, 517–534 (2017).
Luo, C., Hajkova, P. & Ecker, J. R. Dynamic DNA methylation: in the right place at the right time. Science 361, 1336–1340 (2018).
Smith, Z. D. & Meissner, A. DNA methylation: roles in mammalian development. Nat. Rev. Genet. 14, 204–220 (2013).
Horvath, S. & Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 19, 371–384 (2018).
Parry, A., Rulands, S. & Reik, W. Active turnover of DNA methylation during cell fate decisions. Nat. Rev. Genet. 22, 59–66 (2021).
Crawford, G. E. et al. Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 16, 123–131 (2006).
Buenrostro, J. D., Giresi, P. G., Zaba, L. C., Chang, H. Y. & Greenleaf, W. J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat. Methods 10, 1213–1218 (2013).
Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein–DNA interactions. Science 316, 1497–1502 (2007).
Lister, R. et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322 (2009).
ENCODE Project Consortium et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
Tang, F. et al. mRNA-seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Nagano, T. et al. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure. Nature 502, 59–64 (2013).
Jin, W. et al. Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples. Nature 528, 142–146 (2015).
Cusanovich, D. A. et al. Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348, 910–914 (2015).
Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).
Preissl, S. et al. Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation. Nat. Neurosci. 21, 432–439 (2018).
Rotem, A. et al. Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. 33, 1165–1172 (2015).
Harada, A. et al. A chromatin integration labelling method enables epigenomic profiling with lower input. Nat. Cell Biol. 21, 287–296 (2019).
Kaya-Okur, H. S. et al. CUT&Tag for efficient epigenomic profiling of small samples and single cells. Nat. Commun. 10, 1930 (2019).
Carter, B. et al. Mapping histone modifications in low cell number and single cells using antibody-guided chromatin tagmentation (ACT-seq). Nat. Commun. 10, 3747 (2019).
Ku, W. L. et al. Single-cell chromatin immunocleavage sequencing (scChIC-seq) to profile histone modification. Nat. Methods 16, 323–325 (2019).
Wang, Q. et al. CoBATCH for high-throughput single-cell epigenomic profiling. Mol. Cell 76, 206–216 (2019).
Ai, S. et al. Profiling chromatin states using single-cell itChIP-seq. Nat. Cell Biol. 21, 1164–1172 (2019).
Grosselin, K. et al. High-throughput single-cell ChIP-seq identifies heterogeneity of chromatin states in breast cancer. Nat. Genet. 51, 1060–1066 (2019).
Guo, H. et al. Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126–2135 (2013).
Mooijman, D., Dey, S. S., Boisset, J. C., Crosetto, N. & van Oudenaarden, A. Single-cell 5hmC sequencing reveals chromosome-wide cell-to-cell variability and enables lineage reconstruction. Nat. Biotechnol. 34, 852–856 (2016).
Zhu, C. et al. Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell 20, 720–731 (2017).
Wu, X., Inoue, A., Suzuki, T. & Zhang, Y. Simultaneous mapping of active DNA demethylation and sister chromatid exchange in single cells. Genes Dev. 31, 511–523 (2017).
Luo, C. et al. Single-cell methylomes identify neuronal subtypes and regulatory elements in mammalian cortex. Science 357, 600–604 (2017).
Lake, B. B. et al. Integrative single-cell analysis of transcriptional and epigenetic states in the human adult brain. Nat. Biotechnol. 36, 70–80 (2018).
Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380–1385 (2018).
Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat. Biotechnol. 37, 1452–1457 (2019).
Zhu, C. et al. An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat. Struct. Mol. Biol. 26, 1063–1070 (2019).
Luo, C. et al. Single nucleus multi-omics identifies human cortical cell regulatory genome diversity. Cell Genom. 2, 100107 (2022).
Xie, Y. et al. Droplet-based single-cell joint profiling of histone modifications and transcriptomes. Nat. Struct. Mol. Biol. 30, 1428–1433 (2023).
Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).
Farlik, M. et al. Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics. Cell Rep. 10, 1386–1397 (2015).
Mulqueen, R. M. et al. Highly scalable generation of DNA methylation profiles in single cells. Nat. Biotechnol. 36, 428–431 (2018).
Nichols, R. V. et al. High-throughput robust single-cell DNA methylation profiling with sciMETv2. Nat. Commun. 13, 7627 (2022).
Wangsanuwat, C., Chialastri, A., Aldeguer, J. F., Rivron, N. C. & Dey, S. S. A probabilistic framework for cellular lineage reconstruction using integrated single-cell 5-hydroxymethylcytosine and genomic DNA sequencing. Cell Rep. Methods 1, 100060 (2021).
Yu, M. et al. Base-resolution analysis of 5-hydroxymethylcytosine in the mammalian genome. Cell 149, 1368–1380 (2012).
Zeng, H. et al. Bisulfite-free, nanoscale analysis of 5-hydroxymethylcytosine at single base resolution. J. Am. Chem. Soc. 140, 13190–13194 (2018).
Schutsky, E. K. et al. Nondestructive, base-resolution sequencing of 5-hydroxymethylcytosine using a DNA deaminase. Nat. Biotechnol. https://doi.org/10.1038/nbt.4204 (2018).
Sun, Z. et al. High-resolution enzymatic mapping of genomic 5-hydroxymethylcytosine in mouse embryonic stem cells. Cell Rep. 3, 567–576 (2013).
Liu, Y. et al. Subtraction-free and bisulfite-free specific sequencing of 5-methylcytosine and its oxidized derivatives at base resolution. Nat. Commun. 12, 618 (2021).
Cohen-Karni, D. et al. The MspJI family of modification-dependent restriction endonucleases for epigenetic studies. Proc. Natl Acad. Sci. USA 108, 11040–11045 (2011).
Vaisvila, R. et al. Enzymatic methyl sequencing detects DNA methylation at single-base resolution from picograms of DNA. Genome Res. 31, 1280–1289 (2021).
Sen, M. et al. Strand-specific single-cell methylomics reveals distinct modes of DNA demethylation dynamics during early mammalian development. Nat. Commun. 12, 1286 (2021).
Fullgrabe, J. et al. Simultaneous sequencing of genetic and epigenetic bases in DNA. Nat. Biotechnol. 41, 1457–1464 (2023).
Liu, Y. B. et al. Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nat. Biotechnol. 37, 424–429 (2019).
Kriaucionis, S. & Heintz, N. The nuclear DNA base 5-hydroxymethylcytosine is present in Purkinje neurons and the brain. Science 324, 929–930 (2009).
Booth, M. J. et al. Quantitative sequencing of 5-methylcytosine and 5-hydroxymethylcytosine at single-base resolution. Science 336, 934–937 (2012).
Xia, B. et al. Bisulfite-free, base-resolution analysis of 5-formylcytosine at the genome scale. Nat. Methods 12, 1047–1050 (2015).
Ito, S. et al. Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine. Science 333, 1300–1303 (2011).
Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360, 176–182 (2018).
Daley, T. & Smith, A. D. Predicting the molecular complexity of sequencing libraries. Nat. Methods 10, 325–327 (2013).
Sim, Y. J. et al. 2i maintains a naive ground state in ESCs through two distinct epigenetic mechanisms. Stem Cell Rep. 8, 1312–1328 (2017).
Hashimoto, H. et al. Recognition and potential mechanisms for replication and erasure of cytosine hydroxymethylation. Nucleic Acids Res. 40, 4841–4849 (2012).
Yildirim, O. et al. Mbd3/NURD complex regulates expression of 5-hydroxymethylcytosine marked genes in embryonic stem cells. Cell 147, 1498–1510 (2011).
Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).
Marks, H. et al. The transcriptional and epigenomic foundations of ground state pluripotency. Cell 149, 590–604 (2012).
Kobayashi, T. et al. The cyclic gene Hes1 contributes to diverse differentiation responses of embryonic stem cells. Genes Dev. 23, 1870–1875 (2009).
Bylund, M., Andersson, E., Novitch, B. G. & Muhr, J. Vertebrate neurogenesis is counteracted by Sox1–3 activity. Nat. Neurosci. 6, 1162–1168 (2003).
Takahashi, K. & Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676 (2006).
Habibi, E. et al. Whole-genome bisulfite sequencing of two distinct interconvertible DNA methylomes of mouse embryonic stem cells. Cell Stem Cell 13, 360–369 (2013).
Tahiliani, M. et al. Conversion of 5-methylcytosine to 5-hydroxymethylcytosine in mammalian DNA by MLL partner TET1. Science 324, 930–935 (2009).
McLean, C. Y. et al. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 28, 495–501 (2010).
Spruijt, C. G. et al. Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives. Cell 152, 1146–1159 (2013).
Blondel, V. D., Guillaume, J. L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. P10008 (2008).
Nish, S. A. et al. CD4+ T cell effector commitment coupled to self-renewal by asymmetric cell divisions. J. Exp. Med. 214, 39–47 (2017).
Wang, K., Wei, G. & Liu, D. CD19: a biomarker for B cell development, lymphoma diagnosis and therapy. Exp. Hematol. Oncol. 1, 36 (2012).
Egwuagu, C. E. STAT3 in CD4+ T helper cell differentiation and inflammatory diseases. Cytokine 47, 149–156 (2009).
Tsukumo, S. et al. Bach2 maintains T cells in a naive state by suppressing effector memory-related genes. Proc. Natl Acad. Sci. USA 110, 10735–10740 (2013).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Hellman, A. & Chess, A. Gene body-specific methylation on the active X chromosome. Science 315, 1141–1143 (2007).
Stroud, H., Feng, S., Morey Kinney, S., Pradhan, S. & Jacobsen, S. E. 5-Hydroxymethylcytosine is associated with enhancers and gene bodies in human embryonic stem cells. Genome Biol. 12, R54 (2011).
Fabyanic, E. B. et al. Joint single-cell profiling resolves 5mC and 5hmC and reveals their distinct gene regulatory effects. Nat. Biotechnol. https://doi.org/10.1038/s41587-022-01652-0 (2023).
Zhu, C. et al. Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat. Methods 18, 283–292 (2021).
Theriault, F. M., Roy, P. & Stifani, S. AML1/Runx1 is important for the development of hindbrain cholinergic branchiovisceral motor neurons and selected cranial sensory neurons. Proc. Natl Acad. Sci. USA 101, 10343–10348 (2004).
Matuzelski, E. et al. Transcriptional regulation of Nfix by NFIB drives astrocytic maturation within the developing spinal cord. Dev. Biol. 432, 286–297 (2017).
Viswanathan, R. et al. DARESOME enables concurrent profiling of multiple DNA modifications with restriction enzymes in single cells and cell-free DNA. Sci. Adv. 9, eadi0197 (2023).
Chialastri, A., Sarkar, S., Schauer, E. E., Lamba, S. & Dey, S. S. Combinatorial quantification of 5mC and 5hmC at individual CpG dyads and the transcriptome in single cells reveals modulators of DNA methylation maintenance fidelity. Preprint at bioRxiv https://doi.org/10.1101/2023.05.06.539708 (2023).
Shahjalal, H. M., Abdal Dayem, A., Lim, K. M., Jeon, T. I. & Cho, S. G. Generation of pancreatic β cells for treatment of diabetes: advances and challenges. Stem Cell Res. Ther. 9, 355 (2018).
Lareau, C. A. et al. Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat. Biotechnol. 37, 916–924 (2019).
Datlinger, P. et al. Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing. Nat. Methods 18, 635–642 (2021).
Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods 14, 865–868 (2017).
Li, G. et al. Joint profiling of DNA methylation and chromatin architecture in single cells. Nat. Methods 16, 991–993 (2019).
Lee, D. S. et al. Simultaneous profiling of 3D genome structure and DNA methylation in single human cells. Nat. Methods 16, 999–1006 (2019).
Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).
Chung, H. et al. Joint single-cell measurements of nuclear proteins and RNA in vivo. Nat. Methods 18, 1204–1212 (2021).
Zhang, B. et al. Characterizing cellular heterogeneity in chromatin state with scCUT&Tag-pro. Nat. Biotechnol. 40, 1220–1230 (2022).
Chen, A. F. et al. NEAT-seq: simultaneous profiling of intra-nuclear proteins, chromatin accessibility and gene expression in single cells. Nat. Methods 19, 547–553 (2022).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
Qiu, P. Embracing the dropouts in single-cell RNA-seq analysis. Nat. Commun. 11, 1169 (2020).
Xu, H. et al. Modular oxidation of cytosine modifications and their application in direct and quantitative sequencing of 5-hydroxymethylcytosine. J. Am. Chem. Soc. 145, 7095–7100 (2023).
Schmitt, M. W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl Acad. Sci. USA 109, 14508–14513 (2012).
Chan, K. C. et al. Noninvasive detection of cancer-associated genome-wide hypomethylation and copy number aberrations by plasma DNA bisulfite sequencing. Proc. Natl Acad. Sci. USA 110, 18761–18768 (2013).
Song, C. X. et al. 5-Hydroxymethylcytosine signatures in cell-free DNA provide information about tumor types and stages. Cell Res. 27, 1231–1242 (2017).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Krueger, F. Trim Galore. https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ (2019).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).
Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
Shannon, C. E. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948).
Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell 38, 576–589 (2010).
Sherman, B. T. et al. DAVID: a web server for functional enrichment analysis and functional annotation of gene lists (2021 update). Nucleic Acids Res. 50, W216–W221 (2022).
Bai, D., Zhu, C. & Yi, C. Single-cell joint analysis of 5-methylcytosine and 5-hydroxymethylcytosine. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE197740 (2023).
Kolodziejczyk, A. A. et al. Single cell RNA-sequencing of pluripotent states unlocks modular transcriptional variation. Cell Stem Cell 17, 471–485 (2015).
Bai, D., Zhu, C. & Yi, C. Custom scripts and pipeline for SIMPLE-seq data analysis. https://github.com/cxzhu/SIMPLE-seq (2023).
Acknowledgements
We thank Y. Zhuang, J. Song, H. Zeng, C.-G. Ji, H.-W. Meng and Z.-R. Xu for technical assistance; P. Du and F.-C. Tang (Peking University) for providing mESCs; G.-L. Xu (Chinese Academy of Sciences) for providing mTET1CD plasmid; J.-Y. Xiao (Peking University) for protein purification assistance; and Z. Xi (Nankai University) for discussion. We thank the National Center for Protein Sciences at Peking University for technical help. We carried out data analysis on the High-Performance Computing Platform at the School of Life Sciences, Peking University. This study is supported by the Ministry of Science and Technology of China (no. 2023YFC3402200, no. 2019YFA0110900 and no. 2019YFA0802201 to C.Y.), the Beijing Natural Science Foundation (no. Z220013 to C.Y.) and the National Natural Science Foundation of China (no. 91953201 and no. 21825701 to C.Y.).
Author information
Contributions
D.B., C.Z. and C.Y. conceived and designed the study and wrote the paper. D.B. developed and optimized the SIMPLE-seq protocol and generated the data. C.Z. performed pilot labeling experiments, with help from D.B., and data analysis, with help from D.B. and X.Z. All authors discussed the results and edited the paper. C.Y. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks the anonymous reviewers for their contribution to the peer review of this work.
Extended data
Extended Data Fig. 1 Sequential chemical labeling enables simultaneous detection of 5mC and 5hmC bases on the same molecules.
a, Overview of TAPS and hmC-CATCH. b, Sanger sequencing results showing the ‘C-to-T’ conversion signal from a model oligonucleotide sequence (T5MH) before treatment, after 5hmC labeling, and after both 5hmC and 5mC labeling. c, Schematics of chemical labeling for 5hmC and 5mC. d, qPCR result of lambda DNA (25,086–25,308 bp) before and after potassium ruthenate (K2RuO4) treatment; n = 3 (Treated), n = 3 (Control). Data are presented as 6.020 ± 0.006 (Treated) and 5.957 ± 0.026 (Control). e, Agarose gel images of fragmented lambda dsDNA treated with the 5hmC-labeling reaction only (left panel) or with sequential 5hmC and 5mC labeling (right panel). Experiment was performed once. f, Barplot showing the distribution of C-to-T mutation rates for unmethylated CH sites and methylated CG sites. g, C-to-T mutation signals on both strands of T1M spike-in model DNA; symmetrically methylated CG sites are indicated in red on the sequences below. h, Genome browser view showing the sequenced reads aligned to spike-in model DNA (upper: T1M with 5mCG, with positive and negative strands displayed separately; bottom left: T2H with a single 5hmCG site; bottom right: T3MH with both a 5mC site and a 5hmCG site at known positions). C-to-T is colored in red for positive strands and G-to-A is colored in green for negative strands. Conversion rates are estimated from all the modified cytosines of spike-in model DNA. i, Nuclei clumps and the resolved single-nucleus suspension after brief sonication, viewed under a bright-field microscope. Experiment was performed once. Scale bars, 100 μm. j, Enrichment analysis of genome coverage by SIMPLE-seq and DNase-seq on DHS (DNase I hypersensitive sites). k, Table showing tagmentation efficiency under different Tn5 reaction conditions, including Tn5 with hyperactive mutations, working concentrations and reaction buffers; right, fragment analysis result under the optimal condition.
l-n, Standard curves for mixed oligo DNA with (l) 5mC and C, (m) 5hmC and C, and (n) 5mC and 5hmC, plotted based on gradient mixing ratios (0:10; 2:8; 4:6; 6:4; 8:2; 10:0). o, C-to-T mutation rate estimated from 10-kb non-overlapping bins across the whole genome. For both boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× interquartile range (IQR). For 5mC, minima = 1.12%, maxima = 1.47%; for 5hmC, minima = 0.75%, maxima = 2.43%. 5mC, n = 156,755; 5hmC, n = 159,962. p and q, Scatter plots showing (p) the number of 5hmC reads mapped to the human and mouse genomes and (q) the fraction of 5mC and 5hmC reads mapped to the human and mouse genomes in each cell from the species-mixing experiment, with twin axes. r, Stacked barplot showing the fraction of reads mapped to the reference genome that were assigned to 5mC, assigned to 5hmC or could not be assigned. s-v, Comparisons between SIMPLE-seq and published single-cell and bulk 5mC and 5hmC sequencing methods: (s) fraction of mappable reads, (t) the number of 5mCG sites detected and total sequenced reads in each study, (u) the number of 5hmCG sites detected and total sequenced reads in each study and (v) dot plot showing the average number of covered CGs and sequenced reads for each cell in each study. w, Line plots showing the number of uniquely mapped reads or covered CG dinucleotides at different sequenced read depths per cell in each study.
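The strand convention described in panel h (C-to-T on positive strands, G-to-A on negative strands) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function names `conversion_counts` and `conversion_rate` are hypothetical, and the code assumes a gapless alignment in which the read has already been oriented to the reference strand.

```python
from collections import Counter


def conversion_counts(ref: str, read: str, strand: str) -> Counter:
    """Count converted and unconverted cytosines for one gapless alignment.

    Assumption: `ref` and `read` are equal-length, reference-oriented
    sequences with no indels. On '+' reads a labeled C is read as T; on '-'
    reads the same event appears as G-to-A on the reference strand.
    """
    if len(ref) != len(read):
        raise ValueError("ref and read must have equal length")
    if strand == "+":
        base, converted = "C", "T"
    elif strand == "-":
        base, converted = "G", "A"
    else:
        raise ValueError("strand must be '+' or '-'")

    counts = Counter()
    for r, q in zip(ref.upper(), read.upper()):
        if r != base:
            continue  # only reference C (or G on the minus strand) is informative
        if q == converted:
            counts["converted"] += 1
        elif q == base:
            counts["unconverted"] += 1
        # any other base is treated as a sequencing error and ignored
    return counts


def conversion_rate(counts: Counter) -> float:
    """Fraction of informative positions that show the conversion signal."""
    total = counts["converted"] + counts["unconverted"]
    return counts["converted"] / total if total else float("nan")


if __name__ == "__main__":
    plus = conversion_counts("ACGTCCGA", "ATGTCTGA", "+")
    print(plus["converted"], plus["unconverted"], conversion_rate(plus))
```

Summing such counts over all reads that fall in a genomic bin would give a per-bin mutation rate of the kind summarized in panel o, under the same assumptions.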
Extended Data Fig. 2 Joint analysis of 5mC and 5hmC from single cells.
a, Barplot showing the enrichment of 5mCG and 5hmCG sites detected by SIMPLE-seq, WGBS and TAB-seq over different genomic regions. b, Boxplots showing the 5mC or 5hmC modification levels on different genomic regions from bisulfite sequencing, TAB-seq and SIMPLE-seq. For all boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× IQR. The minima/maxima/numbers of elements of all boxplots: 60.76%/74.60%/157,324 (Alu,(1)), 58.52%/68.61%/137,869 (Alu,(2)), 3.71%/10.71%/37,810 (Alu,(3)), 0.00%/10.91%/8,332 (Alu,(4)), 14.71%/67.84%/5,828 (CGI,(1)), 12.36%/54.72%/3,662 (CGI,(2)), 1.96%/5.87%/1,059 (CGI,(3)), 1.45%/5.24%/304 (CGI,(4)), 6.02%/72.62%/45,014 (Intron,(1)), 8.48%/79.35%/30,294 (Intron,(2)), 4.19%/5.01%/7,333 (Intron,(3)), 3.41%/4.85%/1,669 (Intron,(4)), 70.19%/75.69%/124,777 (L1,(1)), 64.68%/94.66%/103,247 (L1,(2)), 6.89%/10.32%/22,350 (L1,(3)), 4.90%/8.53%/5,414 (L1,(4)), 67.40%/75.88%/26,295 (L2,(1)), 68.26%/87.71%/18,819 (L2,(2)), 1.86%/7.30%/4,342 (L2,(3)), 2.20%/6.31%/875 (L2,(4)), 64.16%/73.22%/909 (LCP,(1)), 57.34%/79.78%/739 (LCP,(2)), 2.88%/10.32%/171 (LCP,(3)), 2.56%/7.71%/62 (LCP,(4)), 70.15%/75.10%/194,933 (LINE,(1)), 65.93%/81.40%/177,226 (LINE,(2)), 3.79%/7.49%/34,376 (LINE,(3)), 3.99%/5.61%/9,412 (LINE,(4)), 68.59%/75.74%/189,543 (LTR,(1)), 60.01%/72.15%/170,132 (LTR,(2)), 2.85%/7.75%/32,147 (LTR,(3)), 3.24%/5.59%/10,073 (LTR,(4)), 68.60%/74.58%/47,771 (MIR,(1)), 66.88%/85.86%/30,355 (MIR,(2)), 2.88%/9.75%/7,707 (MIR,(3)), 0.00%/7.62%/1,520 (MIR,(4)), 62.68%/74.58%/325,867 (SINE,(1)), 58.25%/73.48%/302,881 (SINE,(2)), 4.93%/9.37%/56,712 (SINE,(3)), 0.00%/9.52%/19,715 (SINE,(4)), 62.82%/79.57%/4,651 (H3K9me3,(1)), 48.15%/94.31%/3,514 (H3K9me3,(2)), 3.79%/12.09%/862 (H3K9me3,(3)), 0.00%/11.97%/226 (H3K9me3,(4)), 14.28%/69.99%/21,711 (DNase,(1)), 12.70%/70.84%/15,340 (DNase,(2)), 0.00%/10.11%/3,961 (DNase,(3)), 0.00%/6.10%/973 (DNase,(4)).
c, Venn diagram showing the overlap of 5mCG sites between SIMPLE-seq and TAPS. P-value, two-sided Fisher’s exact test. d, Venn diagram showing the overlap of 5hmCG sites between SIMPLE-seq and TAB-seq. P-value, two-sided Fisher’s exact test. e, Stacked barplot showing the fraction of called 5hmC sites that overlap with a called 5mC site (grey, 5hmC-shared) and the fraction that do not (blue, 5hmC-only). f, Boxplot showing the 5hmC modification levels of 5mC-5hmC shared sites and 5hmC-only sites. For both boxplots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× IQR. For 5mC-5hmC shared sites, minima = 0.01, maxima = 1.00, n = 622,323 sites; for 5hmC-only sites, minima = 0.01, maxima = 1.00, n = 435,143 sites. P-value, two-sided Fisher’s exact test. g, Barplot showing the enrichment of 5mC-5hmC shared sites and 5hmC-only sites over different genomic regions. h, Barplot showing the relative enrichment of 5hmC-only sites over 5mC-5hmC shared sites in different genomic regions. i-k, UMAP embedding showing cells based on their (i) 5mCG, (j) 5mCHG and (k) 5hmCHG levels (in 100-kb non-overlapping bins). Each dot represents a single cell and is colored according to its original identity. l, Assignment of 2i mES cells and serum mES cells into two distinct clusters by unsupervised clustering. m, Silhouette plot evaluating the degree of separation of the clusters based on 5mC or 5hmC. n, Line plots showing the cumulative coverage of 10-kb non-overlapping bins at different depths for 5mC (green) and 5hmC (blue) from different numbers of single cells. The shaded area shows the error range from five randomly sampled cell sets. o, Smoothed line plots showing the 5hmCG levels around genic regions of genes with different expression levels (using the smooth.spline function with parameter df = 30).
p-q, Line plots showing the relationships between promoter 5mCG and 5hmCG modification levels and gene expression levels in (p) 2i mES cells and (q) serum mES cells.
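The boxplot convention used throughout these captions (hinges at the 25th and 75th percentiles, a median line, whiskers limited to 2× IQR) can be made concrete with a short Python sketch. The function name `box_stats` is hypothetical, and two details are assumptions rather than statements from the captions: whiskers are taken to end at the most extreme data point lying within 2× IQR of a hinge, and percentiles use the inclusive interpolation of `statistics.quantiles`.

```python
from statistics import quantiles


def box_stats(values, whisker_coef=2.0):
    """Return hinges, median and whisker ends for one boxplot.

    Assumption: a whisker stops at the most extreme observation that lies
    within `whisker_coef` x IQR of the nearest hinge.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")

    # Quartiles with inclusive linear interpolation between data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_coef * iqr
    high_fence = q3 + whisker_coef * iqr

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(v for v in data if v >= low_fence),
        "whisker_high": max(v for v in data if v <= high_fence),
    }


if __name__ == "__main__":
    print(box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With this reading, an isolated extreme value (100 in the example) falls outside the upper whisker, so the reported whisker maximum can be smaller than the largest observation.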
Extended Data Fig. 3 Analysis of 5mC and 5hmC from the same molecules revealed multiple 5hmC types.
a, Heat maps showing the 5mCHG and 5hmCHG TF motif enrichments along with TF expression levels during the mESC 2i state to serum state transition. b, Dotplots showing promoter methylation levels for genes associated with cell proliferation during the mESC 2i to serum state transition. c, Cytosine modification entropies in single cells. Each dot represents a single cell and is colored and ordered according to its pseudotime score. Grey dots are background entropy levels estimated by shuffling the cell barcodes of called modification sites. d, Violin plots showing the fraction of 5mC and 5hmC reads with paired modality in single cells. For both violin plots, hinges were drawn from the 25th to 75th percentiles, with the middle line denoting the median, whiskers with maximum 2× IQR. For 5mC modality, minima = 1.94%, maxima = 9.12%; for 5hmC modality, minima = 6.93%, maxima = 20.75%; n = 300 randomly sampled cells. e, Barplot showing the distribution of 5hmCG sites by the fraction of cells with detected 5mCG-associated 5hmCG sites. False positive detection rates (FDR) were estimated as the average fraction of falsely detected (Type 2) sites from shuffled groups among all detected sites (FDR = 0.0467 was selected). f, UMAP embedding showing 5hmC levels of different types in single cells. Each dot represents a single cell and is colored according to the average 5hmC level. g, Top enriched GREAT GO terms for different types of 5hmCG sites. P-value, one-sided Fisher’s exact test. h, Histograms and heatmaps showing the relationship between two types of 5hmCG and mESC DNase-seq signals from ENCODE (ENCSR000CMW). i, Fraction of Type 1 and Type 2 5hmCGs detected in low-entropy (0–75%) or high-entropy (75–100%) cells. j, Barplot showing the enrichment of 5hmCG sites detected in low-entropy or high-entropy cells over different genomic regions.
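The shuffled-barcode background in panel c can be illustrated with a small Python sketch. The caption does not define how the entropy is computed, so everything specific here is an assumption made for illustration: each called site is represented as a hypothetical `(cell_barcode, bin_id)` record, the per-cell entropy is the Shannon entropy (in bits) of that cell's calls across bins, and the function names are invented for this example.

```python
import math
import random
from collections import Counter, defaultdict


def shannon_entropy(counts):
    """Shannon entropy in bits of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def per_cell_entropy(records):
    """records: iterable of (cell_barcode, bin_id) pairs, one per called site."""
    per_cell = defaultdict(Counter)
    for cell, bin_id in records:
        per_cell[cell][bin_id] += 1
    return {cell: shannon_entropy(c.values()) for cell, c in per_cell.items()}


def shuffled_background(records, seed=0):
    """Permute barcodes across sites, then recompute per-cell entropies.

    The number of sites per barcode is preserved; only the pairing of
    barcodes with genomic bins is randomized.
    """
    records = list(records)
    barcodes = [cell for cell, _ in records]
    random.Random(seed).shuffle(barcodes)
    bins = [bin_id for _, bin_id in records]
    return per_cell_entropy(zip(barcodes, bins))


if __name__ == "__main__":
    sites = [("A", 1)] * 4 + [("B", 1), ("B", 2), ("B", 3), ("B", 4)]
    print(per_cell_entropy(sites))
    print(shuffled_background(sites))
```

Comparing each cell's observed value with the values from the shuffled records gives a per-cell background of the kind drawn as grey dots, under the assumptions stated above.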
Extended Data Fig. 4 SIMPLE-seq generates cell-type-specific 5mC and 5hmC profiles from human PBMC.
a, Heatmap showing the gene expression and promoter methylation levels of marker genes for the five major cell types. b and c, UMAP embeddings showing the single-cell clustering based on (b) 5mCHG and (c) 5hmCHG levels (in 100-kb non-overlapping bins) from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. d, Silhouette plot showing the degree of separation of the PBMC clusters based on 5mC, 5hmC or joint 5mCG-5hmCG. e, UMAP embedding showing the single-cell clustering based on joint 5mCG-5hmCG levels (in 100-kb non-overlapping bins) from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. f and g, UMAP embeddings showing the single-cell clustering based on (f) 5mCG levels of H3K4me1 regions and (g) 5hmCG levels of H3K4me1 regions from PBMC. Each dot represents a single cell and is colored according to its annotation based on 5mCG. h, Violin plots showing the modification levels of 5mCG, 5hmCG, 5mCHG and 5hmCHG in different cell types. i, Heatmap showing the numbers of pairwise differentially methylated (5mCG) regions across cell types (in 5-kb non-overlapping bins). j, Heatmap showing the methylation levels of 5mCG DMRs of a representative group (CD4+ T cells and monocytes). k-m, Enrichment analysis of (k) known motifs, (l) top enriched de novo motifs (P-value, one-sided Fisher’s exact test) and (m) top enriched GREAT GO terms (P-value, one-sided Fisher’s exact test).
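The silhouette comparison in panel d can be illustrated with a small Python sketch of the standard silhouette coefficient, s = (b − a) / max(a, b), where a is a cell's mean distance to its own cluster and b is its smallest mean distance to any other cluster. The Euclidean metric, the toy one-dimensional data and the function name are assumptions; the legend does not state which distance or feature space was used.

```python
from math import dist
from statistics import mean


def silhouette_scores(points, labels):
    """Return the silhouette coefficient of each point (standard definition).

    Assumes Euclidean distance and at least two clusters; points in
    singleton clusters get a score of 0 by convention.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


# Toy data: two well-separated groups of "cells" on a single feature axis.
toy_points = [(0.0,), (1.0,), (10.0,), (11.0,)]
separated = silhouette_scores(toy_points, ["A", "A", "B", "B"])
mixed = silhouette_scores(toy_points, ["A", "B", "A", "B"])
print(mean(separated), mean(mixed))
```

A mean score close to 1 indicates compact, well-separated clusters, while values near or below 0 indicate overlapping or misassigned cells.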
Extended Data Fig. 5 Comparison of conserved and differential cytosine states among immune cells.
a, Violin plots showing the genomic coverages of the 10 cytosine states. b, Heatmap showing the enrichment of different cytosine state regions around TSS and TES sites. c-e, Scatter plots showing the fraction of genomic regions overlapping with peaks of different histone marks in conserved and differential (c) E5, (d) E7 and (e) E9 state regions. P-value, two-sided t-test. f, Top enriched GREAT GO terms for conserved and differential regions of E5, E7 and E9 in representative cell types. P-value, two-sided t-test.
Extended Data Fig. 6 SIMPLE-seq generates cell-type-specific 5mC and 5hmC profiles from mouse brain.
a, Violin plots showing the fraction of reads mapped to the mouse reference genome for 5mC and 5hmC in SIMPLE-seq (this study) and Joint-snhmC-seq (GSE236798) datasets. b, Violin plots showing the numbers of unique reads per cell for 5mC and 5hmC in SIMPLE-seq (this study) and Joint-snhmC-seq (GSE236798) datasets. For fair comparison, the Joint-snhmC-seq dataset was downsampled to the same per-cell depth as SIMPLE-seq. Data are presented as 57,563 ± 4,260 (SIMPLE-seq, 5mC), 35,283 ± 552 (Joint-snhmC-seq, 5mC), 39,374 ± 4,553 (SIMPLE-seq, 5hmC), 20,538 ± 565 (Joint-snhmC-seq, 5hmC). Cell number n = 4,767 (SIMPLE-seq, 5mC and 5hmC), n = 552 (Joint-snhmC-seq, 5mC and 5hmC). c and d, UMAP embeddings showing the single-cell clustering based on (c) 5mCG and (d) 5hmCG levels (in 100-kb non-overlapping bins) from mouse brain cells. Each dot represents a single cell and is colored according to its annotation based on joint 5mCG-5hmCG clustering. e, Silhouette plot showing the degree of separation of the mouse brain cell clusters based on 5mC, 5hmC or joint 5mCG-5hmCG. f, Dot plots showing the gene body 5hmCG levels of representative marker genes in the detected cell types. g, The distribution of H3K4me1 and H3K27me3 read densities around the 5mCG sites or 5hmCG sites from EXC1. h, Line plots showing the relationship of gene body 5mCG and 5hmCG levels to gene expression levels in EXC2 and ASC1 cell types. i, Heatmap showing the 5mCG modification levels of cell-type-specific 5mCG across the 11 cell types. j, Top enriched de novo motifs (left) and top enriched GO terms (right) for each cell type. P-value, one-sided Fisher’s exact test.
About this article
Cite this article
Bai, D., Zhang, X., Xiang, H. et al. Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq. Nat Biotechnol 43, 85–96 (2025). https://doi.org/10.1038/s41587-024-02148-9
DOI: https://doi.org/10.1038/s41587-024-02148-9