Abstract
The integration of DNA methylation and transcriptional state within single cells is of broad interest. Several single-cell dual- and multi-omics approaches have been reported that enable further investigation into cellular heterogeneity, including the discovery and in-depth study of rare cell populations. Such analyses will continue to provide important mechanistic insights into the regulatory consequences of epigenetic modifications. We recently reported a new method for profiling the DNA methylome and transcriptome from the same single cells in a cancer research study. Here, we present details of the protocol and provide guidance on its utility. Our Smart-RRBS (reduced representation bisulfite sequencing) protocol combines Smart-seq2 and RRBS and entails physically separating mRNA from the genomic DNA. It generates paired epigenetic promoter and RNA-expression measurements for ~24% of protein-coding genes in a typical single cell. It also works for micro-dissected tissue samples comprising hundreds of cells. The protocol, excluding flow sorting of cells and sequencing, takes ~3 d to process up to 192 samples manually. It requires basic molecular biology expertise and laboratory equipment, including a PCR workstation with UV sterilization, a DNA fluorometer and a microfluidic electrophoresis system.
This is a preview of subscription content, access via your institution
Access options
Access Nature and 54 other Nature Portfolio journals
Get Nature+, our best-value online-access subscription
$29.99 / 30 days
cancel any time
Subscribe to this journal
Receive 12 print issues and online access
$259.00 per year
only $21.58 per issue
Buy this article
- Purchase on SpringerLink
- Instant access to full article PDF
Prices may be subject to local taxes which are calculated during checkout
Similar content being viewed by others
Code availability
Our DNA-methylation analysis pipeline for large bisulfite sequencing data sets, including single-cell RRBS data (https://github.com/aryeelab/dna-methylation-tools), runs in the Google cloud and can be accessed for a fee through the Broad Institute’s Terra platform (https://app.terra.bio/#workspaces/aryee-lab/bisulfite-seq-tools-grch38).
References
Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 6, 377–382 (2009).
Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90–94 (2011).
Lu, S. et al. Probing meiotic recombination and aneuploidy of single sperm cells by whole-genome sequencing. Science 338, 1627–1630 (2012).
Zong, C., Lu, S., Chapman, A. R. & Xie, X. S. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338, 1622–1626 (2012).
Ding, J. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat. Biotechnol. 38, 737–746 (2020).
Haque, A., Engel, J., Teichmann, S. A. & Lönnberg, T. A practical guide to single-cell RNA-sequencing for biomedical research and clinical applications. Genome Med. 9, 75 (2017).
Ziegenhain, C. et al. Comparative analysis of single-cell RNA sequencing methods. Mol. Cell 65, 631–643.e4 (2017).
Charlton, J. et al. Global delay in nascent strand DNA methylation. Nat. Struct. Mol. Biol. 25, 327–332 (2018).
Gaiti, F. et al. Epigenetic evolution and lineage histories of chronic lymphocytic leukaemia. Nature 569, 576–580 (2019).
Guo, H. et al. Single-cell methylome landscapes of mouse embryonic stem cells and early embryos analyzed using reduced representation bisulfite sequencing. Genome Res. 23, 2126–2135 (2013).
Smallwood, S. A. et al. Single-cell genome-wide bisulfite sequencing for assessing epigenetic heterogeneity. Nat. Methods 11, 817–820 (2014).
Rotem, A. et al. Single-cell ChIP-seq reveals cell subpopulations defined by chromatin state. Nat. Biotechnol. 33, 1165–1172 (2015).
Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).
Cusanovich, D. A. et al. Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348, 910–914 (2015).
Jin, W. et al. Genome-wide detection of DNase I hypersensitive sites in single cells and FFPE tissue samples. Nature 528, 142–146 (2015).
Stevens, T. J. et al. 3D structures of individual mammalian genomes studied by single-cell Hi-C. Nature 544, 59–64 (2017).
Anaparthy, N., Ho, Y. J., Martelotto, L., Hammell, M. & Hicks, J. Single-cell applications of next-generation sequencing. Cold Spring Harb. Perspect. Med. 9, a026898 (2019).
Wang, Y. & Navin, N. E. Advances and applications of single-cell sequencing technologies. Mol. Cell 58, 598–609 (2015).
Navin, N. E. The first five years of single-cell cancer genomics and beyond. Genome Res. 25, 1499–1507 (2015).
Dey, S. S., Kester, L., Spanjaard, B., Bienko, M. & van Oudenaarden, A. Integrated genome and transcriptome sequencing of the same cell. Nat. Biotechnol. 33, 285–289 (2015).
Macaulay, I. C. et al. G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat. Methods 12, 519–522 (2015).
Angermueller, C. et al. Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat. Methods 13, 229–232 (2016).
Hu, Y. et al. Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol. 17, 88 (2016).
Lee, D. S. et al. Simultaneous profiling of 3D genome structure and DNA methylation in single human cells. Nat. Methods 16, 999–1006 (2019).
Li, G. et al. Joint profiling of DNA methylation and chromatin architecture in single cells. Nat. Methods 16, 991–993 (2019).
Hou, Y. et al. Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res. 26, 304–319 (2016).
Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in single cells. Nat. Methods 10, 1096–1098 (2013).
Smith, Z. D. et al. Epigenetic restriction of extraembryonic lineages mirrors the somatic transition to cancer. Nature 549, 543–547 (2017).
Meissner, A. et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature 454, 766–770 (2008).
Guo, H. et al. The DNA methylation landscape of human early embryos. Nature 511, 606–610 (2014).
Klughammer, J. et al. The DNA methylation landscape of glioblastoma disease progression shows extensive heterogeneity in time and space. Nat. Med. 24, 1611–1624 (2018).
Schrott, R. et al. Cannabis use is associated with potentially heritable widespread changes in autism candidate gene DLGAP2 DNA methylation in sperm. Epigenetics 15, 161–173 (2020).
Stryjewska, A. et al. Zeb2 regulates cell fate at the exit from epiblast state in mouse embryonic stem cells. Stem Cells 35, 611–625 (2017).
Szymczak, S. et al. DNA methylation QTL analysis identifies new regulators of human longevity. Hum. Mol. Genet. 29, 1154–1167 (2020).
Clark, S. J. et al. Genome-wide base-resolution mapping of DNA methylation in single cells using single-cell bisulfite sequencing (scBS-seq). Nat. Protoc. 12, 534–547 (2017).
Luo, C. et al. Robust single-cell DNA methylome profiling with snmC-seq2. Nat. Commun. 9, 3824 (2018).
Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 9, 171–181 (2014).
Deaton, A. M. & Bird, A. CpG islands and the regulation of transcription. Genes Dev. 25, 1010–1022 (2011).
Macaulay, I. C. et al. Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq. Nat. Protoc. 11, 2081–2103 (2016).
Pastore, A. et al. Corrupted coordination of epigenetic modifications leads to diverging chromatin states and transcriptional heterogeneity in CLL. Nat. Commun. 10, 1874 (2019).
Schubeler, D. Function and information content of DNA methylation. Nature 517, 321–326 (2015).
Neri, F. et al. Intragenic DNA methylation prevents spurious transcription initiation. Nature 543, 72–77 (2017).
Guo, H. et al. Profiling DNA methylome landscapes of mammalian cells with single-cell reduced-representation bisulfite sequencing. Nat. Protoc. 10, 645–659 (2015).
Boyle, P. et al. Gel-free multiplexed reduced representation bisulfite sequencing for large-scale DNA methylation profiling. Genome Biol. 13, R92 (2012).
Hagemann-Jensen, M. et al. Single-cell RNA counting at allele and isoform resolution using Smart-seq3. Nat. Biotechnol. 38, 708–714 (2020).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Cao, J. et al. Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357, 661–667 (2017).
Mulqueen, R. M. et al. Highly scalable generation of DNA methylation profiles in single cells. Nat. Biotechnol. 36, 428–431 (2018).
Hennig, B. P. et al. Large-scale low-cost NGS library preparation using a robust Tn5 purification and tagmentation protocol. G3 (Bethesda) 8, 79–89 (2018).
Picelli, S. et al. Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 24, 2033–2040 (2014).
Wang, J. et al. Double restriction-enzyme digestion improves the coverage and accuracy of genome-wide CpG methylation profiling by reduced representation bisulfite sequencing. BMC Genomics 14, 11 (2013).
Martinez-Arguelles, D. B., Lee, S. & Papadopoulos, V. In silico analysis identifies novel restriction enzyme combinations that expand reduced representation bisulfite sequencing CpG coverage. BMC Res. Notes 7, 534 (2014).
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
McCarthy, D. J., Campbell, K. R., Lun, A. T. & Wills, Q. F. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 33, 1179–1186 (2017).
Kangeyan, D. et al. A (fire)cloud-based DNA methylation data preprocessing and quality control platform. BMC Bioinforma. 20, 160 (2019).
Krueger, F. & Andrews, S. R. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics 27, 1571–1572 (2011).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Cavalcante, R. G. & Sartor, M. A. annotatr: genomic regions in context. Bioinformatics 33, 2381–2383 (2017).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Tsankov, A. M. et al. Loss of DNA methyltransferase activity in primed human ES cells triggers increased cell-cell variability and transcriptional repression. Development 146, dev174722 (2019).
Acknowledgements
This work was supported by a Broad Institute SPARC award (to A.G.), NIGMS grant no. P01GM099117 (to A.M. and A.G. and a pilot grant to M.J.A.), National Cancer Institute grant no. R01-CA229902 (to D.A.L.) and the Merkin Institute (to M.J.A.). A.M. is supported by the Max Planck Society. All human embryonic stem cell work was approved by Harvard University’s Embryonic Stem Cell Research Oversight (ESCRO #E00021).
Author information
Authors and Affiliations
Contributions
H.G. and A.G. conceived the method. H.G. developed the laboratory protocol. A.A. engineered and flow-sorted the human ES cells. Z.D.S. provided multi-cell mouse embryonic tissue samples and interpreted data. X.W. provided reagents. A.W.M. and R.C. generated libraries and sequencing data. A.T.R., F.G., D.A.L. and M.J.A. developed computational methods and analyzed sequencing data. D.A.L. and A.M. conceived and led the biological studies that provided the justification and scientific framework for developing and applying this protocol. A.M. and A.G. directed the project. H.G., A.T.R., A.M. and A.G. wrote the manuscript with input from all authors.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
Peer review information Nature Protocols thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Key references using this protocol
Gaiti, F. et al. Nature 569, 576–580 (2019): https://doi.org/10.1038/s41586-019-1198-z
Pastore, A. et al. Nat. Commun. 10, 1874 (2019): https://doi.org/10.1038/s41467-019-09645-5
Smith, Z. D. et al. Nature 549, 543–547 (2017): https://doi.org/10.1038/nature23891
Extended data
Extended Data Fig. 1 Schematic of steps involved in the workflow of the RRBS data analysis.
The figure is divided into three sections. In the operations section, each box contains a description of the data analysis step, and arrows indicate the progression through the analysis workflow. First, the user can easily run the FastQC analysis, which provides basic sequencing quality metrics, base composition and Illumina-adapter content. The rest of the workflow is developed in the Workflow Description Language format and is available on Terra, a cloud-native platform that runs in the Google cloud. Each row shows the input files that are required for each operation and the format of result reports or files that will be obtained. The last analysis steps (quality-filtering of cells, saturation plot and pseudo bulk analysis) have been described in the main text.
Extended Data Fig. 2 Process for manually dispensing a master mix to a sample plate.
This two-step technique is recommended at several steps of the protocol.
Extended Data Fig. 3 RRBS performance metrics and global CpG methylation across 80 passing cells.
Notched box plots to show the overall and inner-quartile range, median and 95% confidence interval of the number of purity filtered and fully demultiplexed RRBS reads before and after genome alignment (total and unique alignments) (a), the C-to-U bisulfite conversion rate of presumably unmethylated cytosines in non-CpG (CpH) context covered by RRBS reads (b) and the mean methylation level at all CpG sites covered in a given cell (c). d, Venn diagrams showing median and mean numbers of CpG sites covered per passing cell exclusively by reads from the large or small size fraction or covered by both. The approximate total numbers of passing read pairs in 80 passing cells from large and small size fractions were 83 million and 90 million, respectively. Aligned and uniquely aligned read numbers for each size fraction are available in the source data for this figure.
Extended Data Fig. 4 RNA-seq performance metrics across 80 passing cells.
Notched box plots to show the overall and inner-quartile range, median and 95% confidence interval of purity-filtered demultiplexed RNA-seq reads before and after alignment to the genome (total and unique alignments) by using STAR55 v.2.7.3a with the arguments --quantMode GeneCounts --sjdbGTFtagExonParentGene gene_name --outFilterScoreMinOverLread 0.1 --outFilterMatchNminOverLread 0.1 (a), the library size of unique reads aligning to exons (b) and the fraction of exonic reads that map to the mitochondrial genome (c).
Extended Data Fig. 5 Down-sampling and extrapolation of RRBS effort and CpG coverage.
a, The aggregate set of passing RRBS read pairs from 80 single cells was randomly down-sampled in steps from the full data (1 on the x-axis; 1.72 billion total read pairs; mean: 2.15 million per cell) to 0.01. The number of CpGs covered in each cell at each step was determined, and the corresponding saturation curves were normalized by the CpGs covered at the original full sequencing effort, interpolated and plotted as a function of the relative sequencing effort (fraction of reads). The blue dots represent the mean normalized fraction of CpGs at each step. b, The blue dots in the saturation curve are the same down-sampled data points as in a. The insert is a linear regression (R2 = 0.99999) of the actual mean normalized CpGs (blue dots) versus calculated mean normalized CpGs, assuming the saturation curve follows a Michaelis-Menten equation: (Fraction of CpGs)calc = (Fraction of reads/0.6004)/(Fraction of reads + 0.4023/0.6004), whereby 0.6004 and 0.4023 were the slope and y-intercept, respectively, of a Hanes-Woolf linearization of the Michaelis-Menten equation (R2 = 0.9999). The red dots are extrapolated mean normalized fractions of RRBS-covered CpGs. The saturation curve approaches a limit of ~1.67 on the y-axis, corresponding to a mean of 2.3 million CpGs per cell. Extrapolation to 5 on the x-axis results in 2.0 million CpGs per cell on average.
Extended Data Fig. 6 Fraction of all annotated CpG islands, promoters and enhancers covered at three or more CpGs.
Violin box plots show the distribution of the fraction of all annotated CGIs, promoters of protein-coding genes and permissive FANTOM5 enhancers in the human genome with RRBS coverage at three or more CpGs at the single-cell level. Bars indicate the fraction of these genomic features covered by RRBS reads aggregated from all 80 passing single cells.
Extended Data Fig. 7 Comparability of distinct CpG sites and CpG islands across single cells.
Shown on the left y-axes are the absolute numbers and fractions of their respective pseudo-bulk aggregate numbers of common CpGs (red) or CGIs (blue) that can be compared anywhere in the data set across the number of cells indicated on the x-axis. The absolute numbers were the CpGs or CGIs with n or more hits in a matrix of 80 cells versus all CpGs (10,532,278) or all CGIs (26,887) that were hit at least once in data aggregated from all 80 passing cells (i.e., the pseudo-bulk aggregate, which is by definition the value for comparability across n = 1 cell). The minimum coverage threshold for CGIs is one CpG. The y-axis on the right is the ratio of the fractions of comparable CGIs and CpGs.
Extended Data Fig. 8 Genes with the highest RNA expression levels in 80 passing cells.
Tick marks on the horizontal lines denote TPM calculated for each single cell. Genes are ordered top to bottom by the Mean TPM value (blue circles) across 80 single cells. The eleven most highly expressed genes are mitochondrial (MT) genes.
Extended Data Fig. 9 Expression levels of 18 genes associated with pluripotent cells.
Shown are the distribution (violin plots) and single cell values (dots) of expression levels (TPM) on a log2 scale.
Supplementary information
Supplementary Data 1
RRBS adapter oligonucleotides
Supplementary Data 2
UDI and single i7 index RRBS PCR primers
Source data
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 7
Statistical source data.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 9
Statistical source data.
Rights and permissions
About this article
Cite this article
Gu, H., Raman, A.T., Wang, X. et al. Smart-RRBS for single-cell methylome and transcriptome analysis. Nat Protoc 16, 4004–4030 (2021). https://doi.org/10.1038/s41596-021-00571-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s41596-021-00571-9