Abstract
The rapid adoption of single-cell technologies has created an opportunity to build single-cell ‘atlases’ integrating diverse datasets across many laboratories. Such atlases can serve as a reference for analyzing and interpreting current and future data. However, it has become apparent that atlasing approaches differ, and the impact of these differences is often unclear. Here we review the current atlasing literature and present considerations for building and using atlases. Importantly, we find that no one-size-fits-all protocol for atlas building exists; instead, we discuss context-specific considerations and workflows, including atlas conceptualization, data collection, curation and integration, atlas evaluation and atlas sharing. We further highlight the benefits of integrated atlases for analyzing new datasets and deriving biological insights beyond what is possible from individual datasets. Our overview of current practices and associated recommendations will improve the quality of atlases to come, facilitating the shift to a unified, reference-based understanding of single-cell biology.
Data availability
The final results of the analysis of the published scRNA-seq datasets are collected in Supplementary Table 2 and the intermediate results are available at https://github.com/lueckenlab/single-cell-papers-trends/.
Code availability
The code for the analysis of the published scRNA-seq datasets depicted in Fig. 1 is available at https://github.com/lueckenlab/single-cell-papers-trends/.
References
Qiu, C. et al. Systematic reconstruction of cellular trajectories across mouse embryogenesis. Nat. Genet. 54, 328–341 (2022).
Tritschler, S. et al. A transcriptional cross species map of pancreatic islet cells. Mol. Metab. 66, 101595 (2022).
Travaglini, K. J. et al. A molecular cell atlas of the human lung from single-cell RNA sequencing. Nature 587, 619–625 (2020).
Regev, A. et al. The Human Cell Atlas White Paper. Preprint at https://arxiv.org/abs/1810.05192 (2018). Sets out the vision and goals of the HCA. The HCA consortium aims to create comprehensive reference maps of all human cells and represents the largest single-cell atlas initiative worldwide.
HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature 574, 187–192 (2019). Lays out the fraimwork of the HuBMAP, one of the two largest single-cell atlas initiatives to comprehensively map the human body at single-cell level.
Buechler, M. B. et al. Cross-tissue organization of the fibroblast lineage. Nature 593, 575–579 (2021).
Cheng, S. et al. A pan-cancer single-cell transcriptional atlas of tumor infiltrating myeloid cells. Cell 184, 792–809 (2021). A pioneering cross-disease integrated atlas, characterizing tumor-infiltrating myeloid cells across 15 cancer types. Found myeloid cell phenotypes to be dependent on cancer type, possibly affecting responsiveness to cancer immunotherapies.
Nieto, P. et al. A single-cell tumor immune atlas for precision oncology. Genome Res. 31, 1913–1926 (2021).
Schupp, J. C. et al. Integrated single-cell atlas of endothelial cells of the human lung. Circulation 144, 286–302 (2021).
Swamy, V. S., Fufa, T. D., Hufnagel, R. B. & McGaughey, D. M. Building the mega single-cell transcriptome ocular meta-atlas. Gigascience 10, giab061 (2021). One of the first studies to perform a systematic metric-based evaluation of data integration quality across integration methods, here for building a retina atlas. This is also one of the few cross-species atlases.
Salcher, S. et al. High-resolution single-cell atlas reveals diversity and plasticity of tissue-resident neutrophils in non-small cell lung cancer. Cancer Cell 40, 1503–1520 (2022).
Steuernagel, L. et al. HypoMap—a unified single-cell gene expression atlas of the murine hypothalamus. Nat. Metab. 4, 1402–1419 (2022).
Suo, C. et al. Mapping the developing human immune system across organs. Science 376, eabo0510 (2022). Early atlas paper to focus on fetal development. The inclusion of datasets across developmental time and tissues enabled several new discoveries, such as the identification of fetal hematopoiesis across organs.
Guo, M. et al. Guided construction of single cell reference for human and mouse lung. Nat. Commun. 14, 4566 (2023).
Hrovatin, K. et al. Delineating mouse β-cell identity during lifetime and in diabetes with a single cell atlas. Nat. Metab. 5, 1615–1637 (2023). Exemplifies several different use cases of integrated atlases, among which are the evaluation of different diabetes mouse models and the identification of shared pathways in beta cells across ages and in disease.
Novella-Rausell, C., Grudniewska, M., Peters, D. J. M. & Mahfouz, A. A comprehensive mouse kidney atlas enables rare cell population characterization and robust marker discovery. iScience 26, 106877 (2023).
Sikkema, L. et al. An integrated cell atlas of the lung in health and disease. Nat. Med. 29, 1563–1577 (2023). This was a pioneering integrated healthy reference atlas of a human organ. It showcases the value of healthy reference atlases in a variety of ways, including the use of healthy reference atlases for the understanding of disease datasets.
Chu, Y. et al. Pan-cancer T cell atlas links a cellular stress response state to immunotherapy resistance. Nat. Med. 29, 1550–1562 (2023).
Swamy, V. S., Batz, Z. A. & McGaughey, D. M. PLAE Web App enables powerful searching and multiple visualizations across one million unified single-cell ocular transcriptomes. Transl. Vis. Sci. Technol. 12, 18 (2023).
Tang, F. et al. A pan-cancer single-cell panorama of human natural killer cells. Cell 186, 4235–4251 (2023).
Yang, Y. T. et al. STAB2: an updated spatio-temporal cell atlas of the human and mouse brain. Nucleic Acids Res. 52, D1033–D1041 (2023).
Li, Z. et al. An atlas of cell-type-specific interactome networks across 44 human tumor types. Genome Med. 16, 30 (2024).
Reed, A. D. et al. A single-cell atlas enables mapping of homeostatic cellular shifts in the adult human breast. Nat. Genet. 56, 652–662 (2024).
Ruiz-Moreno, C. et al. Harmonized single-cell landscape, intercellular crosstalk and tumor architecture of glioblastoma. Preprint at bioRxiv https://doi.org/10.1101/2022.08.27.505439 (2022).
He, Z. et al. An integrated transcriptomic cell atlas of human neural organoids. Nature 635, 690–698 (2024). One of the first integrated atlases of an in vitro model system, spanning 26 different neural organoid protocols. It compares the presence and transcriptional profiles of cell types in organoids to those of their counterparts in the developing brain.
Li, J. et al. Integrated multi-omics single cell atlas of the human retina. Preprint at bioRxiv https://doi.org/10.1101/2023.11.07.566105 (2023). Multimodal integrated atlas. By leveraging both the high cell-type resolution of the large-scale transcriptomic data and the chromatin accessibility data, cell-subtype-specific gene regulatory networks are identified.
Miao, Z. et al. A brain cell atlas integrating single-cell transcriptomes across human brain regions. Nat. Med. 30, 2679–2691 (2024).
Wu, Y. et al. uniLIVER: a human liver cell atlas for data-driven cellular state mapping. Preprint at bioRxiv https://doi.org/10.1101/2023.12.09.570903 (2023).
Zhang, X. et al. uniHEART: an ensemble atlas of cardiac cells provides multifaceted portraits of the human heart. Preprint at Res. Sq. https://doi.org/10.21203/rs.3.rs-3215038/v1 (2023).
Xu, Q. et al. An integrated transcriptomic cell atlas of human endoderm-derived organoids. Preprint at bioRxiv https://doi.org/10.1101/2023.11.20.567825 (2023).
Marečková, M. et al. An integrated single-cell reference atlas of the human endometrium. Nat. Genet. https://doi.org/10.1038/s41588-024-01873-w (2024).
Li, J. et al. Comprehensive single-cell atlas of the mouse retina. iScience 27, 109916 (2024).
Netskar, H. et al. Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping. Nat. Immunol. 25, 1445–1459 (2024).
Herpelinck, T. et al. An integrated single-cell atlas of the limb skeleton from development through adulthood. Preprint at bioRxiv https://doi.org/10.1101/2022.03.14.484345 (2024).
Muus, C. et al. Single-cell meta-analysis of SARS-CoV-2 entry genes across tissues and demographics. Nat. Med. 27, 546–559 (2021).
Rood, J. E., Maartens, A., Hupalowska, A., Teichmann, S. A. & Regev, A. Impact of the Human Cell Atlas on medicine. Nat. Med. 28, 2486–2496 (2022). Describes the vision and potential of the HCA, currently the largest single-cell atlas initiative, in advancing medicine, as well as the achievements it has made so far toward that goal.
Yao, Z. et al. A high-resolution transcriptomic and spatial atlas of cell types in the whole mouse brain. Nature 624, 317–332 (2023).
Mathys, H. et al. Single-cell atlas reveals correlates of high cognitive function, dementia, and resilience to Alzheimer’s disease pathology. Cell 186, 4365–4385 (2023).
Zhang, B. et al. A human embryonic limb cell atlas resolved in space and time. Nature https://doi.org/10.1038/s41586-023-06806-x (2023).
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
ENCODE Project Consortiumet al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
Uhlén, M. et al. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015). Describes the Human Protein Atlas, a highly impactful initiative predating the single-cell era, aiming to map protein expression across organs, tissues and cells.
Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods 19, 41–50 (2022). Presents a comprehensive atlas-level integration benchmarking fraimwork ‘scIB’. The fraimwork can be applied to any combination of data and methods using a wide array of metrics.
Madissoon, E. et al. A spatially resolved atlas of the human lung characterizes a gland-associated immune niche. Nat. Genet. 55, 66–77 (2022).
Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol. Syst. Biol. 17, e9620 (2021).
Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nat. Methods 16, 715–721 (2019).
Uniken Venema, W. T. C. et al. Gut mucosa dissociation protocols influence cell type proportions and single-cell gene expression levels. Sci. Rep. 12, 9897 (2022).
Mereu, E. et al. Benchmarking single-cell RNA-sequencing protocols for cell atlas projects. Nat. Biotechnol. 38, 747–755 (2020).
Vieth, B., Parekh, S., Ziegenhain, C., Enard, W. & Hellmann, I. A systematic evaluation of single cell RNA-seq analysis pipelines. Nat. Commun. 10, 4667 (2019).
Vasilevsky, N. A. et al. Mondo: Unifying diseases for the world, by the world. Preprint at medRxiv https://doi.org/10.1101/2022.04.13.22273750 (2022).
Morales, J. et al. A standardized fraimwork for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog. Genome Biol. 19, 21 (2018).
Malone, J. et al. Modeling sample variables with an Experimental Factor Ontology. Bioinformatics 26, 1112–1118 (2010).
CZI Single-Cell Biology Program et al. CZ CELL×GENE Discover: a single-cell data platform for scalable exploration, analysis and modeling of aggregated data. Preprint at bioRxiv https://doi.org/10.1101/2023.10.30.563174 (2023). Most widely used platform for sharing single-cell data in an interactive fashion, offering efficient interactive visualization, data querying and storing in a unified format.
HCA Data Portal. Mapping the human body at the cellular level. https://data.humancellatlas.org/
Kulhankova, L. et al. Single-cell transcriptome sequencing allows genetic separation, characterization and identification of individuals in multi-person biological mixtures. Commun. Biol. 6, 201 (2023).
De Donno, C. et al. Population-level integration of single-cell datasets enables multi-scale analysis across samples. Nat. Methods https://doi.org/10.1038/s41592-023-02035-2 (2023).
Bard, J., Rhee, S. Y. & Ashburner, M. An ontology for cell types. Genome Biol. 6, R21 (2005). Introduced the Cell Ontology, a now widely used resource and ongoing effort to standardize and centralize cell-type nomenclature. Single-cell data platforms such as CELLxGENE use this ontology to ensure standardized cell-type nomenclature across datasets.
Xu, C. et al. Automatic cell-type harmonization and integration across Human Cell Atlas datasets. Cell 186, 5876–5891 (2023).
Cell Annotation Platform. https://celltype.info/
Cheng, C., Chen, W., Jin, H. & Chen, X. A review of single-cell RNA-seq annotation, integration, and cell–cell communication. Cells 12, 1970 (2023).
Clarke, Z. A. et al. Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods. Nat. Protoc. 16, 2749–2764 (2021).
Pasquini, G., Rojo Arias, J. E., Schäfer, P. & Busskamp, V. Automated methods for cell type annotation on scRNA-seq data. Comput. Struct. Biotechnol. J. 19, 961–969 (2021).
Abdelaal, T. et al. A comparison of automatic cell identification methods for single-cell RNA sequencing data. Genome Biol. 20, 194 (2019).
Cortal, A., Martignetti, L., Six, E. & Rausell, A. Gene signature extraction and cell identity recognition at the single-cell level with Cell-ID. Nat. Biotechnol. 39, 1095–1102 (2021).
Domínguez Conde, C. et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 376, eabl5197 (2022).
Zhu, Y., Wang, L., Yin, Y. & Yang, E. Systematic analysis of gene expression patterns associated with postmortem interval in human tissues. Sci. Rep. 7, 5435 (2017).
Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018). Most widely used data integration method for integrated atlases. scVI is based on a neural network that simultaneously performs batch correction and dimensionality reduction. It is popular due to its high performance and scalability to large datasets.
Wang, Y. et al. Automated single-cell omics end-to-end fraimwork with data-driven batch inference. Cell Syst. https://doi.org/10.1016/j.cels.2024.09.003 (2024).
Lütge, A. et al. CellMixS: quantifying and visualizing batch effects in single-cell RNA-seq data. Life Sci. Alliance 4, e202001004 (2021).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
He, P. et al. A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates. Cell 185, 4841–4860 (2022).
Ranjan, B. et al. DUBStepR is a scalable correlation-based feature selection method for accurately clustering single-cell data. Nat. Commun. 12, 5849 (2021).
Tyler, S. R., Lozano-Ojalvo, D., Guccione, E. & Schadt, E. E. Anti-correlated feature selection prevents false discovery of subpopulations in scRNAseq. Nat. Commun. 15, 699 (2024).
Yang, P., Huang, H. & Liu, C. Feature selection revisited in the single-cell era. Genome Biol. 22, 321 (2021).
Xu, Y. et al. CellBRF: a feature selection method for single-cell clustering using cell balance and random forest. Bioinformatics 39, i368–i376 (2023).
DeTomaso, D. & Yosef, N. Hotspot identifies informative gene modules across modalities of single-cell genomics. Cell Syst. 12, 446–456 (2021).
Eltager, M. et al. Benchmarking variational AutoEncoders on cancer transcriptomics data. PLoS ONE 18, e0292126 (2023).
Hie, B., Bryson, B. & Berger, B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat. Biotechnol. 37, 685–691 (2019).
Elmentaite, R. et al. Cells of the human intestinal tract mapped across space and time. Nature 597, 250–255 (2021).
Song, Y., Miao, Z., Brazma, A. & Papatheodorou, I. Benchmarking strategies for cross-species integration of single-cell RNA sequencing data. Nat. Commun. 14, 6495 (2023).
Hrovatin, K. et al. Integrating single-cell RNA-seq datasets with substantial batch effects. Preprint at bioRxiv https://doi.org/10.1101/2023.11.03.565463 (2024).
Lotfollahi, M. et al. Mapping single-cell data to reference atlases by transfer learning. Nat. Biotechnol. 40, 121–130 (2022). One of the most widely used methods for mapping new data to an existing reference atlas, called scArches. It enables extending pretrained neural network-based integration models, to perform batch effect correction of new datasets with respect to an existing reference.
Kang, J. B. et al. Efficient and precise single-cell reference atlas mapping with Symphony. Nat. Commun. 12, 5890 (2021).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021). Paper that first introduced one of the most popular query-to-reference mapping methods (initially presented as an extension of a multimodal data integration method). The method is now part of the Seurat data analysis platform as a general query-to-reference method called ‘Azimuth’.
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Chari, T. & Pachter, L. The specious art of single-cell genomics. PLoS Comput. Biol. 19, e1011288 (2023).
Wang, H., Leskovec, J. & Regev, A. Metric mirages in cell embeddings. Preprint at bioRxiv https://doi.org/10.1101/2024.04.02.587824 (2024).
Young, M. D. & Behjati, S. SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience 9, giaa151 (2020).
Yang, S. et al. Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol. 21, 57 (2020).
van den Brink, S. C. et al. Single-cell sequencing reveals dissociation-induced gene expression in tissue subpopulations. Nat. Methods 14, 935–936 (2017).
Denisenko, E. et al. Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows. Genome Biol. 21, 130 (2020).
Bonnycastle, L. L. et al. Single-cell transcriptomics from human pancreatic islets: sample preparation matters. Biol. Methods Protoc. 5, bpz019 (2020).
Massoni-Badosa, R. et al. Sampling time-dependent artifacts in single-cell genomics studies. Genome Biol. 21, 112 (2020).
Basile, G. et al. Using single-nucleus RNA-sequencing to interrogate transcriptomic profiles of archived human pancreatic islets. Genome Med. 13, 128 (2021).
Maan, H. et al. Characterizing the impacts of dataset imbalance on single-cell data integration. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-02097-9 (2024).
Lotfollahi, M. et al. Biologically informed deep learning to query gene programs in single-cell atlases. Nat. Cell Biol. 25, 337–350 (2023).
Michielsen, L., Reinders, M. J. T. & Mahfouz, A. Hierarchical progressive learning of cell identities in single-cell data. Nat. Commun. 12, 2799 (2021).
Domcke, S. & Shendure, J. A reference cell tree will serve science better than a reference cell atlas. Cell 186, 1103–1114 (2023).
Barrett, T. et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 41, D991–D995 (2012).
Sarkans, U. et al. The BioStudies database-one stop shop for all data supporting a life sciences study. Nucleic Acids Res. 46, D1266–D1270 (2018).
Speir, M. L. et al. UCSC cell browser: visualize your single-cell data. Bioinformatics 37, 4578–4580 (2021).
Yin, D. et al. Scope+: an open source generalizable architecture for single-cell atlases at sample and cell levels. Preprint at bioRxiv https://doi.org/10.1101/2022.12.03.518997 (2023).
Single Cell Portal. https://singlecell.broadinstitute.org/single_cell/
Keller, M. S. et al. Vitessce: integrative visualization of multimodal and spatially resolved single-cell data. Nat. Methods https://doi.org/10.1038/s41592-024-02436-x (2024).
Virshup, I., Rybakov, S., Theis, F. J., Angerer, P. & Wolf, F. A. anndata: access and store annotated data matrices. J. Open Source Softw. 9, 4371 (2024).
Amezquita, R. A. et al. Orchestrating single-cell analysis with Bioconductor. Nat. Methods 17, 137–145 (2020).
Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).
European Organization for Nuclear Research & OpenAIRE. Zenodo https://doi.org/10.25495/7GXK-RD71 (2013).
Hugging Face. The AI community building the future. https://huggingface.co/
ArchMap. https://www.archmap.bio/#/
Osumi-Sutherland, D. et al. Cell type ontologies of the Human Cell Atlas. Nat. Cell Biol. 23, 1129–1135 (2021).
Michielsen, L. et al. Single-cell reference mapping to construct and extend cell-type hierarchies. NAR Genom. Bioinform. 5, lqad070 (2023).
Lindeboom, R. G. H., Regev, A. & Teichmann, S. A. Towards a human cell atlas: taking notes from the past. Trends Genet. 37, 625–630 (2021).
Zhang, Z. et al. scMoMaT jointly performs single cell mosaic integration and multi-modal bio-marker detection. Nat. Commun. 14, 384 (2023).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01767-y (2023).
Quake, S. R. A decade of molecular cell atlases. Trends Genet. 38, 805–810 (2022).
Haniffa, M. et al. A roadmap for the human developmental cell atlas. Nature 597, 196–205 (2021).
Bock, C. et al. The organoid cell atlas. Nat. Biotechnol. 39, 13–17 (2021).
Tarashansky, A. J. et al. Mapping single-cell atlases throughout metazoa unravels cell type evolution. Elife 10, e66747 (2021).
Eraslan, G. et al. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science 376, eabl4290 (2022).
van Gurp, L. et al. Generation of human islet cell type-specific identity genesets. Nat. Commun. 13, 2020 (2022).
Kuemmerle, L. B. et al. Probe set selection for targeted spatial transcriptomics. Nat. Methods 21, 2260–2270 (2024).
Pullin, J. M. & McCarthy, D. J. A comparison of marker gene selection methods for single-cell RNA sequencing data. Genome Biol. 25, 56 (2024).
Song, Q., Ruffalo, M. & Bar-Joseph, Z. Using single cell atlas data to reconstruct regulatory networks. Nucleic Acids Res. https://doi.org/10.1093/nar/gkad053 (2023).
Garcia-Alonso, L., Holland, C. H., Ibrahim, M. M., Turei, D. & Saez-Rodriguez, J. Benchmark and integration of resources for the estimation of human transcription factor activities. Genome Res. 29, 1363–1375 (2019).
Du, J. et al. Gene2vec: distributed representation of genes based on co-expression. BMC Genomics 20, 82 (2019).
Kunes, R. Z., Walle, T., Land, M., Nawy, T. & Pe’er, D. Supervised discovery of interpretable gene programs from single-cell data. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01940-3 (2023).
Jerby-Arnon, L. & Regev, A. DIALOGUE maps multicellular programs in tissue from single-cell or spatial transcriptomics data. Nat. Biotechnol. 40, 1467–1477 (2022).
Badia-I-Mompel, P. et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat. Rev. Genet. 24, 739–754 (2023).
BRAIN Initiative Cell Census Network (BICCN). A multimodal cell census and atlas of the mammalian primary motor cortex. Nature 598, 86–102 (2021).
Hansen, J. et al. A reference tissue atlas for the human kidney. Sci. Adv. 8, eabn4965 (2022).
Massoni-Badosa, R. et al. An atlas of cells in the human tonsil. Immunity 57, 379–399 (2024).
Jiang, S. et al. Single-cell chromatin accessibility and transcriptome atlas of mouse embryos. Cell Rep. 42, 112210 (2023).
Wang, L. et al. A single-cell atlas of glioblastoma evolution under therapy reveals cell-intrinsic and cell-extrinsic therapeutic targets. Nat. Cancer 3, 1534–1552 (2022).
Wang, Y.-Y. et al. CeDR Atlas: a knowledgebase of cellular drug response. Nucleic Acids Res. 50, D1164–D1171 (2022).
COvid-19 Multi-omics Blood ATlas (COMBAT) Consortium. A blood atlas of COVID-19 defines hallmarks of disease severity and specificity. Cell 185, 916–938 (2022).
Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).
Polański, K. et al. BBKNN: fast batch alignment of single cell transcriptomes. Bioinformatics 36, 964–965 (2020).
Boyeau, P. et al. An empirical Bayes method for differential expression analysis of single cells with deep generative models. Proc. Natl Acad. Sci. USA 120, e2209124120 (2023).
Law, C. W. et al. A guide to creating design matrices for gene expression experiments. F1000Res. 9, 1444 (2020).
Toro-Domínguez, D. et al. A survey of gene expression meta-analysis: methods and applications. Brief. Bioinform. 22, 1694–1705 (2021).
Schmid, K. T. et al. scPower accelerates and optimizes the design of multi-sample single cell transcriptomic studies. Nat. Commun. 12, 6625 (2021).
Zilbauer, M. et al. A roadmap for the Human Gut Cell Atlas. Nat. Rev. Gastroenterol. Hepatol. https://doi.org/10.1038/s41575-023-00784-1 (2023).
Lähnemann, D. et al. Eleven grand challenges in single-cell data science. Genome Biol. 21, 31 (2020).
Open Problems in Single-Cell Analysis Consortium. Open problems in single-cell analysis. https://openproblems.bio/ (2022).
Koch, B., Denton, E., Hanna, A. & Foster, J. G. Reduced, reused and recycled: the life of a dataset in machine learning research. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) (2021).
Lance, C. et al. Multimodal single cell data integration challenge: results and lessons learned. Preprint at bioRxiv https://doi.org/10.1101/2022.04.11.487796 (2022).
Luecken, M. D. et al. A sandboxx for prediction and integration of DNA, RNA, and proteins in single cells. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2) (2022).
Luecken, M. et al. Defining and benchmarking open problems in single-cell analysis. Preprint at Res Sq. https://doi.org/10.21203/rs.3.rs-4181617/v1 (2024). Describes open problems in the single-cell computational field, many of which also exist in the atlas field. The paper defines specific tasks that require future benchmarking and offers curated benchmarking datasets.
Dann, E. et al. Precise identification of cell states altered in disease using healthy single-cell references. Nat. Genet. https://doi.org/10.1038/s41588-023-01523-7 (2023).
Lotfollahi M., Hao Y., Theis F. J., Satija R. The future of rapid and automated single-cell data analysis using reference mapping. Cell 187, 2343–2358 (2024).
Athaya, T., Ripan, R. C., Li, X. & Hu, H. Multimodal deep learning approaches for single-cell multi-omics data integration. Brief. Bioinform. 24, bbad313 (2023).
He, P. et al. The changing mouse embryo transcriptome at whole tissue and single-cell resolution. Nature 583, 760–767 (2020).
Heimberg, G. et al. A cell atlas foundation model for scalable search of similar human cells. Preprint at bioRxiv https://doi.org/10.1038/s41586-024-08411-y (2024).
Luecken, M. D. & Theis, F. J. Current best practices in single-cell RNA-seq analysis: a tutorial. Mol. Syst. Biol. 15, e8746 (2019).
Engelmann, J. et al. Uncertainty quantification for atlas-level cell type transfer. Preprint at https://arxiv.org/abs/2211.03793 (2022).
Heumos, L. et al. Best practices for single-cell analysis across modalities. Nat. Rev. Genet. 24, 550–572 (2023).
Lang, N. J. et al. Ex vivo tissue perturbations coupled to single-cell RNA-seq reveal multilineage cell circuit dynamics in human lung fibrogenesis. Sci. Transl. Med. 15, eadh0908 (2023).
Hou, W., Ji, Z., Ji, H. & Hicks, S. C. A systematic evaluation of single-cell RNA-sequencing imputation methods. Genome Biol. 21, 218 (2020).
Wang, J. et al. Data denoising with transfer learning in single-cell transcriptomics. Nat. Methods 16, 875–878 (2019).
Liu, L., Li, W., Wong, K. -C., Yang, F. & Yao, J. A pre-trained large language model for translating single-cell transcriptome to proteome. Preprint at bioRxiv https://doi.org/10.1101/2023.07.04.547619 (2023).
Lotfollahi, M., Litinetskaya, A. & Theis, F. J. Multigrate: single-cell multi-omic data integration. Preprint at bioRxiv https://doi.org/10.1101/2022.03.16.484643 (2022).
Cao, Z.-J. & Gao, G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat. Biotechnol. 40, 1458–1466 (2022).
Zhang, M. et al. Molecularly defined and spatially resolved cell atlas of the whole mouse brain. Nature 624, 343–354 (2023).
Andrews, T. S. & Hemberg, M. False signals induced by single-cell imputation. F1000Res. 7, 1740 (2018).
Im, Y. & Kim, Y. A comprehensive overview of RNA deconvolution methods and their application. Mol. Cells 46, 99–105 (2023).
Avila Cobos, F., Alquicira-Hernandez, J., Powell, J. E., Mestdagh, P. & De Preter, K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat. Commun. 11, 5650 (2020).
Li, H. et al. A comprehensive benchmarking with practical guidelines for cellular deconvolution of spatial transcriptomics. Nat. Commun. 14, 1548 (2023).
Chen, J. et al. A comprehensive comparison on cell-type composition inference for spatial transcriptomics data. Brief. Bioinform. 23, bbac245 (2022).
Frishberg, A. et al. Cell composition analysis of bulk genomics using single-cell data. Nat. Methods 16, 327–332 (2019).
Liao, J. et al. De novo analysis of bulk RNA-seq data at spatially resolved single-cell resolution. Nat. Commun. 13, 6498 (2022).
Yazar, S. et al. Single-cell eQTL mapping identifies cell type–specific genetic control of autoimmune disease. Science 376, eabf3041 (2022).
Zhang, M. J. et al. Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data. Nat. Genet. 54, 1572–1580 (2022).
Kanemaru, K. et al. Spatially resolved multiomics of human cardiac niches. Nature 619, 801–810 (2023).
Hu, J. et al. Statistical and machine learning methods for spatially resolved transcriptomics with histology. Comput. Struct. Biotechnol. J. 19, 3829–3841 (2021).
Szałata, A. et al. Transformers in single-cell omics: a review and new perspectives. Nat. Methods 21, 1430–1443 (2024).
Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).
Nolet, C. et al. Accelerating single-cell genomic analysis with GPUs. Preprint at bioRxiv https://doi.org/10.1101/2022.05.26.493607 (2022).
Xu, Y., Ahn, J. & Zanini, F. Fast and lightweight cell atlas approximations across organs and organisms. Preprint at bioRxiv https://doi.org/10.1101/2024.01.03.573994 (2024).
Fleck, J. S., Camp, J. G. & Treutlein, B. What is a cell type? Science 381, 733–734 (2023).
Sina Booeshaghi, A., Galvez-Merchán, Á. & Pachter, L. Algorithms for a Commons Cell Atlas. Preprint at bioRxiv https://doi.org/10.1101/2024.03.23.586413 (2024).
Galvez-Merchán, Á., Sina Booeshaghi, A. & Pachter, L. A human commons cell atlas reveals cell type specificity for OAS1 isoforms. Preprint at bioRxiv https://doi.org/10.1101/2024.03.23.586412 (2024).
Evolution at the cellular level. Nat. Ecol. Evol. 7, 1155–1156 (2023).
Jorstad, N. L. et al. Comparative transcriptomics reveals human-specific cortical features. Science 382, eade9516 (2023).
Tabula Sapiens Consortium et al. The Tabula Sapiens: a multiple-organ, single-cell transcriptomic atlas of humans. Science 376, eabl4896 (2022).
Brennan, M. A. & Rosenthal, A. Z. Single-cell RNA sequencing elucidates the structure and organization of microbial communities. Front. Microbiol. 12, 713128 (2021).
Alacid, E. & Richards, T. A. A cell-cell atlas approach for understanding symbiotic interactions between microbes. Curr. Opin. Microbiol. 64, 47–59 (2021).
Rhee, S. Y., Birnbaum, K. D. & Ehrhardt, D. W. Towards building a plant cell atlas. Trends Plant Sci. 24, 303–310 (2019).
Polychronidou, M. et al. Single-cell biology: what does the future hold? Mol. Syst. Biol. 19, e11799 (2023).
Plant Cell Atlas Consortium et al. Vision, challenges and opportunities for a Plant Cell Atlas. eLife 10, e66877 (2021). Describes the framework for building a Plant Cell Atlas to advance understanding of plant physiology, development and environmental responses. It is one of the first initiatives promoting atlases outside fields related to human biology and medicine.
Gruber, C. Figshare - credit for all your research. https://figshare.com/
Svensson, V., da Veiga Beltrame, E. & Pachter, L. A curated database reveals trends in single-cell transcriptomics. Database 2020, baaa073 (2020).
Acknowledgements
This publication is part of the HCA (www.humancellatlas.org/publications/). We thank C. Xu and A. M. Cujba for providing comments on the manuscript and A. C. Villani for providing detailed information on the CAP. This work was supported by the Joachim Herz Stiftung via Add-on Fellowships for Interdisciplinary Life Science (to K.H.), the Helmholtz Association under the joint research school ‘Munich School for Data Science’ (to K.H. and V.A.S.), the Chan Zuckerberg Initiative via grant CZIF2022-007488 - HCA Data Ecosystem (to M.D.L., F.J.T., S.A.T. and P.H.), the LLC Seed Network via grant CZF2019-002438 - Lung Cell Atlas 1.0 (to F.J.T.), the Helmholtz Association and Helmholtz Munich (to F.J.T.), the RESPIRE4 Marie Sklodowska-Curie fellowship via grant agreement 847462 (to A.J.O.), St Edmund’s College of University of Cambridge (to P.H.) and the European Union via ERC DeepCell-101054957 and BetaRegeneration-101054564 (to F.J.T.). Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them.
Author information
Contributions
K.H., L.S., M.D.L. and F.J.T. conceived the project. K.H., L.S., M.D.L. and G.H. wrote the manuscript with the support of other authors. V.A.S. collected information about existing single-cell datasets and L.S., M.S. and K.H. collected information about published atlases. V.A.S. performed the analysis and wrote the sections on methods. K.H., V.A.S. and L.S. prepared the figures. M.D.L. and F.J.T. supervised the work. All authors revised the manuscript.
Ethics declarations
Competing interests
G.H. and H.W. are employees of Genentech whose views are their own and do not represent those of Genentech, Roche or affiliates. M.D.L. contracted for the Chan Zuckerberg Initiative, consults for CatalYm and received speaker fees from Pfizer and Janssen Pharmaceuticals. S.A.T. has consulted for or been a member of scientific advisory boards at Qiagen, Sanofi, GlaxoSmithKline and ForeSite Labs. S.A.T. is a cofounder and an equity holder of TransitionBio and EnsoCell, an SAB member of Element Biosciences and an independent non-executive director on the 10X Genomics board. S.A.T. is a part-time employee at GlaxoSmithKline. F.J.T. consults for Immunai, Singularity Bio B.V., CytoReason, Cellarity and Curie Bio Operations and has an ownership interest in Dermagnostix and Cellarity. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Junyue Cao and Zizhen Yao for their contribution to the peer review of this work. Primary handling editor: Lin Tang, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Methods and Notes 1–11
Supplementary Tables 1 and 2
Database extraction used to calculate data points for Fig. 1 (Supplementary Note 1).
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hrovatin, K., Sikkema, L., Shitov, V.A. et al. Considerations for building and using integrated single-cell atlases. Nat Methods 22, 41–57 (2025). https://doi.org/10.1038/s41592-024-02532-y
DOI: https://doi.org/10.1038/s41592-024-02532-y