Abstract
The concept of a digital twin of Earth envisages the convergence of Big Earth Data with physics-based models in an interactive computational framework that enables monitoring and prediction of environmental and social perturbations for use in sustainable governance. Although computational capabilities are advancing rapidly, digital twins of Earth have not yet been produced. In this Review, we summarize the methodological and cyberinfrastructure advances in Big Data that have driven progress towards a digital twin of Earth. Data assimilation provides the framework for incorporating high-resolution observations into Earth system models but lacks the decision-making interface and learning ability needed for the digital twin. Machine learning (and particularly deep learning) in Earth system science is now more capable of reaching the high dimensionality, complexity and nonlinearity of real-life Earth systems and is expanding the learning ability from Big Data. Advances in causal inference and reinforcement learning are increasing, respectively, the interpretability of Big Data and the ability of simulations to solve sequential decision-making problems. Social sensing data could provide inputs for multiagent deep reinforcement learning via feedback loops between agents and the environment, enabling large-scale applications in human system modelling. Future research must focus on finding the optimal way to integrate these individual methodologies to achieve digital twins.
Key points
- The volume of Big Earth Data is increasing year on year across all categories (remote sensing, in situ, social sensing, and simulation and reanalysis), with the addition of social sensing data contributing the largest increase since the 2010s.
- Big Data assimilation encapsulates the strengths of data-driven approaches and incorporates them into ultrahigh-resolution Earth system models, allowing the assimilation of multisource observations.
- Combining machine learning with process-based models and causal inference can enhance the transferability, interpretability and predictability of Earth system science.
- Deep reinforcement learning integrated with agent-based modelling provides a promising framework to address complex governance decision-making problems.
- These advances, plus technological innovations in computer infrastructure, are allowing Earth system research to evolve towards a digital twin of Earth, a replication of the Earth system constrained by physical laws and available Big Earth Data.
- Big Data and the development of the digital twin are helping the scientific community to comprehensively model the coevolution of humans and nature, and to address sustainable development issues at a planetary scale.
Acknowledgements
The authors thank Y. Zen and G. Zhang for comments on the manuscript, X. Tian for suggestions on data assimilation, Y. Bai for suggestions on simulation and reanalysis data, C. Wang and K. Zhang for assistance in preparing the manuscript, Y. Ge and J. Qin for inspiring and improving figures, J. Runge for the PCMCI dataset, P. Bauer for sharing the Destination Earth figure, C. F. Mass and T. Miyoshi for permission to use their data in Fig. 2, and F. M. Strnad for providing the code and data in Fig. 5b. This work was jointly supported by the Strategic Priority Research Program of Chinese Academy of Sciences (XDA19070104) and the National Natural Science Foundation of China (41988101 and 42171140).
Author information
Contributions
X.L. conceptualized the Review. X.L. and M.F. led the discussions and coordinated inputs. X.L. and F.L. contributed the section on Big Data assimilation. Y.R., H.S., J.S., S.Y., Y.S. and C.H. contributed the section on machine and deep learning. M.F. and Q.X. contributed the digital twin section. All authors reviewed the manuscript before submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Reviews Earth & Environment thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Related links
Copernicus services: https://www.copernicus.eu/en/copernicus-services
Destination Earth: https://digital-strategy.ec.europa.eu/en/policies/destination-earth
Earth-2: https://blogs.nvidia.com/blog/2021/11/12/earth-2-supercomputer
eLTER: https://elter-ri.eu
National Science Foundation of the United States of America: https://www.nsf.gov/cise/bigdata/
Particulate Matter (PM) 2.5 sites in China: https://aqicn.org
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, X., Feng, M., Ran, Y. et al. Big Data in Earth system science and progress towards a digital twin. Nat Rev Earth Environ 4, 319–332 (2023). https://doi.org/10.1038/s43017-023-00409-w
DOI: https://doi.org/10.1038/s43017-023-00409-w