Abstract
Estimating climate response to observed and projected increases in atmospheric greenhouse gases usually requires averaging among multiple independent simulations of computationally expensive global climate models to filter out the internal climate variability. Studies have shown that advanced pattern recognition methods allow one to obtain accurate estimates of the forced climate signal from just a handful of such climate realizations. The accuracy of these methods for a fixed ensemble size, however, decreases with an increasing magnitude of the low-frequency, decadal and longer internal climate variability. Here we generalize a previously developed Bayesian methodology of Linear Dynamical Mode (LDM) decomposition for spatially extended time series to enable joint identification and analysis of forced signal and internal variability in ensembles of climate simulations, a methodology dubbed here an ensemble LDM, or ELDM. The new ELDM method is shown to outperform its pattern-recognition competitors by more accurately isolating the forced signal in small ensembles of both toy- and state-of-the-art climate-model simulations. It is able to do so by explicitly recognizing a non-random structure of the internal variability, identified by the ELDM algorithm alongside the optimal forced-signal estimate, which allows one to study possible dynamical connections between the two types of variability. The optimal ELDM filtering provides a unique opportunity for objective intercomparison of decadal and longer climate variability across different global climate models—a task that proved difficult due to uncertainties associated with the noisy character and limited length of historical climate simulations combined with parameter uncertainties of alternative signal-detection methods.
Data availability
CESM Large Ensemble data used in this study is publicly available from Climate Data Gateway at the National Center for Atmospheric Research (https://www.earthsystemgrid.org/dataset/ucar.cgd.ccsm4.cesmLE.html). The 20th Century Reanalysis Project dataset (version 2c) is publicly available at the National Oceanic and Atmospheric Administration (NOAA)/Oceanic and Atmospheric Research (OAR)/Earth System Research Laboratories (ESRL) Physical Sciences Laboratory (PSL) website (https://psl.noaa.gov/).
Notes
Note that in a nonlinear dynamical system the statistics of the internal variability so defined may still undergo changes under variable external conditions.
This effect is conceptually very similar to the degeneracy of eigenrotation of a matrix in the presence of similar eigenvalues, for example, in the Principal Component Analysis.
References
Allen MR, Smith LA (1997) Optimal filtering in singular spectrum analysis. Phys Lett A 234:419–428. https://doi.org/10.1016/S0375-9601(97)00559-8
Barcikowska MJ, Knutson TR, Zhang R (2017) Observed and simulated fingerprints of multidecadal climate variability and their contributions to periods of global SST stagnation. J Clim 30:721–737. https://doi.org/10.1175/JCLI-D-16-0443.1
Compo GP, Whitaker JS, Sardeshmukh PD et al (2011) The twentieth century reanalysis project. Q J R Meteorol Soc. https://doi.org/10.1002/qj.776
DelSole T (2001) Optimally persistent patterns in time-varying fields. J Atmos Sci 58:1341–1356
DelSole T, Tippett MK, Shukla J (2011) A significant component of unforced multidecadal variability in the recent acceleration of global warming. J Clim 24:909–926. https://doi.org/10.1175/2010JCLI3659.1
Deser C, Phillips A (2017) An overview of decadal-scale sea surface temperature variability in the observational record. Past Global Changes Magazine 25(1):2–6. https://doi.org/10.22498/pages.25.1.2
Deser C, Terray L, Phillips AS (2016) Forced and internal components of winter air temperature trends over North America during the past 50 years: mechanisms and implications. J Clim 29:2237–2258. https://doi.org/10.1175/JCLI-D-15-0304.1
Dommenget D, Latif M (2008) Generation of hyper climate modes. Geophys Res Lett. https://doi.org/10.1029/2007GL031087
Eyring V, Bony S, Meehl GA et al (2016) Overview of the coupled model intercomparison project phase 6 (CMIP6) experimental design and organization. Geosci Model Develop 9(5):1937–1958
Farneti R, Molteni F, Kucharski F (2014) Pacific interdecadal variability driven by tropical-extratropical interactions. Clim Dyn 42:3337–3355. https://doi.org/10.1007/s00382-013-1906-6
Frankcombe LM, England MH, Kajtar JB et al (2018) On the choice of ensemble mean for estimating the forced signal in the presence of internal variability. J Clim 31(14):5681–5693
Frankignoul C, Gastineau G, Kwon YO (2017) Estimation of the SST response to anthropogenic and external forcing and its impact on the Atlantic multidecadal oscillation and the Pacific decadal oscillation. J Clim 30:9871–9895. https://doi.org/10.1175/JCLI-D-17-0009.1
Gavrilov A, Mukhin D, Loskutov E et al (2016) Method for reconstructing nonlinear modes with adaptive structure from multidimensional data. Chaos 26(12):123101. https://doi.org/10.1063/1.4968852
Gavrilov A, Seleznev A, Mukhin D et al (2019) Linear dynamical modes as new variables for data-driven ENSO forecast. Clim Dyn 52(3–4):2199–2216. https://doi.org/10.1007/s00382-018-4255-7
Gavrilov A, Kravtsov S, Mukhin D (2020) Analysis of 20th century surface air temperature using linear dynamical modes. Chaos. https://doi.org/10.1063/5.0028246
Gavrilov A, Loskutov E, Feigin A (2022) Data-driven stochastic model for cross-interacting processes with different time scales. Chaos. https://doi.org/10.1063/5.0077302
Hannachi A, Jolliffe IT, Stephenson DB (2007) Empirical orthogonal functions and related techniques in atmospheric science: a review. Int J Climatol 27(9):1119–1152. https://doi.org/10.1002/joc.1499
Henley BJ, Gergis J, Karoly DJ et al (2015) A tripole index for the interdecadal Pacific oscillation. Clim Dyn 45:3077–3090. https://doi.org/10.1007/s00382-015-2525-1
Jolliffe IT (1986) Principal component analysis. Springer series in statistics, 2nd edn. Springer, New York. https://doi.org/10.1007/978-1-4757-1904-8
Kay JE, Deser C, Phillips A et al (2015) The community earth system model (CESM) large ensemble project: a community resource for studying climate change in the presence of internal climate variability. Bull Am Meteor Soc 96(8):1333–1349. https://doi.org/10.1175/BAMS-D-13-00255.1
Kravtsov S (2012) An empirical model of decadal ENSO variability. Clim Dyn 39:2377–2391. https://doi.org/10.1007/s00382-012-1424-y
Kravtsov S (2017) Comment on comparison of low-frequency internal climate variability in CMIP5 models and observations. J Clim 30(23):9763–9772
Kravtsov S (2017) Pronounced differences between observed and CMIP5-simulated multidecadal climate variability in the twentieth century. Geophys Res Lett 44:5749–5757. https://doi.org/10.1002/2017GL074016
Kravtsov S, Callicutt D (2017) On semi-empirical decomposition of multidecadal climate variability into forced and internally generated components. Int J Climatol 37(12):4417–4433. https://doi.org/10.1002/joc.5096
Kravtsov S, Grimm C, Gu S (2018) Global-scale multidecadal variability missing in state-of-the-art climate models. npj Clim Atmosp Sci 1(1):34. https://doi.org/10.1038/s41612-018-0044-6
Kravtsov S, Gavrilov A, Buyanova M et al (2022) Forced signal and predictability in a prototype climate model: implications for fingerprinting based detection in the presence of multidecadal natural variability. Chaos. https://doi.org/10.1063/5.0106514
Maher N, Milinski S, Suarez-Gutierrez L et al (2019) The Max Planck Institute grand ensemble: enabling the exploration of climate system variability. J Adv Model Earth Syst 11:2050–2069. https://doi.org/10.1029/2019MS001639
Monahan AH (2000) Nonlinear principal component analysis by neural networks: theory and application to the Lorenz system. J Clim 13:821–835
Monahan AH, Fyfe JC, Ambaum MHP et al (2009) Empirical orthogonal functions: the medium is the message. J Clim 22:6501–6514
Mukhin D, Gavrilov A, Feigin A et al (2015) Principal nonlinear dynamical modes of climate variability. Sci Rep 5:15510
Mukhin D, Gavrilov A, Loskutov E et al (2018) Nonlinear reconstruction of global climate leading modes on decadal scales. Clim Dyn 51(5–6):2301–2310. https://doi.org/10.1007/s00382-017-4013-2
Mukhin D, Gavrilov A, Loskutov E et al (2019) Bayesian data analysis for revealing causes of the middle Pleistocene transition. Sci Rep 9(1):7328
Mukhin D, Kravtsov S, Seleznev A et al (2023) Estimating predictability of a dynamical system from multiple samples of its evolution. Chaos. https://doi.org/10.1063/5.0135506
Newman M, Alexander MA, Ault TR et al (2016) The Pacific decadal oscillation. J Clim 29(12):4399–4427. https://doi.org/10.1175/JCLI-D-15-0508.1
Scaife AA, Smith D (2018) A signal-to-noise paradox in climate science. npj Clim Atmosp Sci 1:28. https://doi.org/10.1038/s41612-018-0038-4
Schneider T, Griffies SM (1999) A conceptual framework for predictability studies. J Clim 12:3133–3155
Schneider T, Held IM (2001) Discriminants of twentieth-century changes in earth surface temperatures. J Clim 14:249–254
Sippel S, Meinshausen N, Merrifield A et al (2019) Uncovering the forced climate response from a single ensemble member using statistical learning. J Clim 32:5677–5699. https://doi.org/10.1175/JCLI-D-18-0882.1
Smoliak BV, Wallace JM, Lin P et al (2015) Dynamical adjustment of the northern hemisphere surface air temperature field: methodology and application to observations. J Clim 28:1613–1629. https://doi.org/10.1175/JCLI-D-14-00111.1
Srivastava A, DelSole T (2017) Decadal predictability without ocean dynamics. Proceed Nat Acad Sci 114(9):2177–2182
Srivastava N, Hinton G, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958. https://doi.org/10.5555/2627435.2670313
Taylor KE, Stouffer RJ, Meehl GA (2012) An overview of CMIP5 and the experiment design. Bull Am Meteor Soc 93(4):485–498. https://doi.org/10.1175/BAMS-D-11-00094.1
Ting M, Kushnir Y, Seager R et al (2009) Forced and internal twentieth-century SST trends in the North Atlantic. J Clim 22(6):1469–1481. https://doi.org/10.1175/2008JCLI2561.1
Tippett MK, L’Heureux ML (2020) Low-dimensional representations of Niño 3.4 evolution and the spring persistence barrier. npj Clim Atmosp Sci 3:1–11. https://doi.org/10.1038/s41612-020-0128-y
Wallace JM, Fu Q, Smoliak BV et al (2012) Simulated versus observed patterns of warming over the extratropical northern hemisphere continents during the cold season. Proceed Nat Acad Sci. https://doi.org/10.1073/pnas.1204875109
Wang C, Deser C, Yu JY et al (2017) El Niño and Southern Oscillation (ENSO): a review. https://doi.org/10.1007/978-94-017-7499-4_4
Wills RC, Schneider T, Wallace JM et al (2018) Disentangling global warming, multidecadal variability, and El Niño in Pacific temperatures. Geophys Res Lett 45(5):2487–2496. https://doi.org/10.1002/2017GL076327
Wills RCJ, Battisti DS, Armour KC et al (2020) Pattern recognition methods to separate forced responses from internal variability in climate model ensembles and observations. J Clim 33:8693–8719. https://doi.org/10.1175/JCLI-D-19-0855.1
Wyatt MG, Curry JA (2014) Role for Eurasian Arctic shelf sea ice in a secularly varying hemispheric climate signal during the 20th century. Clim Dyn 42:2763–2782. https://doi.org/10.1007/s00382-013-1950-2
Wyatt MG, Kravtsov S, Tsonis AA (2012) Atlantic multidecadal oscillation and northern hemisphere’s climate variability. Clim Dyn 38:929–949. https://doi.org/10.1007/s00382-011-1071-8
Zhang R, Sutton R, Danabasoglu G et al (2019) A review of the role of the Atlantic meridional overturning circulation in Atlantic multidecadal variability and associated climate impacts. Rev Geophys 57(2):316–375. https://doi.org/10.1029/2019RG000644
Acknowledgements
This research was supported by the state assignment of the Institute of Applied Physics of the Russian Academy of Sciences (Project No. FFUF-2022-0008) (ELDM model formulation and training scheme, synthetic example and CESM-LE analysis; Sects. 2, 3, 4 and Appendices C, D). Also, this research was supported by the project #075-02-2023-911 of Program for the Development of the Regional Scientific and Educational Mathematical Center “Mathematics of Future Technologies” (numerical hyperparameter optimization algorithm in Appendices A, B). The authors acknowledge the CESM Large Ensemble Community Project and supercomputing resources provided by NSF/CISL/Yellowstone (Kay et al. 2015). The 20th Century Reanalysis Project dataset (version 2c, Compo et al. 2011) used for comparative estimates in Sect. 3 is provided by the U.S. Department of Energy (DOE), Office of Science Biological and Environmental Research (BER) and by the National Oceanic and Atmospheric Administration Climate Program Office.
Author information
Contributions
AG: elaborated the idea of applying the LDM approach to ensemble climate data. SK, DM and AF: contributed to developing the ELDM method. AG and SK: constructed the synthetic example and designed the analysis framework. AG: implemented the ELDM algorithm and performed the ELDM analysis of the synthetic example, as well as all other supplementary computations. MB: computed ELDMs for CESM-LE sub-ensembles. All authors analysed the results. AG, SK and MB: wrote the first draft of the manuscript, which was then revised by all authors.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A: LDM decomposition: details of the cost functions
1.1 The original LDM method
The prior PDF for the LDM patterns \(\textbf{A},\textbf{c}\) is taken to be Gaussian, with the covariance matrix proportional to the (diagonal) sample covariance matrix of the data \(\textbf{X}\) (see Mukhin et al. 2015; Gavrilov et al. 2016):
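The displayed form of (A1) is not reproduced in this text. A form consistent with the description above can be sketched under three assumptions that are not confirmed here: that \(\textbf{A}\) has elements \(A_{ki}\) with \(k\) indexing the input PCs and \(i\) the modes, that \(\textbf{c}\) has elements \(c_k\), and that the scaling factors enter the prior variances as \(\sigma _A^2\lambda _k\) and \(\sigma _c^2\lambda _k\). Under these assumptions the prior would read

```latex
P_{pr}(\textbf{A},\textbf{c}\mid \sigma_A,\sigma_c) \;\propto\;
\prod_{k} \exp\!\left(-\frac{\sum_{i} A_{ki}^{2}}{2\,\sigma_A^{2}\,\lambda_k}\right)
\exp\!\left(-\frac{c_k^{2}}{2\,\sigma_c^{2}\,\lambda_k}\right),
```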
where \(\lambda _k\) are the variances of the input PCs, and the scaling factors \(\sigma _A\) and \(\sigma _c\) are included in the set of the LDM model’s hyperparameters. The prior PDF \(P_{pr}(\sigma )\) for the parameter \(\sigma \) is taken to be constant on the interval \([0, \sigma _{max}]\) and zero outside it. The value \(\sigma _{max}\) represents a hypothetical maximum possible value of \(\sigma \), which we define as the square root of the mean variance of the input data \(\textbf{x}_n^{(m)}\) across the ensemble, time and space dimensions; it thus corresponds to the ELDM solution with all modes set to zero in Eq. (10). Choosing any larger value of \(\sigma _{max}\) does not affect the optimization results. The prior PDF \(P_{pr}(\textbf{P}\mid \varvec{\tau },\varvec{\sigma _p},d)\) for the LDM time series \(\textbf{P}\) is given by (3) of Sect. 2.1.
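As a minimal sketch of how \(\sigma _{max}\) could be evaluated, the snippet below takes the square root of the mean squared anomaly over all members, time steps and spatial points. The array layout `(n_members, n_times, n_space)`, the function name `sigma_max`, and the choice to center the data at each spatial point by removing the mean over members and time are assumptions made for illustration only.

```python
import numpy as np

def sigma_max(x):
    """Square root of the mean variance of the data over all dimensions.

    x: array of shape (n_members, n_times, n_space); this layout is assumed.
    """
    x = np.asarray(x, dtype=float)
    # Assumption: center each spatial point on its mean over members and time.
    anomalies = x - x.mean(axis=(0, 1), keepdims=True)
    # Mean squared anomaly across ensemble, time and space, then square root.
    return float(np.sqrt(np.mean(anomalies ** 2)))

# Toy input: two members, two time steps, one spatial point.
toy = np.array([[[1.0], [-1.0]], [[1.0], [-1.0]]])
print(sigma_max(toy))
```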
The Bayesian formalism provides two cost functions, described by (6) and (7) of Sect. 2.1, for finding the optimal values of the parameters \(\varvec{\mu }=(\textbf{A},\textbf{c},\textbf{P},\sigma )\) of the LDM model (2) given the data \(\textbf{X}\) and the prior PDFs with fixed hyperparameters \(\textbf{H}=(d,\varvec{\tau },\varvec{\sigma }_p,\sigma _A,\sigma _c)\). The conditional PDFs on the right-hand side of these cost functions are given by
and
Note that the likelihood function (A2) is directly based on the LDM model (2).
To obtain the solution maximizing the functionals (6) and (7), we have used a modified and improved version of the numerical algorithm originally developed in Gavrilov et al. (2016) and Mukhin et al. (2015).
1.2 The new ELDM method
In our present, modified ELDM method, the same Bayesian formalism is applied to find the optimal values of the parameters \(\varvec{\mu }=(\textbf{A},\textbf{B},\textbf{c},\textbf{P},\textbf{F},\sigma )\). The prior PDFs for \(\textbf{A},\textbf{c},\sigma \) are the same as in the original LDM method, see (A1), except that \(\lambda _k\) are now associated with the diagonalized ensemble covariance matrix. The prior PDFs for \(\textbf{P}\) and \(\textbf{F}\) are given by (11) and (12), while the prior PDF for \(\textbf{B}\) is completely analogous to (A1):
where an additional hyperparameter \(\sigma _B\) is introduced to be optimized alongside the others. Thus, the full set of hyperparameters is \(\textbf{H}=(d,d_f,\varvec{\tau },\varvec{\tau _f},\varvec{\sigma }_p,\varvec{\sigma _f},\sigma _A,\sigma _B,\sigma _c)\). The Bayesian cost functions (6), (7) are now maximized using the following expressions in lieu of (A2),(A3):
The modified ELDM procedure provides, among other things, the forced response patterns \(\textbf{B}\) and time series \(\textbf{F}\) that are internally consistent with the other parameters of the LDM decomposition, which is an improvement over the traditional strategy of removing the forced signal from the data using linear regression methods prior to the analysis. Furthermore, and more importantly, the ELDM method assumes no ad hoc spatiotemporal orthogonality constraints among the modes of forced and internal variability, which permits a potentially revealing analysis of the dynamical relationship between these two types of variability.
Appendix B: Optimal dimensions of the ELDM estimated subspaces of forced and internal variability in CESM-LE
The optimal dimensions d and \(d_f\) of the subspaces associated with the ELDM estimated internal and forced variability are obtained by trial and error. Namely, for a given pair \((d,\, d_f)\) we maximize Bayesian evidence \(P(\textbf{X}\mid d,d_f,\tilde{\textbf{H}})\) [the cost function (6)] over all the other hyperparameters \(\tilde{\textbf{H}}=(\varvec{\tau },\varvec{\tau _f},\varvec{\sigma }_p,\varvec{\sigma _f},\sigma _A,\sigma _B,\sigma _c)\) using the algorithm described in Gavrilov et al. (2020). In fact, instead of the expression (6) we work with its logarithm divided by \(N_T N_R\) to define the evidence score for every pair \((d,\, d_f)\):
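The displayed definition of the evidence score is not reproduced in this text. Following the wording above, and with the symbol \(S\) introduced here purely for illustration, it can be written as

```latex
S(d,d_f) \;=\; \frac{1}{N_T N_R}\,
\max_{\tilde{\textbf{H}}}\, \log P(\textbf{X}\mid d,d_f,\tilde{\textbf{H}}).
```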
At fixed \(d_f\), these evidence scores (Fig. 10) increase with d and reach a plateau at some value of d. The score associated with the latter plateau increases with \(d_f\) as well, but also saturates at some value of \(d_f\). The optimal pair \((d,\,d_f)\) is chosen to correspond to the minimum values at which the maximum (plateaued) evidence score is reached. In Fig. 10 this occurs at \(d=d_f=3\). Further increase of d or \(d_f\) only results in the identification of the null modes of internal variability or forced response, which capture no variance and should be omitted.
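The selection rule just described can be sketched as follows. The tolerance `tol` that decides when a score counts as being on the plateau, the tie-breaking by smallest total dimension, and the function name `select_dimensions` are assumptions; the score table is a toy construction and not a result from the paper.

```python
import numpy as np

def select_dimensions(scores, d_values, df_values, tol=1e-3):
    """Pick the smallest (d, d_f) whose score lies within tol of the maximum.

    scores[i, j] is the evidence score for d_values[i] and df_values[j].
    """
    scores = np.asarray(scores, dtype=float)
    best = scores.max()
    # All index pairs whose score is on the plateau (within tol of the maximum).
    on_plateau = np.argwhere(scores >= best - tol)
    # Assumption: prefer the smallest total dimension, then the smallest d_f.
    i, j = min(
        on_plateau,
        key=lambda ij: (d_values[ij[0]] + df_values[ij[1]], df_values[ij[1]]),
    )
    return d_values[i], df_values[j]

# Toy score table that saturates at d = 3 and d_f = 3.
d_values = [1, 2, 3, 4, 5]
df_values = [1, 2, 3, 4]
toy_scores = [[min(d, 3) + 0.5 * min(df, 3) for df in df_values] for d in d_values]
print(select_dimensions(toy_scores, d_values, df_values))
```

Any additional modes beyond the selected pair leave the toy score unchanged, which mirrors the null modes mentioned above.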
Appendix C: Comparison of ELDM and S/NP methods performance in estimating the forced signal in CESM-LE
We mentioned in Sect. 3.2 that the spread (across the four CESM-LE 10-member sub-ensembles) of the ELDM-estimated forced signal based on all three ELDM modes is smaller than that of the signal based on two ELDM modes, which further demonstrates the optimality of the \(d_f=3\) forced-signal dimension returned by the ELDM algorithm. We computed this spread as the (time- and space-averaged) mean squared error (MSE) of the four estimated forced-signal time series with respect to their ensemble-mean time series, and expressed the total MSE as a fraction of the total variance of the reference forced signal (see below). This gave MSEs of \(4.7\%\) and \(4\%\) for the two-mode and three-mode ELDM-based forced-signal estimates, respectively. By comparison, the MSE spreads of the S/NP-method forced-signal estimates based on the two and three leading S/NP modes are \(3.7\%\) and \(6.9\%\), respectively, arguing for the optimality of the two-mode S/NP representation of the forced signal.
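A minimal sketch of this spread measure is given below. The array shapes, the unweighted averaging over space, the definition of the reference variance as the variance about the time mean at each point, and the function name `relative_spread` are assumptions made for illustration.

```python
import numpy as np

def relative_spread(estimates, reference):
    """Spread of forced-signal estimates as a fraction of reference variance.

    estimates: (n_sub, n_times, n_space), one forced-signal estimate per sub-ensemble.
    reference: (n_times, n_space), the reference forced signal.
    """
    estimates = np.asarray(estimates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # MSE of the estimates about their own mean, averaged over time and space.
    mean_estimate = estimates.mean(axis=0)
    mse = np.mean((estimates - mean_estimate) ** 2)
    # Assumption: total variance is taken about the time mean at each point.
    ref_var = np.mean((reference - reference.mean(axis=0)) ** 2)
    return float(mse / ref_var)

# Toy case: two estimates offset by +1 and -1 from a unit-variance reference.
reference = np.array([[0.0], [2.0]])
estimates = np.stack([reference + 1.0, reference - 1.0])
print(relative_spread(estimates, reference))
```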
The above measure of the forced-signal uncertainty has, however, more to do with the statistical significance of the estimated forced signal than with the accuracy of a given method in reconstructing the (unknown) true signal. To gauge the performance of each method in doing so, we here define the reference estimate of the forced signal as the 40-member ensemble mean further smoothed by the 3-year boxcar running mean. The choice of the 3-year window size is a compromise between increasing the signal-to-noise ratio and retaining the short-term forced signals associated with volcanic eruptions. Other window sizes produce qualitatively similar results (not shown).
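The reference signal defined above can be sketched as an ensemble mean followed by a boxcar running mean. In this sketch the window is given in time samples (so its value for a 3-year window depends on the sampling interval of the data), the series is shortened at both ends rather than padded, and the function name `reference_forced_signal` is hypothetical; all of these are assumptions.

```python
import numpy as np

def reference_forced_signal(ensemble, window):
    """Ensemble mean smoothed in time by a boxcar running mean.

    ensemble: (n_members, n_times, n_space); window: boxcar length in samples.
    Returns an array of shape (n_times - window + 1, n_space).
    """
    ensemble = np.asarray(ensemble, dtype=float)
    ens_mean = ensemble.mean(axis=0)
    kernel = np.ones(window) / window
    # Assumption: 'valid' mode drops the edges instead of padding them.
    return np.apply_along_axis(
        lambda series: np.convolve(series, kernel, mode="valid"), 0, ens_mean
    )

# Toy case: two members scattered around a linear trend at one spatial point.
t = np.arange(5.0)[:, None]
toy_ensemble = np.stack([t + 1.0, t - 1.0])
print(reference_forced_signal(toy_ensemble, window=3)[:, 0])
```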
Figure 11 displays two-dimensional maps of the MSE of the forced signal estimated by each method with respect to the reference forced signal defined above; these MSE values were averaged across the four sub-ensembles considered. The three-mode ELDM estimate of the forced signal performs best, with the lowest spatially averaged MSE of about \(16\%\) of the total (reference) forced-signal variance. The two-mode S/NP-based estimate of the forced signal has a slightly larger, second-best MSE and a spatial MSE distribution that is very similar to that of the three-mode ELDM-based estimate. Overall, both methods perform similarly well in reconstructing the reference forced signal in the present example.
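For illustration, an MSE map of this kind and its spatial average relative to the reference variance might be computed as in the sketch below. The regular latitude-longitude grid, the cosine-of-latitude area weights, and the function name `mse_map_and_fraction` are assumptions; the source does not specify the weighting used for the spatial average.

```python
import numpy as np

def mse_map_and_fraction(estimates, reference, lat):
    """MSE map with respect to a reference and its area-weighted fraction.

    estimates: (n_sub, n_times, n_lat, n_lon); reference: (n_times, n_lat, n_lon);
    lat: latitudes in degrees, shape (n_lat,).
    """
    estimates = np.asarray(estimates, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Squared error averaged over sub-ensembles and time at every grid point.
    mse_map = np.mean((estimates - reference) ** 2, axis=(0, 1))
    ref_var_map = np.var(reference, axis=0)
    # Assumption: area weights proportional to cos(latitude).
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(mse_map)
    fraction = np.sum(weights * mse_map) / np.sum(weights * ref_var_map)
    return mse_map, float(fraction)

# Toy case: one estimate offset by 0.5 from a unit-variance reference.
reference = np.array([[[0.0], [0.0]], [[2.0], [2.0]]])
estimates = (reference + 0.5)[None]
mse_map, fraction = mse_map_and_fraction(estimates, reference, lat=np.array([0.0, 60.0]))
print(mse_map.shape, fraction)
```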
We finally speculate that the relatively high MSE of the forced-signal reconstruction (\(16\)–\(18\%\), as per the estimates above) may in fact be dominated by the errors of the reference forced signal with respect to the (unknown) true forced signal, due to the insufficient number (40) of ensemble members used to compute the reference signal. This is particularly apparent in the time series of the reference vs. reconstructed signals in Fig. 12, where the reference signal exhibits interannual undulations (outside of volcanic minima) not easily attributable to a particular forcing, and yet misses a substantial part of the cooling associated with volcanic episodes in some of the latitude bands.
Appendix D: ELDM decomposition of synthetic and CESM data: verifying the reconstructions of time scales and patterns
In this section, we provide further details on the ELDM time series and patterns in support of conclusions put forward in the main text.
Figure 13 shows the autocorrelation functions (ACFs) of the ELDM-estimated slow and fast internal modes for the synthetic example of Sect. 3.1, alongside the ACFs of the actual slow and fast signals in the synthetic system, which illustrates the excellent reconstruction of these time scales by the ELDM procedure. The first zero of the ACF for both the ELDM-based and system-based estimates is at approximately 25 time units for the slow mode and 1.3 time units for the fast mode, which roughly corresponds to the quarter-periods of the respective oscillators.
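The link between the first ACF zero and the quarter-period can be checked on a toy signal: for a pure sinusoid of period \(T\) the ACF is \(\cos (2\pi \tau /T)\), whose first zero is at \(\tau =T/4\). The sketch below is not the synthetic system of Sect. 3.1; the period of 100 time units, the linear interpolation of the zero crossing, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    norm = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / norm for k in range(max_lag + 1)]
    )

def first_zero_lag(acf, dt=1.0):
    """Lag of the first zero crossing of the ACF, linearly interpolated."""
    non_positive = np.where(acf <= 0)[0]
    if len(non_positive) == 0:
        return None  # the ACF never crosses zero within the lags provided
    k = non_positive[0]
    # Interpolate between the last positive lag (k - 1) and lag k.
    frac = acf[k - 1] / (acf[k - 1] - acf[k])
    return (k - 1 + frac) * dt

# Toy sinusoid with an assumed period of 100 time units, sampled at dt = 1.
t = np.arange(10000)
signal = np.cos(2 * np.pi * t / 100.0)
print(first_zero_lag(autocorrelation(signal, max_lag=60)))
```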
Similarly, Fig. 14 shows the ACFs of the internal modes obtained in the four sub-ensembles of the CESM-LE SAT data, in support of our conclusion about the robustness of their time scales. In addition, Fig. 15 shows the pairwise correlation coefficients between the spatial patterns of these modes shown in Fig. 8. Each coefficient was computed as the area-weighted dot product of the normalized patterns, with the normalization factor defined as the square root of the same dot product of a pattern with itself. The high correlations (greater than 0.85) between sub-ensembles for each mode signify the robustness of these patterns. The slightly lower correlation value for the third mode can be explained by its very low variance (see the main text) and, therefore, lower detectability. To assess the statistical significance of these correlations, we generated synthetic values of the correlations under the null hypothesis that the patterns tested are randomly drawn from the subspace of the 20 leading EOFs of CESM-LE SAT (with weights corresponding to the EOF standard deviations). All the correlations in Fig. 15 far exceed the correlation value of 0.53 that corresponds to the 5% significance level so computed.
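The pattern correlation and the null-hypothesis test described above can be sketched as follows. The Gaussian random coefficients, the one-sided quantile, the sample size, and the function names are assumptions; the toy EOFs are an identity matrix and have nothing to do with CESM-LE SAT, so the printed threshold is not expected to match the value quoted in the text.

```python
import numpy as np

def pattern_correlation(p, q, weights):
    """Area-weighted, uncentered correlation between two spatial patterns."""
    p, q, w = (np.asarray(a, dtype=float) for a in (p, q, weights))
    dot = lambda a, b: np.sum(w * a * b)
    # Normalize by the square root of each pattern's dot product with itself.
    return float(dot(p, q) / np.sqrt(dot(p, p) * dot(q, q)))

def null_threshold(eofs, eof_std, weights, n_samples=5000, level=0.05, seed=0):
    """Correlation exceeded by chance with probability `level` under the null.

    eofs: (n_eofs, n_space); eof_std: (n_eofs,); weights: (n_space,).
    """
    rng = np.random.default_rng(seed)
    eofs = np.asarray(eofs, dtype=float)
    scale = np.asarray(eof_std, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Assumption: random patterns are Gaussian combinations of the EOFs,
    # with coefficient spread given by the EOF standard deviations.
    P = (rng.standard_normal((n_samples, scale.size)) * scale) @ eofs
    Q = (rng.standard_normal((n_samples, scale.size)) * scale) @ eofs
    num = np.sum(w * P * Q, axis=1)
    den = np.sqrt(np.sum(w * P**2, axis=1) * np.sum(w * Q**2, axis=1))
    # Assumption: one-sided test on the upper tail of the null distribution.
    return float(np.quantile(num / den, 1.0 - level))

# Toy setup: 20 orthonormal "EOFs" with decaying standard deviations.
toy_eofs = np.eye(20)
toy_std = 1.0 / np.arange(1, 21)
print(null_threshold(toy_eofs, toy_std, weights=np.ones(20)))
```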
About this article
Cite this article
Gavrilov, A., Kravtsov, S., Buyanova, M. et al. Forced response and internal variability in ensembles of climate simulations: identification and analysis using linear dynamical mode decomposition. Clim Dyn 62, 1783–1810 (2024). https://doi.org/10.1007/s00382-023-06995-1
DOI: https://doi.org/10.1007/s00382-023-06995-1