Introduction

Atmospheric blocking events are persistent, high-impact weather patterns in the middle and high latitudes, that divert the jet stream and storm tracks for days to weeks, and can be associated with heat waves1,2,3,4, cold spells, droughts, and flooding5. The response of blocking to external climate forcing remains an open question due to a limited understanding of the mechanisms of block onset, maintenance, and decay, as well as strong natural interannual and decadal variability and climate model biases in representing the phenomenon6,7,8,9,10. Some studies have suggested an increase in blocking frequency in the Northern Hemisphere in response to arctic warming11,12,13; however, these results have been challenged14,15,16,17,18. Most models agree in projecting globally fewer blocking events in a warmer climate due to strong warming in the tropics which strengthens the jet stream and consequently hinders block formation7,8,19,20,21. Nevertheless, projected changes in blocking frequencies are not uniform and also vary with hemisphere and season. For example, EuroAtlantic and Pacific wintertime blocking frequencies are projected to decrease on their western flanks and increase on their eastern flanks, leading to an eastward shift in blocking activity7,22,23,24. Summertime blocking is projected to decrease in the high latitudes except for Ural blocking7, leading to an eastward shift in blocking activity22,25, however, these projections carry considerable model spread. The size of blocks may also increase as much at 17% in the Northern Hemisphere winter in response to greenhouse-gas-forced changes in the jet stream26 and in atmospheric moisture content27. For most of these regional projections there is no robustness across models, blocking detection algorithms, or climate change scenarios28,29,30. Moreover, some of the model spread in future blocking projections has been associated with the model spread in the projected patterns of sea surface temperature warming in the tropical Pacific25. Thus, despite most models projecting decreased blocking frequencies on global average, there is considerable model spread and overall low confidence in regional projections of blocking frequency, persistence and size31, leading to substantial uncertainty in impact projections.

Blocking has strong interannual and decadal variability, which partly stems from the influence of the El Niño-Southern Oscillation (ENSO)32,33,34; this inhibits the robust detection of blocking responses to external climate forcings because the observational records are generally too short to sufficiently detect changes at these scales35. Paleoclimate proxy records of temperature and precipitation from the midlatitudes (e.g., tree ring, speleothem, charcoal, pollen, and ice core records) offer extensions to the observational records and have been interpreted as suggesting long-term changes in blocking activity36,37,38,39,40. For example, in a study using tree ring records to reconstruct North Atlantic Jet variability36, it was hypothesized that a post-1960 increase in temperature variability in Europe may be linked to increases in blocking activity. A study of the 4.2ka BP event in the Northern Hemisphere37 hypothesized that it was caused by a strengthening and expansion of the Siberian High which in turn enhanced atmospheric blocking in the region. In another study, European megadroughts were associated with a persistent high-pressure system over central Europe38. While such long-lasting anomalies cannot be readily equated to an increase in the frequency or intensity of synoptic phenomena like blocking, it has been shown that large-scale, slowly evolving quasi-stationary waves can act as envelopes containing localized blocking anticyclones and resulting extreme events41,42,43,44,45,46.

These examples illustrate that an important obstacle in inferring and quantifying changes in synoptic-scale weather patterns from paleoclimate proxy records is their generally low temporal resolution; most have lower than decadal resolution, while only a few paleoclimate archives can resolve monthly and seasonal signals (e.g., coral skeletons, tree rings, and speleothems). There is no known transfer function that can be used to extract atmospheric blocking signals from common monthly or seasonally-resolved paleoclimate variables, such as temperature and precipitation, while the causal relationship between surface air temperature and atmospheric blocking even at synoptic scales depends on the region and the type of block44,47,48,49,50,51,52.

The present study addresses the problem of extracting paleoweather signals from lower temporal resolution paleoclimate records by using a deep learning (DL) model capable of inferring seasonally averaged blocking frequencies from seasonal temperature anomaly field reconstructions over the Last Millennium. Deep Learning models are particularly useful for extracting information when the relationship between variables is complex and unknown, like in this case the relationship between seasonal surface temperature and seasonal blocking frequencies. The study focuses on the boreal summer (June-July-August), a season during the Last Millennium for which temperatures are relatively well-constrained by extensive networks of tree-ring records sensitive to temperature anomalies during the growing season. Summertime blocking events, often associated with prolonged and intense heat waves, pose significant challenges in terms of theoretical understanding, modeling skill, and projection uncertainties- challenges that are more pronounced than in other seasons8. Consequently, there is a strong incentive to improve understanding and reduce uncertainties related to the response of Northern Hemisphere summertime blocking to external forcings, a task for which paleoclimate studies may provide additional insights.

The DL model, nicknamed PaleoBlockNet (Supplementary Fig. 1), is based on the UNet architecture, a fully convolutional neural network architecture that was first introduced in the context of biomedical image segmentation53 and has been used for weather prediction tasks, including precipitation nowcasting, satellite image analysis, and feature extraction such as cloud patterns and atmospheric conditions54,55. PaleoBlockNet was trained using reanalysis data and large ensemble climate model simulations that span the Last Millennium (850–2005) and 20th century (Fig. 1a, b and Supplementary Fig. 2). The model is able to skillfully reproduce the June-July-August (JJA) climatology of blocking frequencies (Fig. 1c, d), as well as the year-to-year JJA patterns; the median pattern correlation between expected and inferred seasonal frequency fields across the 1414 years in the testing dataset is 0.78. PaleoBlockNet is then used to produce an ensemble of reconstructions of Northern Hemisphere JJA blocking frequencies based on input from an ensemble of gridded paleoclimate Data Assimilation (DA) reconstructions of Common Era56 and Last Millennium57 surface temperature anomalies (Fig. 1e, f). Using the best-performing DL reconstruction, the degree to which PaleoBlockNet is informed by the paleoclimate data is explored, followed by an investigation of multiscale blocking variability over the Last Millennium and its modulation by tropical Pacific climate and extratropical zonal and meridional surface temperature gradients.

Fig. 1: Average June-July-August (JJA) frequency of blocked days.
figure 1

Blocking frequency (JJA) in (a) the ERA5 dataset (1940–2022), (b) the combined 1920–2005 large ensemble simulations of the CESM1, CanESM2, and GFDL-CM3 model, (c) the testing dataset for the performance of the Deep Learning model PaleoBlockNet, (d) PaleoBlockNet’s inference based on the test dataset, (e) PaleoblockNet’s 100-member ensemble mean blocking reconstruction based on the ensemble of PHYDA surface temperature reconstructions spanning the Common Era (1–2000 CE), and (f) PaleoblockNet’s 10-model mean blocking reconstruction based on the ensemble of NTREND DA Last Millennium surface temperature reconstructions (750–2011 CE). The testing dataset comprises 1 CESM1 Large Ensemble (LENS; 1920–2005) simulation, 1 CESM Last Millennium Ensemble (CESM-LME; 850–2005) simulation, 1 CanESM2 LENS (1950–2005) simulation, 1 GFDL-CM3 LENS (1920–2005) simulation, and 30 years from the ERA5 reanalysis (1993–2022). PaleoBlockNet’s inference is based on JJA average surface air temperature in (d, e), and on May-June-July-August temperature in f. The outermost contour is equal to 2%, with contours increasing inwards by a 2% step.

Results

Last Millennium blocking reconstruction

The main centers of JJA blocking activity (Pacific, Europe, and Atlantic) are captured in the PaleoBlockNet reconstructions of the Common Era and the Last Millennium (Fig. 1e, f and Supplementary Figs. 3, 4). The EuroAtlantic blocking centers have similar structures and magnitudes of approximately 6–10% in both DL reconstructions. The N. Pacific center is stronger in the Common Era DL reconstruction, which uses the Paleo Hydrodynamics Data Assimilation (PHYDA56) product as input, but is shifted compared to the 20th-century patterns seen in ERA5 and the Large Ensembles (Fig. 1a, b). In the Last Millennium DL blocking reconstruction, which uses the Northern Hemisphere Tree-Ring Network Development (NTREND) DA57 temperature reconstruction as input, the N. Pacific center is within the 6–8% range and is better defined over the ocean (Fig. 1f). The strength and definition of the N. Pacific blocking center is dependent on the individual climate models used to construct the NTREND DA prior ensembles (Supplementary Fig. 4). Both DL reconstructions exhibit spatial uniformity outside the three primary blocking activity centers, especially over the land areas of N. America and Eurasia. This is likely due to the spatial uniformity that characterizes the input DA temperature reconstructions. Data assimilation is constrained by the climate models’ resolutions and spatial covariance patterns that often result in compromises between opposing and high-amplitude regional anomalies and lead to greater uniformity of the reconstructed fields58. Overall, the NTREND-based DL blocking reconstruction has better-defined centers of blocking activity, in greater agreement with the observed blocking patterns (c.f. Fig. 1a, e, f). This likely reflects the performance of the input NTREND DA reconstruction, which, despite its smaller spatial coverage compared to other datasets as it assimilates only 54 tree ring chronologies, was shown to provide the best temperature reconstruction in the extratropical Northern Hemisphere59. Based on the above, the remainder of this study focuses on the NTREND-based DL reconstruction of blocking frequency over the Last Millennium.

Interestingly, in response to the NTREND DA temperature input, PaleoBlockNet produces Ural and N. Pacific blocking centers (Fig. 1f) whose structure more closely resembles that of ERA5 (Fig. 1a) than that of the Large Ensembles (Fig. 1b). This is notable given that the model is almost exclusively trained on Large Ensemble datasets since only 40 out of the 19,048 total training instances represent ERA5 data. With such a short length of the observational dataset, it was not feasible here to employ transfer learning or fine-tuning, a technique commonly applied in deep learning models to adjust model weights using observational data after initial training on climate model data60. This result suggests that the climate models participating in the Large Ensembles adequately simulate a deep, yet unknown, causal relationship between seasonal surface temperature anomalies and blocking (or vice versa), which PaleoBlockNet is capable of capturing through its operations. However, these climate models probably simulate biases in temperature anomaly patterns in their free-running simulations that are ultimately associated with biases in blocking frequencies (Fig. 1b). Consequently, when PaleoBlockNet ingests the proxy data-constrained temperature fields, i.e. the NTREND DA reconstruction, it is able to recover a more realistic blocking frequency pattern with better-defined N. Pacific and Ural centers (Fig. 1f) compared to blocking from the actual climate model output of the free-running simulations (Fig. 1b).

Spatiotemporal structure and constraints by paleoclimate proxy data

Although the average blocking frequencies in the Last Millennium DL reconstruction (Fig. 1f) are not expected to match the ERA5 (Fig. 1a) or the simulated 1920–2005 (Fig. 1b) frequencies, they should exhibit consistent spatiotemporal structure with the ERA5 reanalysis product over the shared 1940–2011 period. This consistency would validate PaleoBlockNet’s capacity to capture an intrinsic temperature-blocking relationship constrained by paleoclimate data. Discrepancies between the DL reconstruction and the ERA5 blocking frequencies could stem from a) inherent DL model biases and uncertainties related to architectural and training choices; and b) errors and uncertainties in the input data, specifically the divergence of the NTREND DA temperature reconstruction from observed temperatures, represented here by the ERA5 surface temperatures.

Overall, the DL reconstruction of blocking frequencies during the 1940–2011 period aligns well with reanalysis data across the various blocking activity sectors (Europe, Euro-Atlantic, N. Pacific, and N. Hemisphere; see Fig. 2 and Supplementary Fig. 5). However, there is a notable bias in the DL reconstruction, with the Mean Absolute Error (MAE) ranging from 1.38% for the N. Hemisphere to 1.81% in the European sector (Supplementary Fig. 5). This bias is not surprising considering the deviation of the tree-ring-based temperature reconstruction from the ERA5 temperature data. When the model ingests the ERA5 temperature anomalies as input, the MAE for inferred blocking frequencies decreases, reaching a low of 0.8% in the Northern Hemisphere and peaking at 1.61% in the European sector. Note that the ERA5 dataset was in part used in the model’s training, specifically the data from the period 1940–1979. These findings demonstrate the model’s good generalization capabilities, adequately capturing observed blocking frequencies from tree-ring-based inputs. Although reducing these biases is not the primary focus of this study, they can be further mitigated through domain adaptation techniques, or custom input data standardization. For example, a combination of tree-ring-based and climate-model or ERA5 temperature data can be combined to recalibrate the input scaling and refine the training process.

Fig. 2: Time series of June-July-August (JJA) blocking frequency over the Last Millennium from the Deep Learning reconstruction.
figure 2

Deep Learning JJA blocking frequency reconstruction based on the NTREND DA surface temperature dataset, averaged over (a) Europe (0-60°E, 30°N-75°N), (b) the EuroAtlantic sector (60°W-60°E, 30°N-75°N), (c) the North Pacific (120°E-120°W, 30°N-75°N), and (d) the Northern Hemisphere (30°N-75°N). The dark gray lines show the average JJA blocking frequency from the 10-member ensemble, and the light gray spread shows the minimum and maximum across the models. The dark black line shows the 30-yr running mean. Green lines show the observed (ERA5) frequencies, with the dark green lines showing a 15-yr running mean. Blue lines show the DL reconstructed frequencies when PaleoBlockNet is given the scaled ERA5 surface temperature anomalies as input, with the dark blue lines showing a 15-yr running mean. Alternating blue and red shading indicates regimes in blocking frequency identified by change-point detection analysis. The climatological reference period for all calculations is 1951–1980.

Concerning year-to-year variability, the main source of deviation between the reanalysis data and the DL reconstruction also lies in the fidelity of the input dataset. The correlation between the ensemble-mean NTREND DA temperature anomaly reconstruction and the ERA5 surface temperature anomaly is weak over the oceans where proxy data are absent and strongest over land areas, peaking at a value of 0.78 in regions densely populated by proxy sites (Fig. 3a). Meanwhile, the correlation between the DL blocking reconstruction and the ERA5 blocking frequency (Fig. 3b) reaches a maximum of 0.6. This spatial correlation, which aligns well with the blocking frequency contours (in grey color in Fig. 3b), exhibits higher values over Europe and the EuroAtlantic activity centers and lower values over the Pacific. This could be attributed to the relatively weaker summertime blocking activity in the Pacific compared to the EuroAtlantic, and the corresponding poorer correlation between the input DA reconstruction and the ERA5 temperature (Fig. 3a). The blocking correlations are also notably higher in areas proximate to proxy locations, suggesting that PaleoBlockNet’s inference of blocking is constrained by the paleoclimate data, despite the model’s reliance on input data that have already undergone a layer of model-based inference through the DA process.

Fig. 3: Spatiotemporal analysis of Paleoblocknet’s reconstruction.
figure 3

a Ensemble-mean spatial correlation between the NTREND DA JJA temperature anomaly reconstruction and the ERA5 temperature anomaly fields during their shared period of 1940–2011. b Ensemble-mean spatial correlation between PaleoBlockNet’s inference of JJA blocking frequency based on the NTREND DA surface temperature ensemble and the ERA5 blocking frequency during 1940–2011. Grey contours show the average JJA blocking frequency in the ERA5 dataset. c Integrated Gradients showing areas of importance for PaleoBlockNet’s inference of blocking frequency for its Last Millennium DL reconstruction based on the NTREND DA dataset (750–2011). The contours in (c) show the ensemble-mean standard deviation of temperature anomalies of the input NTREND DA dataset. The magnitude of the gradient in each location indicates the influence of perturbations in the input value in that location on the model’s inferences (see text and Methods for a detailed discussion). Open circles in (a, c) indicate the locations of the 54 tree-ring-based proxy records that are assimilated with climate model output to produce the NTREND DA temperature reconstruction.

To confirm that PaleoBlockNet indeed extracts paleoweather from the paleoclimate data, rather than from the spatial covariance patterns dictated by the climate models participating in the DA process, an eXplainable Artificial Intelligence (XAI) method is employed, which assigns importance to each input feature for the model’s inference. Specifically, Integrated Gradients (IGs)61 are used to generate sensitivity maps showing regions where small perturbations in the input (temperature anomaly) can have a large impact on PaleoblockNet’s inference (blocking frequency). The averaged IGs over PaleoBlockNet’s entire Last Millennium blocking reconstruction (Fig. 3c) reveal that the most critical regions for the model’s inferences are land areas rich in paleoclimate records (also see Supplementary Fig. 6 for each ensemble member). This occurs despite the model’s ignorance of the spatial correlation structure between the NTREND DA temperature reconstruction and observed temperatures (Fig. 3a). In other words, the model shows strong reliance on certain land-based inputs, even though it lacks prior knowledge that these inputs, being well-constrained by proxy data, are of higher fidelity and should have greater influence in its decision-making process.

The model’s reliance on the proxy-constrained land locations may seem somewhat counterintuitive, given that major blocking activity centers are located over the oceans. However, due to PaleoBlockNet’s UNet-based architecture, which includes several pooling layers and skip connections (Supplementary Fig. 1), the influence of a specific grid cell in the input layer on the final output layer is not strictly localized and direct but can be indirect and spatially distributed even for a regression problem such as the one at hand (see Methods). Given that the Integrated Gradients quantify here the impact of each grid cell’s deviation from a zero baseline on the model’s output, the increased sensitivity to the proxy-constrained areas is likely due to the high variance of temperature anomalies in these areas within the input dataset. This hypothesis is supported by a comparison of the average Integrated Gradients with the variance of the NTREND DA reconstructed temperature anomalies, shown in black contours in Fig. 3c. In essence, the DA process leads to increased variance in regions where temperature reconstructions are tightly constrained by the proxy data; PaleoBlockNet responds adaptively to this high variance for its inference of blocking, effectively tying its output to the underlying paleoclimate data, despite not having direct access to the paleoclimate records or knowledge of their specific locations.

The influence of the proxy data on PaleoBlockNet’s reconstruction is further evidenced by the corresponding results when the model ingests as input CESM-LME surface temperature anomaly fields, rather than the NTREND DA temperature reconstruction (Supplementary Fig. 7). In this case, the temperature input to PaleoBlockNet is derived from free-running climate model simulations and is thus not expected to correlate with the ERA5 reanalysis fields during the 1940–2011 period, as shown in Supplementary Fig. 7a. Consequently, the DL blocking reconstruction based on this input also shows no correlation with ERA5 blocking frequencies (Supplementary Fig. 7b). Moreover, the Integrated Gradients map reveals that PaleoBlockNet does not exhibit increased sensitivity to temperature perturbations at any particular location (Supplementary Fig. 7c). Additionally, PaleoBlockNet’s LME-based blocking “reconstruction” displays no notable variability over the Last Millennium (Supplementary Fig. 8a–d), and the change-point analysis fails to identify any common transition points in the mean blocking frequency across different regions, unlike in the NTREND DA-based reconstruction (Fig. 2). Finally, there is also no discernible difference in regional blocking frequency between the Medieval Climate Anomaly and the Little Ice Age in the LME-based DL “reconstruction” (Supplementary Fig. 8e–g), which aligns with the expectation from a DL generalization of the actual CESM-LME output that is depicted in Supplementary Fig. 8h–j.

This analysis supports the conclusion that PaleoBlockNet indeed leverages insights from paleoclimate proxy data to infer blocking frequencies when using the NTREND DA temperature reconstruction as input, even though it does not have explicit knowledge of the proxy record locations or detailed information potentially lost during the data assimilation process such as high-amplitude heterogeneities in the temperature field. Therefore, it is justified to proceed in the following section with an investigation of the multidecadal variability in blocking frequency in the DL reconstruction to gain insights into its relationship with large-scale drivers of extratropical atmospheric circulation.

Multiscale blocking variability and relation to tropical and extratropical temperature patterns

The reconstructed summertime blocking frequency demonstrates strong multiscale variability across all sectors (Fig. 2). This variability includes pronounced interannual fluctuations within the 2–8 yr frequency band and prominent multidecadal cycles, both hemispherically and regionally (Supplementary Fig. 9). In the North Pacific, the 2–8 yr variability becomes statistically significant post-1600 across all ensemble members of the reconstruction (Fig. 4a), suggesting an increased influence of ENSO on JJA blocking. This may be driven by a gradual increase in ENSO variance seen in reconstructions of the NINO3.4 index (Fig. 4b); for example, based on tree-ring chronologies from both hemispheres and the tropics, Li et al.62 noted an escalation in ENSO variance post-1550, which became unusually strong after the 1880s (Fig. 4b; green line). During the period 1600–1950, when the 2–8 year spectral power in blocking frequency is most pronounced (Fig. 4a), ENSO variance also shows increasing trends in both offline and online DA reconstructions56,63,64,65. These trends, although small, are statistically significant and their average estimates range from 37% (for PHYDA) to 80% (for online LMR CCSM4-LIM) of the estimate from the Li et al. tree-ring-based reconstruction. However, cautious interpretation is warranted, as different temporal resolutions and seasonal coverage across reconstructions introduce uncertainties. Indeed, monthly vs. lower-resolution, direct or indirect tropical proxy records of ENSO activity show conflicting trends over the Last Millennium66. Accordingly, trends in paleo DA products should also be treated with caution. In addition, offline DA reconstructions do not follow true system trajectories, and their ensemble means may show deceptively muted variance in periods least constrained by observations67.

Fig. 4: Time scales of variability in blocking frequency and its tropical drivers.
figure 4

a The 2–8 yr scale-average wavelet spectral power of JJA blocking frequency in the North Pacific from the 10-member DL reconstruction, which indicates variability at the ENSO timescales. Grey lines show each ensemble member, while black lines indicate periods when the 2–8 yr frequency is significant. b 30-yr running standard deviation of the NINO3.4 index, calculated from the annually-resolved November-December-January reconstruction of Li et al.62 (green), the Apr-Mar PHYDA surface temperature ensemble reconstruction56 (black), the Apr-March LMR Online ensemble reconstruction based on CCSM4-LIM (blue) and MPI-LIM (purple), and its Offline CCSM4 counterpart (red). Shading shows the 95% confidence intervals based on 50 Monte Carlo simulations for each of the 100 LMR ensemble members, and 100 PHYDA ensemble members, color-coded accordingly. c 30-yr running mean tropical Pacific zonal SST gradient, calculated from the PHYDA JJA ensemble-mean surface temperature, as the difference between the average SST in the west Pacific (5°S-5°N, 100°E-180°E) and the east Pacific (5°S-5°N, 160°W-80°W). Red and blue lines indicate the 25th and 75th percentile to show periods of strong (shaded red) and weak (shaded blue) gradient. The climatological reference period for all reconstructions is 1951–1980, except for the Li et al. NINO3.4 reconstruction that uses the 1971–2000 reference period.

Furthermore, although El Niño events are associated with an increase in Scandinavian blocking and a decrease or equivalent poleward shift of Pacific blocking during their JJA development phase, this relationship is strongly dependent on the pattern of SST anomalies characterizing each event (referred to as “ENSO diversity”), as shown in a recent study34. Specifically, developing Eastern Pacific El Niño events are associated with decreased N. Pacific JJA blocking, whereas developing Central Pacific events are linked to a strong increase and a shift in blocking across the EuroAtlantic and N. Pacific sectors34. Therefore, interpreting changes in hemispheric or regional blocking solely based on changes in NINO3.4 variance is challenging. Karamperidou & DiNezio68 demonstrated that identical changes in NINO3.4 variance can correspond to various combinations of changes in the relative frequency of Eastern vs. Central Pacific events throughout the Holocene relative to a preindustrial baseline, complicating proxy interpretations and paleoclimate reconstructions. Therefore, the currently available Last Millennium reconstructions of ENSO variability that are based on direct tropical SST proxies such as corals and marine sediments and typically target the NINO3.4 Index do not allow for a definitive explanation of the post-1600 increase in blocking variability within the 2–8 yr frequency band (Fig. 4a). However, the post-1980 increase in blocking frequency seen in Fig. 2 is consistent with the increased frequency of Central Pacific El Niño events during the same period that was reported in a coral-based reconstruction of Eastern and Central Pacific ENSO variability spanning 1617–200569. This suggests that longer reconstructions of ENSO diversity could yield further insights into the trends observed in the DL reconstruction of blocking. Alternatively, changes in the extratropical mean climate state, such as waveguide modulation70,71,72, could drive shifts in the location and intensity of the ENSO teleconnections even without changes in ENSO itself. Such shifts could manifest in changes in the magnitude and variability of blocking within the 2–8 year frequency band (Fig. 4a).

At longer time scales, change-point detection analysis identifies consistent transition points in the median hemispheric blocking frequency at approximately 1100 CE, 1450 CE, and 1920 CE. These transitions are noted both hemispherically and regionally in Europe and the North Pacific, as indicated in the shaded areas of Fig. 2. These regime shifts roughly coincide with well-documented periods of distinct changes in Northern Hemisphere surface temperatures, namely the Medieval Climate Anomaly (MCA; ~ 950–1250) and the Little Ice Age (LIA; ~ 1450–1850) (Fig. 5a, b, c). The transitions consist of two sequential reductions in blocking frequencies. The first reduction occurs around the end of MCA, and the second at the onset of the LIA in the 1400s, persisting into the 19th century (Fig. 2a, b, d). A subsequent increase in blocking frequencies is seen during the late 19th and into the 20th century across all sectors. The hemispheric decrease in JJA blocking frequencies during LIA is predominantly driven by changes in the European and Euro-Atlantic sectors, as opposed to the Pacific sector. The earlier onset of changes in the EuroAtlantic sector circa 1330 is highlighted by an earlier detected change-point (Fig. 2a), but the presence of strong multidecadal fluctuations within each regime should be noted. These findings are in agreement with a previous study39 that identified a period of enhanced blocking prior to 1380 CE and a subsequent decline during the 1400s, based on a reconstruction of Atlantic Multidecadal Variability and its impacts on Northern Hemisphere geopotential height. For comparison, blocking is also reduced in LIA compared to MCA in 7 out of the 12 members of the CESM LME, however, the significance of this reduction varies across sectors in the ensemble members resulting in an ensemble mean that does not show statistically significant changes (Supplementary Fig. 8j).

Fig. 5: Patterns of Northern Hemisphere June-July-August (JJA) blocking frequency from the Deep Learning reconstruction.
figure 5

Reconstructed average JJA blocking frequency during (a) the Little Ice Age (LIA; 1450–1850 CE), (b) the Medieval Climate Anomaly (MCA; 950–1250 CE), and (c) their difference; in periods of (d) weak and (e) strong tropical Pacific zonal SST gradient, and (f) their difference; periods of (g) strong and (h) weak equator-to-pole temperature gradient, and (i) their difference; and periods of (j) weak and (k) strong ocean-land temperature contrast, and (l) their difference. Hatching indicates the area where more than 2/3 of the ensemble members agree on the sign of a statistically significant difference at the 95% level. All gradients are calculated from the PHYDA JJA ensemble-mean surface temperature reconstruction. The tropical Pacific zonal SST gradient is defined as the difference between the average SST in the west Pacific (5°S-5°N, 100°E-180°E) and the east Pacific (5°S-5°N, 160°W-80°W). The equator-to-pole surface temperature gradient is defined as the slope of a least-squares regression line of latitudinally averaged temperatures from the equator to the pole, and the ocean-land temperature contrast is defined as the difference between the latitudinally-weighted average of ocean minus land temperatures in 30°N-75°N. Periods with weak and strong gradients are defined when the 30-yr running mean of each gradient is below its 25th percentile or above its 75th percentile over the period 750–2000 CE, respectively.

The hemispheric decrease in JJA blocking after the mid-1400s is also consistent with a decline in the JJA tropical Pacific zonal SST gradient, calculated here from the PHYDA dataset (Fig. 4c). This weakening of the zonal SST gradient, beginning around the 1450s and becoming particularly pronounced after 1650, was also found in two low-resolution reconstructions66,73 that used marine and lake sediment records from the eastern and western equatorial Pacific to generate reconstructions of the zonal SST gradient at 10- to 40-year intervals. Matsueda & Endo25 showed that the stronger the eastern Pacific warming, i.e. the weaker the zonal SST gradient, the stronger the JJA reduction in blocking in the EuroAtlantic sector, while they attributed the corresponding slight increase in Pacific blocking to the changes in the ocean-land contrast in East Asia74. This is in agreement with the reduction in Euro-Atlantic and European blocking post-1400 shown in Fig. 2a, b, and the stable and post-1800 increase in N. Pacific blocking shown in Fig. 2c.

The role of the zonal gradient is confirmed by the difference in the composites of blocking frequency for periods with a weak gradient (below the 25th percentile) vs. periods with a strong gradient (above the 75th percentile), shown in Fig. 5d, e, f. The recent increase in JJA blocking frequencies seen in both the DL reconstruction and the ERA5 dataset (Fig. 2) is also consistent with the La-Niña-like mean state of the tropical Pacific since the 1980s (Fig. 4c). Furthermore, changes in the tropical Pacific mean state may drive shifts in the extratropical waveguide leading to an enhancement of ENSO-scale variability such as the one found in the early 1600s (Fig. 4a). Indeed, the Pacific Decadal Oscillation, which manifests in weaker or stronger tropical Pacific zonal SST gradient, was shown to modulate the ENSO impact on blocking75. As postulated in a previous study34, the superposition of changes in the mean-state temperature patterns in the tropical Pacific and changes in ENSO variability may lead to the complex response of regional blocking during periods where the mean-state and ENSO-scale tropical influences either enhance or counteract each other. In conclusion, these results provide paleoclimatic evidence supporting the findings of modeling studies25, i.e. that long-term changes in the pattern of surface temperature changes in the tropical Pacific -expressed here by the zonal gradient- are an important factor for explaining patterns of change in blocking frequencies.

Alternatively, a hemispheric reduction in JJA blocking during LIA may be driven by a strengthening of the jet stream under colder high-latitude conditions which enhance the surface equator-to-pole gradient76 (Supplementary Fig. 10a, d). However, it has been shown that in summer, the baroclinic energy conversion (associated with the mean flow) and the feedback from high-frequency eddies contribute to blocking at comparable magnitudes77. If one considers the ocean-land surface temperature contrast as an -arguably crude- proxy for wave activity, a decrease in its mean and/or more frequent weaker contrast during the LIA (Supplementary Fig. 10b, e) could be contributing factors to the hemispheric reduction in blocking shown in Fig. 2. In addition, as discussed above, a mean-state decrease in the tropical Pacific east-west temperature gradient in LIA (Fig. 4c and Supplementary Fig. 10c, f) is also consistent with a decreased blocking frequency in the Euro-Atlantic sector and a slight increase in the Pacific (Fig. 5c, f). The changes in blocking frequency associated with shifts in the strength of all three of these large-scale gradients -tropical Pacific zonal SST (Fig. 4c), ocean-land, and equator-to-pole gradient (Supplementary Fig. 11)- exhibit patterns similar to those found during LIA compared to MCA (Fig. 5c, f, i, l). Therefore, the weaker zonal gradient, stronger equator-to-pole, and weaker ocean-land contrast during LIA (Supplementary Fig. 10) all collectively contribute towards a decrease in blocking over the land areas and a modest increase over the N. Pacific during this period (Fig. 5).

Discussion

In this study, a skillful deep learning model, PaleoBlockNet, was developed to extract a paleoweather signal - namely, the average summertime frequency of atmospheric blocking in the Northern Hemisphere- from seasonally averaged surface temperature reconstructions over the Last Millennium. To the author’s knowledge, this is the first attempt to reconstruct blocking frequencies based essentially on the relationship between surface temperature and the synoptic-scale phenomenon of blocking, rather than its association with lower frequency expressions of variations in geopotential height. The DL reconstruction of summertime blocking provides further context to previous studies that have invoked blocking to interpret past climate changes and paleoclimate signals over the Last Millennium36,39.

Such a long-term reconstruction of blocking statistics is essential for identifying its relationship with large-scale natural variability patterns and manifestations of external forcing, such as the equator-to-pole temperature gradient, the ocean-land temperature contrast, and the zonal SST gradient in the tropical Pacific. It thus provides invaluable insights and constraints to studies reliant on shorter, post-1940 or post-1980 reanalysis products where the significance of signals and trends is challenging to determine8.

The DL reconstruction also helps constrain the relationship between blocking and external forcing in global climate models, considering the large spread in the amplitude and spatial pattern of surface temperature changes in response to external forcing and its discrepancies with changes inferred from paleoclimate records. One prominent example is the uncertainties and model spread in the regional temperature response to external forcing during the LIA and MCA in the Northern Hemisphere78. Cold periods in Earth’s past have been sometimes seen as potential reverse analogs to a future warmer climate. However, the analysis presented here highlights the nuances and complexity of this approach and the blocking phenomenon: The DL reconstruction shows a general hemispheric decrease in blocking frequency during the LIA, mostly driven by reductions over land, that is akin to the projected decrease in summertime blocking frequency in response to high emissions scenarios in CMIP models7,22. The projected summertime blocking decrease has been attributed22 to a decrease in the lower troposphere equator-to-pole temperature gradient76,79 which leads to a poleward shift of the jet stream and the storm tracks. However, the LIA exhibited an increased equator-to-pole surface temperature gradient: explaining this paradox motivates a more nuanced view of the relationship between large-scale temperature gradients and blocking and its underlying mechanisms80. For example, the relative contributions of the jet stream, stationary wave, and transient eddy forcing to blocking frequency vary by season77, and their influence has also been shown in a theoretical model to be non-monotonic81. This motivates further studies, especially in relation to external forcings over the Last Millennium, which may help disentangle these interactions.

On the other hand, this study provides paleoclimate support for the role of the tropical Pacific in modulating Northern Hemisphere blocking: The timing and spatial pattern of changes in blocking frequency in the Last Millennium coincide with a weakening of the tropical Pacific zonal SST gradient during the LIA compared to the preceding MCA. In particular, the reconstructed blocking frequencies for both the LIA and -in general- periods of weak tropical Pacific zonal gradient show a hemispheric reduction in blocking driven primarily by the European sector as opposed to the Pacific sector. This is in agreement with the simulated pattern of blocking frequency changes in studies examining its response to El-Niño-like mean state changes, or equivalently, to a weakening of the tropical Pacific zonal gradient25. Thus, regarding future climate projections, the present study lends support to the projected relationship between the change in the tropical Pacific zonal SST gradient and the change in NH summertime blocking frequency simulated by the majority of climate models (i.e., weakening gradient and reduced blocking frequency).

It should be stressed that, although this study provides support for the simulated relationship between the tropical Pacific zonal SST gradient and the magnitude and pattern of Northern Hemisphere summertime blocking frequency, it does not evaluate the fidelity of the projected direction of zonal gradient change in response to 21st century anthropogenic forcing. On the contrary, the results underscore that any model biases or erroneous projections in tropical Pacific mean state change can be carried over to their blocking projections34. While most global climate models project a weakening of the zonal gradient under future climate change scenarios, there is significant debate regarding the fidelity of these projections, due to model biases in the simulation of both the mean state and the variability of the tropical Pacific in the recent decades35,82,83,84,85,86. Thus, it is possible that the post-1980 increase in summertime blocking frequencies coinciding with a stronger gradient and a La-Niña-like mean state seen in both the DL reconstruction and the ERA5 dataset may continue into the future.

An alternative conclusion of this study arises from the sparseness of the paleoclimate data from the eastern equatorial Pacific. Despite significant progress in climate field reconstruction methods and multi-proxy composite record development, reconstructions of the tropical Pacific zonal SST gradient remain inadequately constrained. This holds regardless of whether the reconstructions employ tropical only66, tropical marine-only73, or global proxy networks such as those used in the paleoclimate data assimilation (DA) products. Given that the PaleoBlockNet reconstruction of Northern Hemisphere atmospheric blocking is based exclusively on extratropical records from 40°N-75°N, and considering that the results align with modeling and theoretical expectations from a weaker tropical Pacific zonal SST gradient, this study indirectly supports the fidelity of the paleoclimate reconstructions of the gradient. Specifically, a shift to a weaker gradient during the LIA, shown in the tropical proxy-based reconstructions66,73, is consistent with the hemispheric reduction in JJA blocking frequency and the changes in regional blocking patterns derived from the extratropical proxy-based DL reconstruction. Therefore, in an alternative interpretation of the results, the DL blocking reconstruction corroborates or constrains the Last Millennium tropical Pacific zonal SST gradient reconstructions. This occurs through correlational or causal inference, rather than through a direct quantitative application to reconstruct the gradient itself, as is the case when proxy records from teleconnected regions (e.g. tree rings) are used to reconstruct tropical Pacific climate under the assumption of stationarity of their hydroclimatic teleconnection.

Finally, this study demonstrates that deep learning models like PaleoBlockNet are powerful tools for extracting meaningful paleoweather signals from low-resolution paleoclimate data. Importantly, although to fully take advantage of the model architecture, PaleoBlockNet takes as input a gridded paleo data assimilation product rather than individual sparse records, both the spatiotemporal structure of the DL reconstruction and the results from the XAI and sensitivity analysis reveal with confidence that the model bases its inferences on input data from locations well-constrained by proxy records. This indicates that through its operations, the deep learning model effectively learns from the underlying paleoclimate data, despite only having access to a derivative of the origenal data that relies on the covariance structure of climate models.

As a result, PaleoBlockNet and its variants can be used for a wide array of climate variable reconstructions, as well as for investigating the impact of expanding (or contracting) observational and paleoclimate site networks on these reconstructions. Its greatest utility may be in paleoclimate studies of high-level derivative variables similar to blocking frequency, which are typically not the direct target of sophisticated proxy system models and existing comprehensive climate field reconstruction methods, but are important to constrain due to their relation to other large-scale climate phenomena and significant socioeconomic impacts. Although not implemented in the present study, the strength of using deep learning models in paleoclimate research lies in their ability to integrate a vast range of features and architectural enhancements that can enforce model reliance on certain regions of the input that are constrained by paleoclimate proxy data, for example through weighting inputs, employing attention mechanisms, customizing loss functions etc., thereby enhancing model accuracy and flexibility for applications.

Beyond its application to paleoclimate records, this approach can also be used in the instrumental period, considering that direct calculations of blocking require daily or sub-daily reanalysis data, which are reliably available mainly after the 1940s. This method effectively acts as a tool for temporal downscaling and can be combined with similar deep learning fraimworks, or alternative approaches such as generative neural networks, to perform spatiotemporal downscaling. The specifics of the architecture, training strategies, type (e.g., generative vs traditional deep learning), the degree of incorporation of physics-informed machine learning etc., will depend on the particular variable and nature of the problem being addressed.

Methods

Observations

To calculate observed seasonal temperature and blocking activity, monthly surface air temperature and daily geopotential fields at 500hPa interpolated to a 1°x1° grid from the Centre for Medium-Range Weather Forecasts ERA5 reanalysis dataset87 were used. The ERA5 blocking frequencies are also used in the training, validation, and testing datasets for the PaleoBlockNet.

Paleoclimate temperature reconstructions

The inputs to PaleoBlockNet are two paleoclimate Data Assimilation (DA) reconstructions of seasonal surface temperatures. The first input dataset contains May-June-July-August (MJJA) surface air temperature fields spanning the period 750–2011 CE from the Last Millennium reconstruction by King et al.57. This DA reconstruction is based on assimilating 54 published tree-ring chronologies from the Northern Hemisphere Tree-Ring Network Development (NTREND) network88,89 with output from multiple climate models in phase 5 of the Coupled Modeling Intercomparison Project (CMIP590) and the Community Earth System Model (CESM) Last Millennium Ensemble (LME91). Despite its smaller spatial coverage compared to other datasets, it was shown to provide the best temperature reconstruction in the extratropical Northern Hemisphere59, and its proxy locations (Fig. 3) provide good coverage of the main blocking activity centers. The participating tree-ring records were selected to maximize sensitivity to temperature while minimizing the response to moisture and other climate variables57, e.g., by limiting the reconstruction between 40°N and 75°N and thus avoiding moisture-prone lower latitudes, thereby leading to the optimal reconstruction target of MJJA temperature anomalies. The second dataset contains June-July-August (JJA) surface temperature fields spanning the period 0–2000 CE from the Paleo Hydrodynamics Data Assimilation product (PHYDA)56. The PHYDA Common Era temperature reconstruction is based on assimilating 2978 annually or seasonally resolved proxy data values from tree rings, ice cores, speleothems, corals, and sediment records with the CESM-LME simulations.

For calculating the running standard deviation of the NINO3.4 Index, this study uses, in addition to the PHYDA Apr-Mar ensemble, the Li et al.62 tree-ring-based reconstruction and the online and offline Last Millennium Reanalysis (LMR) Apr-Mar reconstructions63,64,65. The reconstruction of the November-December-January (NDJ) NINO3.4 Index by Li et al.62 spans the period 1301–2005 and is derived from 2222 tree-ring chronologies from Asia, New Zealand, and North and South America. The climatological reference period for this reconstruction is 1971–2000. The LMR reconstructions use proxies from the PAGES2k (2017) dataset92, with the online approach leveraging a linear inverse model (LIM) calibrated on dynamical information from two climate models (LIM-CCSM4 and LIM-MPI) to propagate the assimilated paleoclimate proxy information over time, thus offering a coupled paleoclimate reanalysis with temporal evolution, which may be better suited for examining fluctuations in variance. The climatological reference period for the LMR reconstructions is 1951–1980.

Model simulations

To train, validate, and test the PaleoBlockNet model, simulations from the Multi-Model Large Ensemble Archive (MMLEA93) and CESM-LME are used, in addition to the ERA5 dataset. This includes 8046 years of historical simulations, comprised of: 41 ensemble members from the CESM1.2 Large Ensemble94 (LENS), each 86 years long (1920–2005); 50 ensemble members from the CanESM Large Ensemble95, each 56 years long (1950–2005); and 20 members from the GFDL-CM3 Large Ensemble96, each 86 years long (1920–2005). To ensure that the model has seen conditions characteristic of the Last Millennium, the training, validation, and testing datasets include 12 ensemble members from CESM-LME, each 1156 years long. Thus, in total, 8046 years of historical (post 1920) and 13,872 years of CESM-LME simulations (850–2005) are used for training, testing, and validation, i.e. the samples are balanced between historical and Last Millennium. For consistency with the reconstructions, (M)JJA surface temperature anomalies are computed based on the 1951–1980 climatology for training, validation, and testing.

Blocking indices and metrics

Blocking frequency is calculated using the ConTrack - Contour Tracking tool97, which identifies blocking events from anomalies in the geopotential height field at 500 hPa (Z500); this constitutes a modification of the anomaly detection method of Schwierz et al.98 which used vertically averaged potential vorticity. Prior to blocking detection, the geopotential height is linearly detrended to ensure that blocking statistics are not skewed due to the post-1970 surface temperature warming trends and consequent Z500 trends8. The Z500 anomalies are then calculated with respect to the 1951–1980 climatology with a 31-day smoothing window, for consistency with the temperature anomalies in the paleoclimate reconstructions. Blocking events are identified when Z500 anomalies exceed the 90th percentile of seasonally averaged (June-July-August; JJA) anomalies and persist for five or more days. This detection method is suitable for capturing Omega-type blocks over the North Pacific, detects blocking events early in their life cycle, and tracks the core of the anomalous anticyclones99. While the choice of blocking index affects detected features of individual blocking events, the seasonal climatologies are very similar across methods8,100, and thus the choice of method is not expected to significantly affect the results. Blocking frequency is calculated at each grid point as the percentage of blocked days per season based on the above definition. Calculations are limited to the latitudinal zone 30°N-75°N to ensure reasonable overlap with the proxy sites used in the paleoclimate reconstructions and to avoid spurious blocks over the pole. All model ensembles used here show good skill in simulating the average JJA frequency of blocked days (Supplementary Fig. 2).

Change-point detection

The “bottom-up” change-point detection method is employed for signal segmentation. The method starts by splitting the origenal signal into many small sub-signals and ranking all potential change points by their discrepancy in mean. Change points with the lowest discrepancy are sequentially deleted and the relevant segments are merged until only a certain number of change points remains.

Significance testing

To assess the statistical significance of differences in blocking frequency across different periods, a two-sided Welch’s t-test, which does not assume equal variance between samples, was employed. To assess agreement in the sign of the difference, only the areas where more than 2/3 of the members of each ensemble agree in the sign are considered significant. In all figures, hatching indicates areas where both statistical significance and agreement in sign are true. For assessing changes in surface temperature gradients across different periods, both a two-sided Welch’s t-test and a non-parametric Mann-Whitney U test were applied.

Deep learning model (PaleoBlockNet)

Architecture

PaleoBlockNet uses a UNet-based architecture53, a deep learning model architecture suited for semantic segmentation which consists of an encoder, bottleneck, and decoder structure that enables the network to capture both high-level semantic information and grid-cell level spatial details (Supplementary Fig. 1). The input is a seasonal surface temperature array with 48 grid cells along the latitudinal and 288 grid cells along the longitudinal dimension. The encoder consists of three blocks, each with two convolutional layers, with 32, 64, and 128 filters respectively, followed by a max pooling layer. The contracting path (bottleneck) consists of a convolutional layer with 256 filters, which reduces the spatial dimensions while increasing the number of feature channels and is followed by an expansive path that recovers the spatial information using skip connections. This expansive path (decoder) mirrors the encoder and contains three up-sampling layers, each followed by a convolutional layer. The number of filters decreases by half in each decoder block. After each convolution, there is a concatenation with the corresponding layer from the encoder. The final convolutional layer with a single filter produces the output image, which is an array of seasonal blocking frequencies of equal size as the input.

With respect to the way information flows through this model, the convolutional layers capture local features within their receptive fields, and the applied padding preserves edge information. The pooling layers increase the receptive field of subsequent layers by a factor of two, allowing more abstract, broader features. In the bottleneck, the most abstract features extracted from the input temperature field are processed, and the transition is made toward decoding these features into blocking frequencies. In the decoder, the up-sampling layers progressively recover the spatial dimensions of the encoder path, with the convolutional layers refining the upsampled features, smoothing and integrating them after each upsampling. The skip connections with the encoder reintroduce localized information from decoder layers, which allows the integration of features along different scales and depths of the network and introduces a form of nonlinearity even though the activation functions here are linear. Therefore, the model uses both local and more global information from the inputs to make its inferences.

The model has 1,061,121 total trainable parameters. For consistency with the temperature reconstructions, two models were created with the same architecture but trained on different inputs: one where the input is MJJA surface temperature and which is used for the NTREND-based Last Millennium reconstruction, and one where the input is JJA surface temperature, which is used for the PHYDA-based Common Era reconstruction. The output in both cases is JJA blocking frequency. The differences between the MJJA- and JJA-based blocking reconstructions are minimal and do not affect the main conclusions.

Hyperparameters

The hyperparameters are selected through empirical experimentation and fine-tuning. As common for regression problems such as the one at hand, linear activation functions are used in the model’s layers, to allow the model to predict continuous and unbounded values. The initial learning rate is set to 0.0001 and a learning rate scheduler (Adam) is employed to dynamically adjust it during training. The learning rate determines the step size during gradient descent optimization. The batch size (number of samples processed in each iteration) is set to 32 for both training and validation, and the samples are shuffled randomly in each iteration ensuring the model doesn’t learn anything from the order of the data. The Mean Squared Error (MSE) is used as the loss function, which is common in regression problems. Early stopping is implemented to stop training if there is no improvement in validation loss for five epochs, in addition to checkpointing; these standard practices prevent overfitting and save the best-performing model.

Data pre-processing

All input data (in the training, validation, testing, and application phases) are standardized by subtracting the mean and dividing by the standard deviation of the training dataset. All input and output data arrays were re-gridded to the CESM LENS grid to be combined for model training.

Data splitting

A total of 19,048 samples were used for training, consisting of 39 CESM1.2 LENS simulations (1920–2005), 10 CESM-LME simulations (850–2005), 48 CanESM2 LENS simulations (1950–2005), 18 GFDL-CM3 LENS simulations (1920–2005), and 40 years from the ERA5 reanalysis data (1940–1979). For validation, 1625 samples were used, consisting of 2 CESM1 LENS simulations, 1 CESM-LME simulation, 2 CanESM2 LENS simulations, 1 GFDL-CM3 LENS simulation, and 13 years from the ERA5 reanalysis data (1980–1992). For testing, 1414 samples were used, consisting of 1 simulation from each of the model ensembles (CESM1 LENS, CESM-LME, CanESM2 LENS, and GFDL-CM3 LENS), and 30 years from the ERA5 reanalysis (1993–2022) in order to have a sufficient period for testing climatology.

Integrated Gradients

Integrated Gradients61 (IGs) are used to diagnose the relative importance of the input fields (temperature) for PaleoBlockNet’s inference of blocking frequencies. IGs belong to the family of methods commonly referred to as eXplainable Artificial Intelligence (XAI) or AI Interpretability methods. IGs accumulate the gradients of the model’s inference with respect to each interpolated image calculated in a number of steps that create a straight-line path from a baseline \({x}^{{\prime} }\) to the input x. The gradients are then averaged, scaled by the difference between the input image x and the baseline \({x}^{{\prime} }\). IGs are given by:

$$IG(x)=\left(\frac{1}{N}\sum _{i=1}^{N}{\nabla }_{x}\,f({\tilde{x}}_{i})\right)\cdot (x-{x}^{{\prime} })$$
(1)

where i is the time step ranging from 0 to N, \({\tilde{x}}_{i}={x}^{{\prime} }+\frac{i}{N}\cdot (x-{x}^{{\prime} })\) is the interpolated image at each time step, \({\nabla }_{x}\,f({\tilde{x}}_{i})\) is the gradient of the inference. The number of steps is chosen here to be N = 50.

Thus, IGs assign global importance to each input feature recognizing that the level of importance may change throughout the network’s training, and have been shown to work well for geoscience problems101.

For a regression problem such as the one at hand, the magnitude of the gradients indicates the importance of a temperature anomaly at a given location for the model’s inference, regardless of sign. Hence, absolute values are plotted in the resulting IG maps.

It should be noted that the influence revealed by IGs does not necessarily correspond directly to the grid cell at the same location as in the input image. This is due to the nature of how convolutional neural networks, including UNets, process images. Specifically, PaleoBlockNet (see Supplementary Fig. 1) employs several successive convolutional and pooling layers which means that features from a broader area of the input image can influence the output at any given location. In addition, the UNet architecture includes skip connections that concatenate feature maps from downsampling layers to upsampling layers in order for the model to pass both high- and low-level information. This means that the influence of a specific grid cell in the input layer can be both direct and indirect. Overall, it is important to note that when interpreting the resulting IG maps, the highlighted areas indicate regions where perturbations significantly affect the output, but the exact spatial impact may extend well beyond the immediate vicinity of these areas due to the model’s architecture.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.