Introduction

One of the most robust features of model-simulated climate change in response to increasing concentrations of greenhouse gases is the amplification of warming in the upper troposphere relative to the surface in the tropics. This tropospheric warming amplification with height, which is linked to moist thermodynamic processes1,2,3,4, is largely consistent with satellite observations. Regarding the magnitude, however, most model-simulated warming trends exceed satellite observations4,5,6,7,8,9, which suggests systematic model biases and errors. Given that the vertical distribution of temperature change affects the magnitude and characteristics of the lapse rate, water vapor, and cloud feedbacks, and thus global climate, resolving the model-observation discrepancies in tropical tropospheric warming trends over the satellite era is critical for enhancing the reliability and veracity of model-projected future climate change10.

As deep convection couples the surface and the free troposphere over the tropics, sea surface temperature (SST) changes in convectively active regions are likely to play a dominant role in determining the magnitude of tropospheric warming1,3,4,11. In line with this argument, the interannual relationship between tropical-mean tropospheric temperature and corresponding SST anomalies is generally consistent between coupled model simulations and observations1,4. Moreover, the model-observation discrepancies in tropospheric warming trends decrease markedly when observed time-varying SSTs are used as a boundary condition in atmosphere-only model simulations9. Considering that the observed La Niña-like SST change pattern over the satellite era is linked, in part, to internal climate variability12,13, these characteristics suggest that the model-observation discrepancies in tropospheric warming trends are to some extent due to internal climate variability4,8,9,14,15.

Despite the contribution of internal variability, the leading cause of the model-observation discrepancies could still be deficiencies in model physics10. Yet, it was not until the beginning of the 2000s that tropospheric temperature trends from coupled model simulations substantially diverged from satellite observations4. Moreover, as long-term climate monitoring was not the primary purpose of satellite missions at the outset, part of the observational errors caused by various factors, such as sensor degradation over time, differences in sensor characteristics between satellites, and satellite orbital drift16,17,18, may still remain despite a series of bias correction and intercalibration procedures6,19,20,21. Hence, we cannot rule out the possibility that previously reported model-observation discrepancies might be overestimated due to residual observational biases1.

In this study, based on the fact that SST changes in convectively active regions largely determine the pace of tropical tropospheric warming2,3,4,11,15, we compare the temporal evolution between tropical tropospheric temperature and precipitation-weighted SST (PRSST) anomalies to examine whether the previously reported model-observation discrepancies in tropical tropospheric warming trends might be overstated due to residual biases in the satellite record along with internal climate variability. In addition, we assess the robustness of the relationship between tropical tropospheric temperature and PRSST over the satellite era by analyzing the relationship over an extended period, including the pre-satellite era. Our analysis indicates that the model-observation discrepancies over the satellite era can be partially explained by multi-decadal climate variability and residual biases in the satellite record.

Results

Contribution of internal climate variability to model-observation discrepancy

We begin by comparing satellite-observed trends for annual-mean temperature of the total troposphere (TTT22,23; Methods) over the period 1979–2014 with model-simulated ensemble-mean trends. This period was chosen to span the start year of the Microwave Sounding Unit (MSU)/Advanced MSU (AMSU) records and the end year (2014) of the Coupled Model Intercomparison Project phase 6 (CMIP6)24 historical and Atmosphere Model Intercomparison Project (AMIP) experiments (Methods). Both satellite observations (e.g., the mean of the Remote Sensing Systems (RSS, version 4.0)19, the Center for Satellite Applications and Research (STAR, version 5.0)25, and the University of Alabama at Huntsville (UAH, version 6.0)26; Fig. 1a) and coupled model simulations (e.g., CESM2 Large Ensemble (CESM2 LE)27; Fig. 1b) indicate increases in tropospheric temperature over this time period. Coupled model simulations, however, distinctly overestimate satellite-observed warming trends, and moreover fail to capture the cooling or muted warming trends over the tropical central Pacific. Given that the model-simulated ensemble-mean trends largely represent the forced response, the model-observation discrepancies might be due to the impact of internal variability on the observational record. To examine this hypothesis more quantitatively, we computed tropical-mean (30°S–30°N) trends for the four satellite datasets and individual ensemble members of coupled models, along with associated statistical uncertainties28 (Fig. 1e). The satellite-observed trend over the period 1979–2014 ranges from 0.096 K decade−1 to 0.155 K decade−1 (0.096 ± 0.075 for STAR (version 5.0), 0.099 ± 0.070 for UAH (version 6.0), 0.149 ± 0.074 for the University of Washington (UW, version 1.0)29, and 0.155 ± 0.073 for RSS (version 4.0); red lines in Fig. 1e). Although the observational uncertainties are non-negligible, the median values in coupled model simulations are distinctly greater than the observed trends.
Moreover, the satellite observations fall outside the inter-ensemble range for the CESM2 LE (100 ensemble members in total; note that the forcing is not consistent across the CESM2 LE in a strict sense15,27, as described in Methods) and IPSL-CM6A-LR (33 ensemble members in total), although some ensemble members of MIROC6 (50 members in total) were able to capture the observed trends. This suggests that the model-observation discrepancy might not be explained by internal climate variability alone, in particular, if the STAR or UAH dataset is more accurate than the other two.
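
The trend uncertainties quoted above account for temporal autocorrelation28. As a rough illustration only, the following sketch estimates a least-squares trend and an approximate 95% interval using a lag-1 autocorrelation adjustment of the effective sample size. The function name, the normal-approximation factor of 1.96, and the clipping of negative autocorrelation are assumptions made for this example and are not necessarily the exact procedure used in the study.

```python
import numpy as np

def trend_with_adjusted_ci(years, values, z=1.96):
    """Return (trend per decade, approximate 95% half-width per decade).

    Assumption: residuals follow an AR(1) process, so the sample size is
    reduced to n_eff = n * (1 - r1) / (1 + r1), with r1 the lag-1
    autocorrelation of the regression residuals.
    """
    t = np.asarray(years, dtype=float)
    y = np.asarray(values, dtype=float)
    tc = t - t.mean()                      # center time for a well-conditioned fit
    slope, intercept = np.polyfit(tc, y, 1)
    resid = y - (slope * tc + intercept)

    # Lag-1 autocorrelation of the residuals (negative values clipped: assumption)
    r1 = max(np.corrcoef(resid[:-1], resid[1:])[0, 1], 0.0)
    n_eff = t.size * (1.0 - r1) / (1.0 + r1)
    if n_eff <= 2.0:
        raise ValueError("effective sample size too small for a trend estimate")

    # Residual variance with effective degrees of freedom, then slope standard error
    s2 = np.sum(resid**2) / (n_eff - 2.0)
    se = np.sqrt(s2 / np.sum(tc**2))
    return 10.0 * slope, 10.0 * z * se
```

With annual tropical-mean TTT anomalies in K, the two returned values are in K decade−1 and correspond in form to entries such as 0.096 ± 0.075 above.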

Fig. 1: Model-observation discrepancy in tropospheric warming trends over the satellite era.

a Spatial pattern of satellite-observed annual-mean tropospheric temperature (TTT) trends over the period 1979–2014 (the mean of RSS, STAR, and UAH). b Same as in a, but for the ensemble-mean trends from the coupled model simulations with CESM2 (CESM2 LE). c Same as in b, but for the AMIP experiment with CESM2. d Same as in c, but for the amip-piForcing experiment with a single ensemble member. e Observed and model-simulated tropical-mean (30°S–30°N) TTT trends over the period 1979–2014. Boxes cover the inter-quartile range of simulated trends for coupled (blue) and AMIP (green) experiments with CESM2, IPSL-CM6A-LR, and MIROC6, with the line inside the box representing the median value over the ensemble members of a given model and whiskers denoting the entire inter-ensemble range. Horizontal lines in red and purple denote, respectively, the observed TTT trends (RSS, STAR, UAH, and UW) and the simulated TTT trends from the amip-piForcing experiment with CESM2, IPSL-CM6A-LR, and MIROC6, with associated vertical lines representing the 95% confidence intervals (taking into account temporal autocorrelation28). A list of abbreviations to the right of the panel corresponds to each horizontal line in purple or red.

Despite the fact that the imposed external forcing is identical within a given large ensemble, model-simulated trends exhibit a pronounced inter-ensemble spread in a given model, especially, the CESM2 LE and MIROC6. Therefore, based on the linkage between SST in convectively active regions and tropical tropospheric temperature2,3,4, we examine whether the spatial pattern of SST trends is substantially different between the ensemble members with larger tropical-mean TTT trends and those with smaller trends by regressing simulated SST trends against the corresponding tropical-mean TTT trends across the ensemble members. Although there are differences in the ensemble size among the CESM2 LE (100 members), IPSL-CM6A-LR (33 members), and MIROC6 (50 members), all three large ensembles exhibit a broadly similar spatial pattern over the Pacific, which bears resemblance to the spatial pattern of the Inter-decadal Pacific Oscillation (Supplementary Fig. 1). In addition, corresponding regression slopes of outgoing longwave radiation trends against tropical-mean TTT trends indicate a pronounced shift in convection centers over the tropical Pacific in association with internal climate variability (Supplementary Fig. 2). These results imply that tropical tropospheric temperature trends resulting from greenhouse gas forcing can be further enhanced or dampened due to tropical Pacific variability, in agreement with previous studies9,14.
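
The across-ensemble regression described above can be sketched as follows. This is a minimal illustration, assuming that the SST trend maps are stored as an array of shape (member, lat, lon) and the tropical-mean TTT trends as an array of shape (member,); the function and argument names are hypothetical.

```python
import numpy as np

def regress_across_members(sst_trends, ttt_trends):
    """Regress local SST trends on tropical-mean TTT trends across members.

    sst_trends: array (member, lat, lon) of SST trends, one map per member.
    ttt_trends: array (member,) of tropical-mean TTT trends.
    Returns an array (lat, lon) of regression slopes, i.e., the change in
    local SST trend per unit change in tropical-mean TTT trend.
    """
    x = np.asarray(ttt_trends, dtype=float)
    y = np.asarray(sst_trends, dtype=float)
    xa = x - x.mean()              # deviations from the ensemble mean
    ya = y - y.mean(axis=0)
    # Ordinary least-squares slope at every grid point in one contraction
    return np.tensordot(xa, ya, axes=(0, 0)) / np.sum(xa**2)
```

Because the ensemble mean is removed from both variables, the resulting map isolates the pattern associated with internal variability rather than the forced response.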

Given that the spatial pattern seen in Supplementary Fig. 1 appears to be linked to the Inter-decadal Pacific Oscillation (IPO) or Pacific Decadal Oscillation (PDO), we further explore the potential role of eastern tropical Pacific SST variability on the model-observation discrepancies in TTT trends by comparing the pacemaker coupled model simulation over the period 1979–2014, in which the model-simulated SST was restored to the observed SST anomaly in the eastern tropical Pacific (i.e., the CMIP6 dcppC-pac-pacemaker30), with a corresponding historical experiment (Methods). The ensemble-mean TTT trends for IPSL-CM6A-LR under the dcppC-pac-pacemaker scenario exhibit weaker tropical warming compared to the corresponding historical experiment, with the least warming found over the tropical Pacific (Supplementary Fig. 3): The ensemble-mean tropical-mean TTT trend and corresponding two standard deviations across the ensemble members are 0.196 ± 0.040 K decade−1 for the pacemaker experiment and 0.290 ± 0.071 K decade−1 for the historical experiment. This result is in line with previous studies suggesting that the IPO or PDO contributed to weakened tropospheric warming in satellite observations relative to coupled model simulations4,8,9,14,15, and to other aspects of observed multi-decadal climate change31,32,33,34.

Consistent with the large difference in TTT trends between the historical and dcppC-pac-pacemaker experiments, both model-observation discrepancies and inter-model spread are markedly reduced when observed time-varying SSTs are prescribed as a boundary condition in atmosphere-only model simulations (Fig. 1c). Although the ensemble size is different between the coupled and AMIP simulations (Supplementary Table 1), the inter-ensemble spread is also substantially reduced (Fig. 1e). However, the model-simulated TTT trends are still larger than the satellite observations even when the observed time-varying SSTs were taken into account (Fig. 1e). Given potential errors in imposed external forcings in model simulations, especially regarding volcanic eruptions35 and biomass burning15, the discrepancy in TTT trends between AMIP simulations and satellite observations might have been caused by potential errors in prescribed external forcings. Hence, in the analysis, we also used atmosphere-only model simulations under the CMIP6 amip-piForcing scenario, which is the same as the AMIP but with pre-industrial external forcing. The simulated tropical-mean TTT trends (Fig. 1d) are, however, still greater than the satellite observations (purple vs. red lines in Fig. 1e), which suggests the possibility that both model and observational deficiencies contribute to the difference.

Relationship between tropical TTT and PRSST

Tropical-mean TTT trends are compared with corresponding trends for PRSST3,4 (Methods) to determine the main cause of overestimated warming in the model simulations. Due to uncertainties in the SST boundary conditions3,9 (e.g., differences between the Extended Reconstructed Sea Surface Temperature version 5 (ERSST5)36 and the Hadley Centre Sea Ice and Sea Surface Temperature (HadISST)37) and precipitation fields, as well as discrepancies in the MSU/AMSU record between satellite datasets, the ratio of the tropical-mean TTT trend over the period 1979–2022 to the corresponding PRSST trend is not well constrained in observations (red, purple, green, and orange lines in Fig. 2a): the observation-estimated ratio ranges from 0.84 to 1.56 (1.16 for RSS, 0.88 for STAR, 0.84 for UAH, and 1.09 for UW with ERSST5 and the Global Precipitation Climatology Project (GPCP)38; 1.45 for RSS, 1.10 for STAR, 1.05 for UAH, and 1.37 for UW with HadISST and GPCP; 1.25 for RSS, 0.95 for STAR, 0.91 for UAH, and 1.18 for UW with ERSST5 and the CPC Merged Analysis of Precipitation (CMAP)39; 1.56 for RSS, 1.18 for STAR, 1.13 for UAH, and 1.47 for UW with HadISST and CMAP). In comparison, the ratio of the trends is distinctly larger for model simulations; more specifically, the ratio approximately ranges from 1.5 to 1.9 for the CESM2 LE, IPSL-CM6A-LR, and MIROC6, with the median value of ~1.7 for all three models (the ensemble-mean and corresponding two standard deviations across the ensemble members are 1.68 ± 0.14 for the CESM2 LE, 1.65 ± 0.07 for IPSL-CM6A-LR, and 1.69 ± 0.17 for MIROC6). This discrepancy in the ratio of the trends between model simulations and satellite observations, along with SST observations, might additionally indicate model biases. However, the ratio of the trends is close to unity (1.0) for observations, which is contradictory to the warming amplification with height in the tropical troposphere expected from moist thermodynamic processes. 
Interestingly, as noted in ref. 4, the regression slope of the detrended TTT anomaly against the detrended PRSST anomaly ranges from 1.5 to 1.8 for observations, which is quantitatively more comparable to model simulations (Fig. 2b). This broad model-observation agreement on the interannual PRSST-TTT relationship thus leads to a question on the veracity of the satellite-observed TTT trends.
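
The two diagnostics contrasted above, the ratio of linear trends and the regression slope of detrended anomalies, can be written compactly. The sketch below assumes annual tropical-mean time series as one-dimensional arrays; all function names are illustrative rather than taken from the study.

```python
import numpy as np

def _fit(t, y):
    """Least-squares line through (t, y); returns slope and fitted values."""
    tc = np.asarray(t, dtype=float) - np.mean(t)
    slope, intercept = np.polyfit(tc, np.asarray(y, dtype=float), 1)
    return slope, slope * tc + intercept

def detrend(t, y):
    """Remove the least-squares linear trend from y."""
    return np.asarray(y, dtype=float) - _fit(t, y)[1]

def trend_ratio(t, ttt, prsst):
    """Ratio of the TTT trend to the PRSST trend (as in Fig. 2a)."""
    return _fit(t, ttt)[0] / _fit(t, prsst)[0]

def interannual_slope(t, ttt, prsst):
    """Regression slope of detrended TTT on detrended PRSST (as in Fig. 2b)."""
    x = detrend(t, prsst)
    y = detrend(t, ttt)
    return np.sum(x * y) / np.sum(x * x)
```

A series whose long-term trends are equal but whose year-to-year fluctuations are amplified aloft yields a trend ratio near 1 and an interannual slope well above 1, which is the kind of mismatch discussed in the text.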

Fig. 2: Relations between tropical-mean (30°S–30°N), annual-mean tropospheric temperature and precipitation-weighted SST.

a Ratio of the tropospheric temperature (TTT) trend over the period 1979–2022 to corresponding precipitation-weighted SST (PRSST) trend for the CESM2 LE, IPSL-CM6A-LR, and MIROC6. Boxes cover the inter-quartile range, with the line inside the box representing the median value over the ensemble members of a given model and whiskers denoting the entire inter-ensemble range. Horizontal lines in red, purple, green, and orange denote the corresponding ratios estimated from satellite observations of TTT (RSS, STAR, UAH, and UW) and precipitation (GPCP or CMAP) in conjunction with SST from ERSST5 or HadISST: red for GPCP and ERSST5, purple for GPCP and HadISST, green for CMAP and ERSST5, and orange for CMAP and HadISST. b Similar to a, but for the regression slope of detrended TTT against detrended PRSST over the same period.

Figure 1e implies that the discrepancies in tropical-mean TTT anomalies between AMIP simulations and satellite observations are likely to gradually increase over time in response to increasing surface warming. Comparisons of the temporal evolution of the tropical-mean TTT anomaly between the ensemble mean for MIROC6 AMIP simulations and satellite observations, however, reveal a discontinuity in the difference time series between the pre-2000 period and the post-2000 period: while the difference remained roughly zero over the pre-2000 period, it has increased abruptly from the late 1990s onward (Fig. 3a). This discontinuity is evident for all satellite datasets when compared to the ensemble mean for MIROC6 AMIP simulations, albeit with the extent varying among the satellite datasets. In contrast, the difference with other models does not show a similar discontinuity (Fig. 3a), which suggests that a spurious jump might exist in the satellite record. However, the inferred potential discontinuity in the satellite record between the pre-2000 and post-2000 periods might be an artifact resulting from biases and errors in AMIP simulations. Thus, we also examine differences between the satellite datasets in the tropical-mean TTT anomalies. As shown in Fig. 3b, the temporal evolution of the tropical-mean TTT anomaly is broadly consistent between the RSS and UW datasets. In contrast, the differences of the UAH and STAR datasets with respect to RSS exhibit a distinct jump between the pre-2000 period and the post-2000 period, supporting the spurious jump in the satellite record inferred from the comparison with AMIP simulations.

Fig. 3: Potential residual biases in the satellite-observed tropospheric temperature trends.

a Differences between the ensemble mean for MIROC6 AMIP simulations and the corresponding ensemble mean for other models (CESM2 and IPSL-CM6A-LR) or satellite observations (RSS, STAR, UAH, and UW) in the tropical-mean (30°S–30°N), annual-mean TTT anomaly with respect to the 1979–1988 climatology. b Similar to a, but for differences between the RSS and other satellite datasets.

The ratio of the model-simulated tropical-mean TTT trend to PRSST trend roughly ranges from 1.5 to 1.9 with a median value of ~1.7 for the CESM2 LE, IPSL-CM6A-LR, and MIROC6 (Fig. 2a). While the observations exhibit noticeably smaller values for the ratio of the trends, the corresponding regression slope of the detrended TTT anomaly against the detrended PRSST anomaly exhibits a broadly similar range to model simulations (Fig. 2). Hence, assuming that the tropical-mean TTT increases 1.5 to 1.9 times as fast as the PRSST, we roughly estimated the temporal evolution of the TTT anomaly relative to the 1979–1988 climatology from the observed PRSST anomaly over the period 1979–2022 and compared it with the satellite-observed TTT in order to further explore the possibility that residual biases may exist in the satellite record. Supplementary Fig. 4a, b shows the temporal evolution of observed tropical-mean TTT anomalies and PRSST anomalies, respectively. Despite substantial year-to-year variability, both TTT and PRSST anomalies exhibit a distinct warming trend over the satellite era. Note also that the ECMWF Reanalysis v5 (ERA5)40 shows a temporal evolution similar to that of the observations. Next, considering potential observational uncertainties in both TTT and PRSST, we quantified the differences between the predicted and observed TTT anomalies for various combinations of TTT and PRSST. Although there is uncertainty in the SST boundary conditions3,9, which can also affect model-satellite agreement, Supplementary Fig. 4c exhibits a potential discontinuity in time series between the pre-2000 period and the post-2000 period. Note that if the true ratio is close to unity, the difference between the predicted and observed TTT anomalies would exhibit a continuous temporal evolution, similar to that for PRSST, unlike that seen in Supplementary Fig. 4c. The discontinuity evident in Supplementary Fig. 4c thus appears to indicate potential residual biases in the observational record over the satellite era, consistent with Fig. 3.
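
The prediction step described above amounts to scaling the PRSST anomaly by an assumed amplification ratio and subtracting the observed TTT anomaly. The following sketch assumes annual tropical-mean series as one-dimensional arrays and uses the 1979–1988 base period stated in the text; the function name and the default set of ratios are choices made for this illustration.

```python
import numpy as np

def predicted_minus_observed(years, prsst, ttt,
                             ratios=(1.5, 1.7, 1.9), base=(1979, 1988)):
    """Difference between ratio-scaled PRSST anomalies and TTT anomalies.

    Anomalies are taken relative to the mean over the base period
    (inclusive). Returns a dict mapping each ratio to a difference series.
    """
    years = np.asarray(years)
    prsst = np.asarray(prsst, dtype=float)
    ttt = np.asarray(ttt, dtype=float)
    in_base = (years >= base[0]) & (years <= base[1])
    prsst_anom = prsst - prsst[in_base].mean()
    ttt_anom = ttt - ttt[in_base].mean()
    # Predicted TTT anomaly is ratio * PRSST anomaly
    return {r: r * prsst_anom - ttt_anom for r in ratios}
```

If the assumed ratio is appropriate and the TTT record is homogeneous, the difference series should hover near zero; a step change in the difference, rather than a smooth drift resembling PRSST, is the signature interpreted in the text as a possible residual bias.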

Model-observation comparisons over the extended period

If deficiencies in the model physics were the main cause of the model-observation discrepancy in TTT trends over the satellite era, we would expect the discrepancy to further increase with an increasing time span. Hence, model-simulated trends are compared with reanalysis-produced trends over the period 1950–2022. Figure 4a shows the tropical-mean TTT anomalies for ERA5 relative to the 1950–1959 climatology, along with the corresponding ensemble-mean anomalies from coupled model simulations. Although there are some discrepancies between ERA5 and model simulations, especially in the recent decade, the ERA5 anomalies are mostly within the inter-ensemble range of the CESM2 LE. Differences in PRSST anomalies between ERA5 and model simulations are also modest except for the recent decade; in particular, the ensemble-mean PRSST anomaly for MIROC6 is quantitatively comparable to ERA5 over the whole analysis period (Fig. 4b). Using a range of ratio values (i.e., 1.5–1.9), we predicted TTT anomalies from PRSST anomalies and then compared them with simulated TTT values (Fig. 4c). For coupled model simulations, the ensemble-mean differences between the TTT anomalies predicted with a ratio of 1.7 and the simulated TTT anomalies are mostly close to zero over the whole period, suggesting that the assumed ratio of 1.7 is a reasonable value for model simulations. In comparison, the differences are much greater for ERA5. In ERA5, however, the difference time series does not appear to bear a resemblance to the PRSST time series, which suggests that the true value is unlikely to substantially deviate from the ratio values ranging from 1.5 to 1.9. Interestingly, a noticeable discontinuity is found in the difference time series for ERA5: while the predicted TTT anomalies distinctly underestimate the simulated TTT anomalies over the period of ~1980–2000, the differences are comparatively negligible over the pre-1970 and post-2000 periods.
In contrast, although the differences are distinctly different from zero, a similar discontinuity is not present in the case of the Twentieth Century Reanalysis version 3 (20CR V3)41, which assimilated only the surface pressure and SST observations (khaki line and yellow shading in Fig. 4c). This discrepancy between ERA5 and 20CR V3 thus appears to suggest potential residual biases in the satellite record.

Fig. 4: Potential discontinuity in observed tropospheric temperature time series.

a Temporal evolution of the tropical-mean (30°S–30°N), annual-mean tropospheric temperature (TTT) anomaly with respect to the 1950–1959 climatology for the CESM2 LE, IPSL-CM6A-LR, MIROC6, ERA5, and 20CR V3. For model simulations, lines denote the ensemble-mean anomalies with gray shading denoting two standard deviations across the ensemble members of the CESM2 LE. b Same as in a, but for precipitation-weighted SST (PRSST) anomaly. c Differences between PRSST anomaly multiplied by a range of ratio values (i.e., 1.5–1.9) and TTT anomaly for the CESM2 LE, IPSL-CM6A-LR, MIROC6, ERA5, and 20CR V3. The solid lines denote the differences with the ratio value of 1.7, with associated shading representing the entire range. Supplementary Fig. 5 presents the tropical-mean TTT and PRSST trends for ERA5 and 20CR V3, along with model simulations.

Finally, we examine the ratio of the tropical-mean TTT trend to the PRSST trend over the extended period to determine whether the weaker warming amplification in observations relative to model simulations over the satellite era is a robust feature. For coupled model simulations, there is a substantial inter-ensemble spread in trends for both TTT and PRSST; however, the ratio is approximately 1.7 for all three coupled model simulations (Supplementary Fig. 5). In addition, while most ensemble members of the CESM2 LE and IPSL-CM6A-LR exhibit greater tropospheric warming relative to ERA5 (0.151 ± 0.025 K decade−1 over 1950–2022) and 20CR V3 (0.171 ± 0.031 K decade−1 over 1950–2015), the opposite is true for MIROC6. This TTT trend distribution is thus at odds with the distinct model-observation discrepancies over the satellite era. The PRSST trends for the reanalysis datasets (0.080 ± 0.017 K decade−1 for ERA5 over 1950–2022 and 0.104 ± 0.017 K decade−1 for 20CR V3 over 1950–2015) also fall within the range of model-simulated trends. Furthermore, the ratio of the tropical-mean TTT trend to the PRSST trend for the reanalysis datasets is substantially greater than unity (i.e., 1.89 for ERA5 and 1.64 for 20CR V3) and thus broadly aligns with the model simulations. Due to uncertainties in SSTs and precipitation over the pre-satellite era, along with a potential discontinuity in the MSU/AMSU record, it might not be possible to accurately quantify the tropical tropospheric warming amplification relative to the SST warming. However, the similarity between reanalyses and model simulations over the extended period suggests greater tropical tropospheric warming than inferred from satellite observations.

Summary and discussion

In this study, we attempted to elucidate the causes of discrepancies in tropical tropospheric temperature trends between model simulations and satellite observations by conducting a comprehensive analysis of a series of model simulations and reanalysis datasets along with satellite observations. In agreement with previous studies emphasizing the role of internal variability in the model-observation discrepancies4,8,9,14,15, we found that multi-decadal tropical Pacific SST variability, which is closely linked to internal climate variability12,13 such as the IPO or PDO, has contributed to reduced tropospheric warming in satellite observations compared to model simulations. This is demonstrated in Supplementary Fig. 3, which examines the contribution of the positive-to-negative phase shift of the IPO to the smaller tropospheric warming in satellite observations. Given that other aspects of observed multi-decadal changes are also linked, at least in part, to internal variability31,32,33,34, this result suggests that multi-decadal climate variability needs to be taken into account when assessing model performance against observations9.

The CMIP6 AMIP simulations, however, exhibited greater tropospheric warming than satellite observations, which suggests deficiencies in the model physics and/or potential residual biases in the observational record. Hence, based on the finding that SST changes in convectively active tropical regions play a dominant role in determining warming in the tropical troposphere3,4,11, we compared the TTT predicted from PRSST with corresponding satellite-observed or model-simulated TTT. Although uncertainties in SSTs and precipitation affect the accuracy of PRSST indices9,11, the comparison revealed potential residual biases in the satellite record. More specifically, the differences between the predicted and observed tropical-mean TTT time series exhibited a potential discontinuity between the pre-2000 period and the post-2000 period. Note that differences between the satellite datasets exhibit a similar discontinuity (Fig. 3b). Considering that MSU-based observations ended in 2005 (NOAA-14 satellite) while the operation of AMSU began in 1998 (NOAA-15 satellite), biases due to differences in sensor characteristics between MSU and AMSU may still remain despite various bias correction and intercalibration procedures, thus contributing to the model-observation discrepancies. In addition, while the observation-based ratio of the tropical-mean TTT trend to the PRSST trend is close to unity over the satellite era, reanalysis-estimated ratios exhibit much greater values over the period 1950–2022, which is similar to the coupled model simulations. These results, therefore, suggest that not only model deficiencies10,42 and forcing uncertainties15,35 but also residual biases in the satellite record are responsible for the model-observation discrepancies in tropospheric warming trends over the satellite era. 
Hence, the space-based fundamental climate data record may need to be further improved43 through recalibration and reprocessing activities for more robust and accurate climate monitoring. Given potential uncertainties in the PRSST, sustaining accurate, long-term in situ aircraft measurements is expected to help evaluate and constrain the magnitude of tropospheric warming amplification44.

In this study, we have attributed the model-observation discrepancies in the tropical-mean TTT trends over the satellite era mainly to internal climate variability and residual biases in the satellite record, apart from potential model biases and deficiencies. However, it is possible that other factors, such as aerosol concentrations, large-scale atmospheric circulation patterns, and land-use changes, might have also contributed to the discrepancies. Hence, although it is beyond the scope of the current study, an in-depth analysis of the role of those factors is likely to help further reduce the model-observation discrepancies. In addition, the broad model-reanalysis agreement on the TTT-PRSST relationship over the period 1950–2022 but large inter-model differences in PRSST trends (Supplementary Fig. 5) suggests that reducing uncertainties in model-projected tropical tropospheric warming is contingent, in part, upon improving the representation of processes controlling tropical SSTs, especially in the tropical Pacific.

Methods

Observational datasets

Brightness temperatures at microwave frequencies located in the 60-GHz oxygen absorption band, measured from the MSUs and their follow-on, AMSUs, have been used to estimate the temperatures of the mid-troposphere (TMT), lower troposphere (TLT), and lower stratosphere (TLS). The Remote Sensing Systems (RSS, version 4.0)19, the Center for Satellite Applications and Research (STAR, version 5.0)25, the University of Alabama at Huntsville (UAH, version 6.0)26, and the University of Washington (UW, version 1.0)29 independently constructed a bias-corrected, intercalibrated brightness temperature dataset over the period of the late 1970s to the present by reprocessing measurements from the MSU/AMSU sensors onboard a series of the National Oceanic and Atmospheric Administration (NOAA) operational polar-orbiting satellites. It is noted that there are substantial structural uncertainties in these satellite records (e.g., large changes between STAR version 4.1 and version 5.0 (ref. 25)). In this study, we analyzed the change and variability in the satellite-estimated TTT over the period 1979–2022, which was created by linearly combining TMT and TLS records to remove the influence of stratospheric temperature changes22,23. As the UW dataset does not include the TLS data, the RSS TLS data were employed to remove the stratospheric contribution.
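
The linear combination of TMT and TLS mentioned above can be illustrated as follows. The weights are not stated in this text; the defaults used here (1.1 for TMT and −0.1 for TLS) are the values commonly quoted for the tropics in the literature on this method and are an assumption of this sketch, as is the function name.

```python
import numpy as np

def combine_ttt(tmt, tls, w_tmt=1.1, w_tls=-0.1):
    """Combine TMT and TLS into a total-troposphere temperature (TTT).

    The default weights are assumed values for the tropics; they should be
    replaced by the coefficients of the method actually being followed.
    The negative TLS weight offsets the stratospheric contribution to TMT.
    """
    return w_tmt * np.asarray(tmt, dtype=float) + w_tls * np.asarray(tls, dtype=float)
```

For example, a TMT anomaly of 0.1 K together with a TLS anomaly of −0.3 K gives a TTT anomaly of 0.14 K with these weights, showing how stratospheric cooling would otherwise mask part of the tropospheric warming in TMT.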

Model-observation discrepancies in tropical tropospheric warming trends over the satellite era may have resulted mainly from deficiencies in model physics. If this attribution is correct, the discrepancies are likely to become larger and more evident with an increasing time span due to increases in greenhouse gas concentrations. To assess this hypothesis, we analyzed the temporal evolution of the TTT over an extended period, 1950–2022, using the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) dataset40. In addition to ERA5, the NOAA/CIRES/DOE Twentieth Century Reanalysis version 3 (20CR V3)41 was analyzed over the period 1950–2015. Note that, unlike ERA5, 20CR V3 assimilates only the surface pressure and SST observations, excluding both radiosonde and satellite observations. To be consistent with satellite observations, reanalysis-produced temperature and humidity profiles along with accompanying surface variables were inserted into a fast radiative transfer model (i.e., RTTOV v12)45 to simulate synthetic brightness temperatures that satellites would observe under given surface and atmospheric conditions.

Model simulation output

The satellite-observed tropical TTT trends and variability and their relations with PRSSTs were compared with those for coupled model simulations. Based on the argument that the model-observation discrepancies in tropospheric warming trends over the satellite era can be attributed, at least in part, to internal climate variability4,8,9,15, the Community Earth System Model (CESM) version 2 (CESM2)46 Large Ensemble (CESM2 LE) simulations27 were used in this study, in which model simulations were repeated 100 times with different initial conditions but under the same external forcing (i.e., the historical forcing over the period 1850–2014 and the Shared Socioeconomic Pathway (SSP) forcing scenario SSP3-7.0 over the period 2015–2100 employed in Coupled Model Intercomparison Project phase 6 (CMIP6)24). It is noted that the CESM2 LE consists of two 50-member subsets, i.e., the version with normal CMIP6 forcing and the other with smoothed biomass burning emissions27, indicating that the forcing is not consistent across the CESM2 LE in a strict sense15. Given the large inter-model spread, coupled model simulation output for MIROC647 and IPSL-CM6A-LR48 was analyzed together with the CESM2 LE. We also used corresponding CMIP6 atmosphere-only model simulations integrated with prescribed historical SSTs and external forcings, i.e., Atmosphere Model Intercomparison Project (AMIP), to examine the role of observed SST changes in the model-observation discrepancies in TTT trends over the period 1979–2014. In addition, the CMIP6 amip-piForcing experiment, which is the same as the AMIP but with pre-industrial forcing, was used to explore the impact of potential uncertainties in imposed external forcings. As in reanalysis datasets, synthetic brightness temperatures that satellites would observe under given surface and atmospheric conditions were computed by inserting model-simulated fields into the fast radiative transfer model (i.e., RTTOV v12).

We also analyzed the CMIP6 Decadal Climate Prediction Project (DCPP) pacemaker coupled model simulation output for IPSL-CM6A-LR, in which the model-simulated SST was restored to an observed SST anomaly in the eastern tropical Pacific (i.e., dcppC-pac-pacemaker)30. Given that the imposed external forcing is identical between the pacemaker and corresponding historical experiments, differences between the two experiments may be largely due to observed SST variability in the eastern tropical Pacific. Supplementary Table 1 provides information on the ensemble size of the model simulations analyzed in this study.

PRSST index

As a simple index representing SST change and variability in the tropics with more weight over convectively active regions, PRSST time series were constructed following refs. 3 and 4,

$$PRSST=\frac{\overline{SST\cdot P}}{\bar{P}}$$

where SST and P denote, respectively, the annual mean SST and precipitation, and the overbar indicates the tropical (30°S–30°N) average. For observation-based PRSST indices, the Global Precipitation Climatology Project (GPCP) (version 2.3)38 and CPC Merged Analysis for Precipitation (CMAP) (version V2404)39 precipitation data were used in conjunction with the SST from either NOAA’s ERSST536 or the UK Met Office HadISST dataset37. In estimating observation-based PRSST indices, GPCP and CMAP precipitation data were interpolated to ERSST5 and HadISST grids. For reanalysis datasets and model simulations, reanalysis-produced and model-simulated fields were used to compute the PRSST indices.
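
As a minimal sketch of the PRSST definition above, the function below computes the index for a single annual-mean field. It assumes that SST and precipitation share a regular latitude-longitude grid, that land points are marked as NaN in the SST field, and that the tropical average is weighted by the cosine of latitude; the area weighting and all names are assumptions of this example and are not stated in the text.

```python
import numpy as np

def prsst_index(sst, precip, lat, lat_bounds=(-30.0, 30.0)):
    """Precipitation-weighted SST: mean(SST * P) / mean(P) over the tropics.

    sst, precip: arrays (lat, lon) of annual means on the same grid.
    lat: array (lat,) of latitudes in degrees.
    Grid points with non-finite SST or precipitation are excluded.
    """
    sst = np.asarray(sst, dtype=float)
    precip = np.asarray(precip, dtype=float)
    lat = np.asarray(lat, dtype=float)

    in_tropics = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    valid = in_tropics[:, None] & np.isfinite(sst) & np.isfinite(precip)

    # cos(latitude) area weights (assumption), zeroed outside the valid domain
    w = np.where(valid, np.cos(np.deg2rad(lat))[:, None], 0.0)
    sst0 = np.where(valid, sst, 0.0)
    p0 = np.where(valid, precip, 0.0)

    # The normalization of the area average cancels between numerator and denominator
    return np.sum(w * sst0 * p0) / np.sum(w * p0)
```

Applying the function year by year to the SST and precipitation fields gives a PRSST time series in which heavily precipitating, convectively active regions dominate the average.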