1 Introduction

In the context of global warming, detection and attribution of climate change have become an important issue for better adaptation and mitigation strategies. Paleoclimatology, which can provide a background of the natural variability, has received much attention in the past decades (e.g., Masson-Delmotte et al. 2013 and references therein). Especially on large time scales (e.g., decadal, multi-decadal, centennial, etc.), reconstructions/simulations have been used as important references for better understanding the current and projected climate changes (Cook et al. 2010; Neukom et al. 2019). Tree-ring width (TRW) data are among the most widely used proxies for reconstructing annually resolved past climate. They have been used for reconstructions of temperature, precipitation, drought, streamflow, etc. (e.g., Cook et al. 1999; Esper et al. 2002; Cook et al. 2013; St. George and Ault 2014; Shi et al. 2017; Liu et al. 2017; Harley et al. 2017; Pearl et al. 2020; Ljungqvist et al. 2020), and important historical events, such as the severe drought in the American Southwest in the late 1200s, have been revealed.

One key challenge using TRW is that they include non-climatic signals (Franke et al. 2013; Christiansen and Ljungqvist 2017) that need to be filtered out before chronologies and reconstructions can be properly determined. After the TRW data is measured and crossdated, a very important procedure is to detrend or standardize the tree-ring widths, such as removing the age-related trends in the radial growth rate (Sheppard 2010), or the potential low frequency variations caused by the stand dynamics and competitions in closed canopy forests (Cook and Peters 1981), etc. Various detrending methods such as the Curve Fitting Standardization (CFS) methods (i.e., fitting the tree growth trend using various models such as the negative exponential curve, the smoothing spline curve (Cook and Peters 1981), the ensemble empirical mode decomposition curve (Zhang and Chen 2017), etc.), the Regional Curve Standardization (RCS) methods (Briffa et al. 1992; Esper et al. 2002), environmental curve standardization method (Helama et al. 2005), etc., have been proposed, and the detrending methods are continuously improved up to present (Shi et al. 2020). When detrending/standardizing the tree-ring widths, a noteworthy procedure is the use of the first-order auto-regressive (AR1) or autoregressive moving average model (ARMA) as pre-whitening methods to remove the short-term persistence in tree-ring widths due to the biological carry-over effect of trees (Cook 1985). All these efforts have pushed the development of dendroclimatology, which has been playing a substantial role in the studies of climate change.

However, recent studies have indicated that the TRW chronologies/reconstructions still suffer from overestimated persistence that may arise from various environmental processes (e.g., the integrative behavior of soil moisture) (Bunde et al. 2013; Franke et al. 2013; Büntgen et al. 2015). For example, the tree-ring based precipitation reconstructions from North America and Central Europe have been found to have much stronger persistence than those in observations and dynamical model simulations, in which case unexpected more (less) rainfall is reconstructed after wet (dry) episodes (Bunde et al. 2013), leading to prolonged flood (drought) events that may be artificial (Zhang et al. 2015) (see also Fig. 1 in this study). Recent studies further suggest that the discrepancy is due to the fact that the TRW proxies normally reflect soil moisture or runoff, etc., not rainfall directly (PAGES Hydro2k Consortium 2017). Temperature reconstructions based on TRW records have similar problems (Zhang et al. 2015), and the consequent intensive low-frequency variability may be even comparable with the recent warming trend (Ludescher et al. 2020). After strong tropical volcanic eruptions, TRW data show suppressed post-volcanic cooling effects (Esper et al. 2015), and the discrepancies of the volcanic eruption effects between the model simulations and the TRW reconstructions can be resolved by artificially enhancing the persistence of the model simulations (Lücke et al. 2019). These issues point to the important “known unknown” in TRW studies, that is how to properly remove the non-climatic influences, and isolate the desired climatic signals (e.g., PAGES Hydro2k Consortium 2017 and references therein).

Fig. 1

Comparison of time series with weak (a) and strong (b) long-term persistence (LTP). The gray bars in a represent the reconstructed precipitation record at a southeastern grid point. According to the DFA, this time series has weak temporal persistence with \(\alpha\) = 0.55. By applying the Fourier filtering technique (Turcotte 1997), artificial data with strong LTP (\(\alpha\) = 0.80) are generated (the green bars in b). The 31-year moving averages (the thicker black and green curves in a and b) reveal apparent differences between the two time series; see the red and blue shaded areas. If the LTP is exaggerated, one may overestimate the historical long-term wet/drought events.

One potential explanation for the overestimated persistence in TRW based reconstructions is that the TRW chronologies carry redundant long-term persistence (LTP). Different from the well-known short-term persistence, which decays exponentially with time and can be simulated using AR or ARMA models, LTP describes scaling behaviors that may lead to persistence on much longer time scales ranging from a few months to several decades (e.g., Franzke et al. 2020 and references therein). In recent years, with the development of several advanced approaches such as wavelet analysis (Arneodo et al. 1995; Abry and Veitch 1998), detrended fluctuation analysis (Peng et al. 1994; Kantelhardt et al. 2001), the structure function method (Lovejoy and Schertzer 2012), etc., LTP has been found to be ubiquitous in the climate system. Many variables including temperature, precipitation, relative humidity, atmospheric circulation, soil moisture, etc. have been found to have LTP on local (e.g., in-situ records) and large (e.g., hemispheric, global) scales (Chen et al. 2007; Vyushin and Kushner 2009; Wang et al. 2010; Jiang et al. 2017; Fredriksen and Rypdal 2017). LTP can induce large-scale variabilities, and the stronger the LTP is, the more prominent the large-scale variations (in other words, the low-frequency variability) will be (Lennartz and Bunde 2009; Zhu et al. 2010). Considering that the TRW proxies may be affected by various environmental factors (such as soil moisture) that have LTP, the TRW based reconstructions indeed run a risk of overestimating the LTP. Therefore, the LTP is a useful test bed to evaluate paleoclimatological reconstructions (Bunde et al. 2013; Zhang et al. 2015). An overestimated LTP in the reconstructions may lead to enhanced low-frequency variability and, further, to unrealistic judgements of long-term anomalous climate events (Fig. 1). When reconstructing the past climate using TRW data, besides addressing the “short-term” persistence with, e.g., an AR model (Cook et al. 1999), novel approaches are thus required to deal with the persistence on longer time scales.

In this study, the Fractional Integral Statistical Model (FISM) (Yuan et al. 2014) is applied to investigate the potentially overestimated LTP and the corresponding low-frequency variability in TRW based hydroclimatic reconstructions. FISM is a generalized version of the stochastic climate model (SCM). Ever since the classical literature by Hasselmann (1976), it has been suggested that the slow varying processes in the climate system can be regarded as accumulative responses to continual excitations by short-term disturbances. This relation can be described in terms of fractional integral (Yuan et al. 2013), and FISM is thus able to simulate and quantify the long-term persistence associated signals. As discussed in (Yuan et al. 2014), a given time series with LTP can be divided by FISM into two parts,

$$\begin{aligned} x(t)=M(t)+\varepsilon (t), \end{aligned}$$
(1)

where M(t) stands for the long-term influences accumulated from the past (hereafter, we name it as the LTP signals), and \(\varepsilon (t)\) the short-term “forcing” signals. Suppose the year to year variations of the tree-ring widths along with the pre-whitening methods (e.g., AR model) can capture the high frequency interannual climate variability, \(\varepsilon (t)\) extracted from the TRW based reconstructions thus can be considered as reliable short-term “forcing” signals that have long-lasting influences. In this case, the overestimated LTP may be corrected by adjusting the LTP signals M(t).

Here we analyze a recently released dataset of precipitation reconstructions over China covering the past half millennium (Shi et al. 2017). In this dataset, the precipitation field for the whole of China was reconstructed using the optimal information extraction (OIE) method (Shi et al. 2012) under the point-to-point regression-based (PPR-based) framework (Cook et al. 1999). Three types of proxies, including (i) 371 TRW chronologies, (ii) 107 dryness/wetness index records (DWI, derived from historical documents), and (iii) one tree-ring oxygen isotope chronology, were used for the reconstruction. The TRW chronologies are mainly located in western China, while the DWI records are mostly in eastern China (see Fig. 1 in Shi et al. 2017). At the grid-point scale, precipitation was reconstructed mainly from multiple types of proxy records, but in the southeastern region covered by the DWI records, the maximum distances from the predictors (proxy records) to the target grid points are normally below 400 km (see Fig. 2 in Shi et al. 2017). Taking 400 km as a threshold, the reconstructions at many grid points in southeastern China are found to rely solely on the DWI records (Fig. S1). Considering that the local chronicles record the past dry/wet conditions reliably and in detail (Zhang 1983, 1988), results at these grid points may be compared with those from other locations where the TRW chronologies have contributed to the reconstructions. Accordingly, this dataset is suitable for investigating whether the TRW related reconstructions (mainly in western China) overestimate LTPs, and how to correct them. In this study, we apply the detrended fluctuation analysis to measure the LTPs. Combining it with FISM, we propose a correction approach to remove the redundant LTP.

Fig. 2

Geographical distributions of the LTP exponent \(\alpha\) calculated from reconstructions (a) and the instrumental (b). c Shows their differences (the \(\alpha\) values from reconstructions minus the \(\alpha\) values from the instrumental records). From the reconstructions, strong LTPs are found to be mainly in the western China, where the reconstructions are mainly obtained from the tree-ring width records (see the green points in c, note the green points with small black dots inside represent the TRW chronologies with length longer than 400 years). Using instrumental data, however, the LTPs are found to be much weaker and the estimated \(\alpha\) values are lower than 0.65 in most regions. The hatching lines indicate the regions with \(\alpha>\) 0.55 (suggesting the existences of LTP) but smaller than 0.59. The regions with \(\alpha>\) 0.59 are marked by dots. By calculating the differences between the results from reconstructions and those from instrumental data, significant differences at 95% confidence level are found in western China (see the hatching areas in c). The bounds of the 95% confidence intervals are estimated as follows: for each grid point in b, we first generated one long (\(L = 5,700,000\)) artificial data with the \(\alpha\) value the same as the observed \(\alpha\) at this grid point using Fourier filtering technique (Turcotte 1997). After dividing the long artificial data into 10,000 short data of length \(l = 530\), we applied DFA2 to these short data to determine the 95% confidence interval from the 10,000 alpha values

The paper is organized as follows. In Sect. 2, we introduce the data and the methods. The LTP properties in the precipitation reconstructions are measured and compared to those in the instrumental data in Sect. 3. In Sect. 4, the correction approach is illustrated. By applying the methodology to the precipitation reconstructions over China, the results are presented and discussed in Sect. 5. In Sect. 6, we provide conclusions and ways forward.

2 Data and methods

2.1 Data

In this study, we analyze gridded warm season (May–September) precipitation reconstructions (anomalies from 1961 to 1990) over China from 1471 to 2000 (Shi et al. 2017). The dataset has a spatial resolution of 0.5° × 0.5°, and covers the region from 72°E to 136°E and 18°N to 54°N. It was downloaded from the National Oceanic and Atmospheric Administration (https://www.ncdc.noaa.gov/paleo-search/study/23056). Besides the paleo-reconstructions, gridded monthly instrumental precipitation data over China are also used (downloaded from the China Meteorological Data Service Center, http://data.cma.cn/data/cdcdetail/dataCode/SURF_CLI_CHN_PRE_MON_GRID_0.5.html). This dataset has the same spatial resolution as the reconstruction data, and the temporal coverage used in this study is from 1961 to 2004 (528 data points for each time series, similar to the reconstructions). Before analysis, seasonal trends are removed by subtracting annual cycles from the observed data (Koscielny-Bunde et al. 1998), as \(x(t) = \tau (t)-<\tau (t)>, t=1,\ldots ,N\), where \(\tau (t)\) is the instrumental data, \(<\tau (t)>\) is the long-time climatological average for each calendar date, and N is the data length.
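The deseasonalizing step \(x(t) = \tau (t)-<\tau (t)>\) can be illustrated with a minimal sketch. It assumes a monthly series that starts in the first month of a year, so that the climatological average is taken per calendar month; the function name `monthly_anomalies` and the parameter `months_per_year` are hypothetical and not taken from the original study.

```python
def monthly_anomalies(tau, months_per_year=12):
    """Subtract the long-term mean of each calendar month from a monthly series.

    Assumption: tau[0] belongs to the first calendar month of the record,
    so that tau[m::months_per_year] collects all values of calendar month m.
    """
    # Climatological mean <tau> for each calendar month.
    clim = []
    for m in range(months_per_year):
        values = tau[m::months_per_year]
        clim.append(sum(values) / len(values))
    # Anomaly x(t) = tau(t) - <tau(t)>.
    return [tau[t] - clim[t % months_per_year] for t in range(len(tau))]
```

By construction, the anomalies of each calendar month average to zero, so a purely periodic annual cycle is removed completely.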

In Shi et al. (2017), 371 TRW chronologies mainly from western China were used to reconstruct the precipitation field. To better show that the redundant LTPs in the reconstructions are mainly associated with the TRW chronologies, we also analyze the LTP properties in the TRW chronologies. Since a reliable detrended fluctuation analysis normally requires the length of the considered time series to be no less than 400 (Ludescher et al. 2020), we measure the LTPs of 140 TRW chronologies with data length longer than 400 years (see Fig. 2 for their geographical locations).

It is worth noting that besides the precipitation reconstructions over China, we also briefly discuss the difference in persistence between a recently proposed long-term TRW based precipitation reconstruction from Greece (the high-elevation Pinus heldreichii (HEPI) reconstruction) (Esper et al. 2021) and its corresponding reference extracted from the CRU TS 4.04 dataset (Harris et al. 2014) (http://badc.nerc.ac.uk/data/cru/). The HEPI reconstruction is purely TRW based, and the CRU data is longer than 100 years (1901–2016). A brief comparison between them can provide more insight into the overestimated persistence in TRW.

2.2 Methods

2.2.1 Detrended fluctuation analysis

To detect and measure the LTP in the reconstructions, the instrumental data, as well as the tree-ring chronologies, a straightforward way is to calculate the auto-correlations. As long as the auto-correlation C(s) of a given time series, e.g., \({x(i)},i = 1,\ldots ,N\), decays with the increase of time scale s as a power law, \(C(s)\sim s^{-\gamma }\), \(s>0\) (\(0<\gamma <1\)), one can confirm the existence of LTP and describe the LTP strength using the parameter \(\gamma\) (Kantelhardt et al. 2001). However, the strong uncertainties in the calculation of C(s) often hinder a reliable determination of the LTP (Kantelhardt et al. 2001). In practice few works investigate the LTP using auto-correlations directly. By studying how the fluctuations change with the increase of time scale s, the detrended fluctuation analysis (DFA) (Peng et al. 1994; Kantelhardt et al. 2001) has become a widely used method for the detection of LTP. It can give a more robust estimation of the LTP even if the time series of interest is nonstationary (e.g., with polynomial trends, varying local variances, etc. See Fig. S2 and Hu et al. 2001, Chen et al. 2002). Accordingly, the DFA is an appropriate method for the analysis of climate time series.

In this study, we employ the DFA of the second order (DFA2) to detect the LTP (Kantelhardt et al. 2001). In DFA2, one considers the cumulated sum \(Y(k)=\sum ^{k}_{i=1}\{x(i)-<x>\}\). By dividing Y(k) into non-overlapping windows of size s, the variance of Y(k) around the best polynomial fit of the second order in each window j can be calculated as \(F^2(s,j)\). By averaging \(F^2(s,j)\) over all the windows and taking the square root, the desired fluctuation function F(s) on time scale s is obtained. For time series with LTP, F(s) will increase with s as a power law, \(F(s)\sim s^{\alpha }\). Theoretically, the exponent \(\alpha\) has been proved to have a simple relationship with the parameter \(\gamma\) in the auto-correlation analysis, \(\alpha =1-\gamma /2\) (Kantelhardt et al. 2001). Accordingly, an \(\alpha\) larger than 0.5 indicates the existence of LTP, and the larger \(\alpha\) is, the stronger the LTP will be. When \(\alpha\) equals 0.5, the time series of interest is considered as white noise with no persistence.
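The DFA2 procedure described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the implementation used in the study: the function names `dfa2` and `scaling_exponent` are hypothetical, only windows counted from the start of the record are used, and the quadratic fit in each window is obtained by projecting onto an orthogonal polynomial basis, which is equivalent to an ordinary least-squares fit of second order.

```python
import math


def dfa2(x, scales):
    """Return the DFA2 fluctuation function F(s) for each window size s (s >= 4)."""
    n = len(x)
    mean_x = sum(x) / n
    # Profile Y(k): cumulated sum of the mean-removed series.
    profile, total = [], 0.0
    for v in x:
        total += v - mean_x
        profile.append(total)

    fluct = []
    for s in scales:
        # Centered time index and the orthogonal basis {1, t, t^2 - <t^2>}.
        t = [i - (s - 1) / 2.0 for i in range(s)]
        t2_mean = sum(ti * ti for ti in t) / s
        p2 = [ti * ti - t2_mean for ti in t]
        norm1 = sum(ti * ti for ti in t)
        norm2 = sum(pi * pi for pi in p2)
        n_win = n // s  # non-overlapping windows
        var_sum = 0.0
        for j in range(n_win):
            seg = profile[j * s:(j + 1) * s]
            # Least-squares coefficients of the second-order polynomial fit.
            c0 = sum(seg) / s
            c1 = sum(y * ti for y, ti in zip(seg, t)) / norm1
            c2 = sum(y * pi for y, pi in zip(seg, p2)) / norm2
            # F^2(s, j): variance of the profile around the fit in window j.
            var_sum += sum((y - c0 - c1 * ti - c2 * pi) ** 2
                           for y, ti, pi in zip(seg, t, p2)) / s
        fluct.append(math.sqrt(var_sum / n_win))
    return fluct


def scaling_exponent(scales, fluct):
    """Least-squares slope of log F(s) against log s, i.e. the exponent alpha."""
    lx = [math.log(s) for s in scales]
    ly = [math.log(f) for f in fluct]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den
```

Applied to uncorrelated Gaussian noise, the estimated exponent should scatter around 0.5, while the running sum of that noise (a random walk) should give a value near 1.5. The scatter grows as the series gets shorter, which is why a minimum data length is required for a reliable estimate.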

It is worth noting that LTP in the climate system can range from months to multiple decades (Lovejoy 2015; Franzke et al. 2020), and may even extend to centennial scales (Fraedrich et al. 2009; Ludescher et al. 2020). In practice, however, one can only measure the exponent \(\alpha\) over a scaling range that depends on the data length. Assuming that precipitation has a universal scaling behavior that spans from monthly to centennial scales, in this study we use the LTP exponent \(\alpha\) obtained from the instrumental records as the reference to adjust the LTP in the reconstructions. Since the reconstructions represent warm season (May–September) precipitation, the best reference would be obtained from the instrumental records of the same period each year. However, using only the warm season records makes the time series too short for a reliable DFA2 analysis (which requires a minimum data length of 400), so we study the complete monthly records in this work. As discussed in Ludescher et al. (2020), the LTP exponents obtained in this way may be slightly higher than those obtained when only the warm season records are analyzed. Accordingly, we consider the LTP exponents obtained from the complete instrumental records as upper-bound references for the adjustment of the LTP in the reconstructions. In addition, it is worth noting that the LTP exponent obtained from DFA is independent of the temporal resolution of the data. As shown in Fig. S3, over the same scaling range the slope of F(s) versus s in a log-log plot does not change with the data resolution. Since we do not have instrumental annual records that are long enough (i.e., longer than 400 years) to support reliable DFA analyses, monthly records are used here as an alternative. In view of the changing uncertainties of the DFA analysis due to different data lengths (Fig. S4), we study the monthly data from 1961–2004, which has 528 data points for each time series, similar to the data lengths of the reconstructions.

2.2.2 Fractional integral statistical model

As discussed above, assuming that the year-to-year changes of the TRW capture well the high-frequency variations of the associated climate variable (here, precipitation) on interannual time scales, the challenge is how to properly reconstruct the long-lasting impacts, as various environmental processes may confound the desired climatic signal. In this study, we employ the Fractional Integral Statistical Model (FISM) to address this challenge. In FISM, the long-lasting impacts of short-term forcing signals are simulated in terms of a fractional integral, as shown below (Yuan et al. 2013, 2014),

$$\begin{aligned} x(t) = \frac{1}{\Gamma (q)} \int _{u=0}^{t-\delta } \frac{\varepsilon (u)}{(t-u)^{1-q}} \mathrm{d}u + \varepsilon (t), \end{aligned}$$
(2)

where \(\varepsilon (u)\) represents the historical dynamical and thermodynamical short-term forcings, \(t-u\) denotes the distance between historical time point u and the present time t, \(\delta\) is the sampling time interval, \(\Gamma\) is the gamma function, and q is integral order. The first term to the right of the equal sign simulates the accumulated historical impacts, which corresponds to the LTP signals (M(t) in Eq. 1) (Yuan et al. 2013, 2014),

$$\begin{aligned} M(t) = \frac{1}{\Gamma (q)} \int _{u=0}^{t-\delta } \frac{\varepsilon (u)}{(t-u)^{1-q}} \mathrm{d}u. \end{aligned}$$
(3)

Accordingly, if the historical short-term forcings \(\varepsilon (u)\) and the integral order q are known, one can estimate the LTP signals M(t) using Eq. (3). It has been proved that the integral order q is linearly related to the DFA exponent \(\alpha\) as \(q=\alpha -0.5\) (Yuan et al. 2013, 2014), thus q can be easily calculated from the DFA analysis. Regarding the short-term forcings, by reversely deriving Eq. (2) one can extract \(\varepsilon (u)\) iteratively as long as the historical time series x(t) and the integral order q are known (Yuan et al. 2014). In this way, the LTP signals M(t) can be estimated and x(t) can be decomposed into the two parts as shown in Eq. (1). It is worth noting that there is another well-known model, the autoregressive fractionally integrated moving average (ARFIMA) model, that can simulate the LTP in a given time series. FISM and ARFIMA are closely related as they are both designed from fractional integral techniques. Compared to ARFIMA, however, the FISM model is more suitable for decomposing the considered time series x(t) into the “memory” and the short-term “forcing” part, which can further contribute to a better understanding of the physical processes of how LTP arises. Therefore, we employ the FISM model in this work. For more detailed comparisons between these two models, please refer to (Yuan et al. 2014).
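To make the decomposition concrete, the following sketch discretizes Eqs. (2) and (3) with a unit sampling interval (\(\delta = 1\)) and a simple rectangle rule, so that \(M(t) = \frac{1}{\Gamma (q)}\sum _{u=0}^{t-1}\varepsilon (u)(t-u)^{q-1}\). This discretization is an assumption made for illustration; the numerical scheme of Yuan et al. (2014) may differ. The function names are hypothetical.

```python
import math


def memory_weights(q, n):
    """Kernel weights w(k) = k^(q-1) / Gamma(q) for lags k = 1..n.

    For q = 0 (alpha = 0.5, no LTP) the factor 1/Gamma(q) vanishes,
    so all weights are set to zero.
    """
    if q == 0:
        return [0.0] * n
    return [k ** (q - 1.0) / math.gamma(q) for k in range(1, n + 1)]


def memory_term(eps, w, t):
    """M(t): weighted sum of the forcings before time t (discretized Eq. 3)."""
    return sum(eps[u] * w[t - u - 1] for u in range(t))


def fism_integrate(eps, q):
    """Forward model (discretized Eq. 2): x(t) = M(t) + eps(t)."""
    w = memory_weights(q, len(eps))
    return [memory_term(eps, w, t) + eps[t] for t in range(len(eps))]


def fism_extract(x, q):
    """Invert Eq. 2 iteratively: M(t) needs only past forcings, so eps(t) = x(t) - M(t)."""
    w = memory_weights(q, len(x))
    eps, memory = [], []
    for t in range(len(x)):
        m = memory_term(eps, w, t)
        memory.append(m)
        eps.append(x[t] - m)
    return memory, eps
```

Because M(t) depends only on forcings before time t, the extraction can proceed step by step from the start of the record, and `fism_extract` recovers exactly the forcings that `fism_integrate` was driven with. In this discretization, q = 1 makes all weights equal to one, so x(t) reduces to the ordinary running sum of the forcings.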

3 LTP in the reconstructions, instrumental data, and the TRW chronologies

We first applied DFA2 to the precipitation reconstructions of Shi et al. (2017). As shown in Fig. 2a, the DFA exponent \(\alpha\) values are all larger than 0.50, indicating the existence of LTP. Larger \(\alpha\) values (> 0.75) are found in western China, while in the eastern part, the \(\alpha\) values are much smaller ranging from 0.50 to 0.65. This difference between the western and eastern China, however, is not found in the DFA results of the instrumental precipitation records (Fig. 2b). The weak LTP in the instrumental records is in line with many previous studies, that show low \(\alpha\) values from in-situ precipitation observations on time scales larger than months (e.g., Bunde et al. 2013; Jiang et al. 2017). Using Monte-Carlo tests, only some regions are found to have statistically significant LTP (e.g., see the dotted areas in Fig. 2b). Figure 2c shows the differences of the \(\alpha\) values between the reconstructions and the instrumental data. Statistically significant overestimations of the LTPs are mainly found in the whole western part, as well as some regions of the eastern part (see the hatching areas in Fig. 2c). In these regions, the \(\alpha\) values in the reconstructions are 0.20–0.30 higher than those from the instrumental data. In view of the close relations between the LTP and the low-frequency variability, these redundant LTPs may thus lead to unrealistic prolonged and intensified dry/wet events, as suggested in Fig. 1, where the effects of redundant LTPs (i.e., increasing \(\alpha\) by 0.25) are simulated.

One main reason for the overestimated LTPs is related to the TRW proxy. In Fig. 2c, the locations of the 371 TRW chronologies (the green points) used in the reconstruction of Shi et al. (2017) are shown. The regions with significantly overestimated \(\alpha\) values correspond very well with the regions where the reconstructions rely heavily on the TRW chronologies. In eastern regions, where the DWI records mainly give skill to the reconstructions, however, the deviations of the \(\alpha\) values are not consistent and, in many regions, not significant. In particular, in the southeast of China, where the reconstructions are calculated purely from the DWI records (see the red box in Fig. S1), the LTPs are not significantly different from those from the instrumental data. Since the TRW chronologies may have redundant serial persistence due to various environmental processes (e.g., Bunde et al. 2013; Büntgen et al. 2015; Lücke et al. 2019), they largely contribute to the overestimated LTP in the precipitation reconstructions. To verify this, we applied DFA2 to the tree-ring chronologies (Fig. 3). The \(\alpha\) values are found to be remarkably larger than 0.50, and centered around 0.80–0.90 (see the red bars). In extreme cases, the \(\alpha\) values can even be as high as 1.00. These values are much higher than those from the instrumental precipitation records, where the \(\alpha\) values are centered around 0.50 (see the blue bars in Fig. 3).

Fig. 3

The histogram of LTP exponent \(\alpha\) calculated from TRW chronologies (red bars) and instrumental precipitation records (blue bars). 140 TRW chronologies with data length longer than 400 years are analyzed. Their geographical locations are shown in Fig. 2 (the green points with small black dots inside). The \(\alpha\) values obtained from the instrumental precipitation records are weak (around 0.5), while the \(\alpha\) values calculated from the tree ring chronologies are remarkably stronger

It is worth noting that here we compared the LTPs from 44-year-long instrumental monthly data with those from annual reconstructions/TRW chronologies. An ideal comparison would be between annual instrumental data and annual reconstructions of the same length. Unfortunately, even the longest annual instrumental records are too short for the LTPs to be reliably estimated, as a reliable DFA calculation requires a data length of at least 400 (Ludescher et al. 2020). To show the different persistence between TRW based reconstructions and observations of the same length and temporal resolution (i.e., annual), we alternatively calculated the mean precipitation values after n consecutive dry/wet years (\(n=1,2,\ldots\)). As shown in Fig. S5, using a recently published TRW based precipitation reconstruction (from 1901 to 2016) in the Pindus Mountains of northwestern Greece (Esper et al. 2021) and the corresponding long-term annual records extracted from the CRU TS 4.04 dataset (covering grid points between 39°–40°N, 21°–24°E, from 1901 to 2016), one can see clearly that the TRW based precipitation reconstruction tends to be higher (lower) after n consecutive wet (dry) years. This indicates a stronger persistence in the TRW based reconstructions, which is in line with the results of the DFA if the monthly CRU data (from 1901 to 2019) and the full reconstructions (from 730 to 2016) are analyzed (Fig. S5b). Hence, TRW chronologies may indeed introduce redundant persistence into the reconstructions. To reconstruct precipitation using TRW chronologies, a proper removal of the strong LTP is thus particularly important. The currently widely used pre-whitening methods, such as the AR model, which only takes into consideration persistence over a limited short time span, are not sufficient. The new approach for the correction of the redundant LTPs is introduced in Sect. 4.
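The conditional-mean diagnostic used for short annual records can be sketched as follows. The text does not specify how wet and dry years are defined, so this sketch assumes that a year is wet (dry) when it lies above (below) the mean of the series, and that "after n consecutive years" means at least n such years immediately before the year considered. Both choices and the function name `mean_after_runs` are assumptions for illustration only.

```python
def mean_after_runs(x, n, wet=True):
    """Mean of x[t] over all years t preceded by at least n consecutive wet (or dry) years.

    Assumption: wet/dry is defined relative to the mean of the whole series.
    Returns None if no such year exists.
    """
    ref = sum(x) / len(x)
    is_event = [(v > ref) if wet else (v < ref) for v in x]
    # Keep year t only if the n years immediately before it are all wet (or all dry).
    following = [x[t] for t in range(n, len(x)) if all(is_event[t - n:t])]
    return sum(following) / len(following) if following else None
```

For a series without persistence, the conditional mean should stay close to the overall mean for every n. A conditional mean that rises with n after wet runs, or falls with n after dry runs, points to persistence.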

4 A new correction approach

To correct the overestimated persistence, we employ the Fractional Integral Statistical Model (FISM) in this study. With FISM, one is able to quantify the long-term accumulated historical impacts (Eq. 3) and decompose the time series of interest into the LTP signals M(t) and the short-term forcing signals \(\varepsilon (t)\) (Eq. 1). Since the year to year TRW changes contain signals of interannual climate variability, which may be further recorded in the short-term forcing signals \(\varepsilon (t)\), to reconstruct the persistence/variability on longer time scales, one main issue is how to capture the long-lasting influences of these short-term forcings, or in other words, how to properly calculate the LTP signals.

Fig. 4

Sketch of the steps for correcting the precipitation reconstructions. The first step is to estimate the LTP strengths in the reconstructions (the gray and black curves) using DFA. With the estimated DFA exponent \(\alpha _{rec}\) (e.g., the \(\alpha\) value shown in black color), one can further remove the memory part using FISM in the second step, and extract the residual short-term forcing part (the red and blue bars). In the last step, new reconstructions (the green curves) can be obtained by integrating the short-term forcing part to a proper order. The integration order can be determined from the \(\alpha _{ins}\) values of the instrumental precipitation records (e.g., the \(\alpha\) value shown in green color)

Figure 4 shows the steps of how to correct the overestimated persistence. For a given reconstruction sequence of interest, the LTP strengths in both the reconstructions and the corresponding instrumental data are first calculated by measuring the DFA exponent \(\alpha\) (\(\alpha _{rec}\) and \(\alpha _{ins}\)). For the cases when the two \(\alpha\) values are different, we go further to the second step to extract the short-term forcing signals \(\varepsilon (t)\) from the reconstructions using FISM (see the red and blue bars in Fig. 4). As discussed above, since \(\varepsilon (t)\) is considered as a kind of representation of the interannual (fast) climate changes, to correct the long-term persistence one only needs to feed \(\varepsilon (t)\) into the FISM model (Eq. 2) again but do the fractional integration with a proper order (see the third step in Fig. 4). For instance, if we take \(\alpha _{ins}\) from the instrumental records as a reference, a corrected reconstruction can be obtained (the green curves) by setting the fractional integral order q as \(q=\alpha _{ins}-0.5\). In this way, the reconstructions can both hold the interannual (short-term “fast”) variabilities and at the same time, show the same LTP as the instrumental data.
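The second and third steps can be combined in a short sketch. It assumes that \(\alpha _{rec}\) and \(\alpha _{ins}\) have already been estimated with DFA (the first step), and it uses a rectangle-rule discretization of Eq. (2) with unit sampling interval, which is an illustrative choice rather than the scheme of the original study. The function names are hypothetical.

```python
import math


def _weights(q, n):
    """Fractional-integral kernel k^(q-1) / Gamma(q) for lags 1..n (zero for q = 0)."""
    if q == 0:
        return [0.0] * n
    return [k ** (q - 1.0) / math.gamma(q) for k in range(1, n + 1)]


def correct_persistence(x, alpha_rec, alpha_ins):
    """Replace the LTP of a reconstruction x by that of the instrumental reference.

    Step 2: extract the short-term forcing eps using q = alpha_rec - 0.5.
    Step 3: re-integrate eps using q = alpha_ins - 0.5.
    Returns (eps, corrected).
    """
    n = len(x)
    w_rec = _weights(alpha_rec - 0.5, n)
    w_ins = _weights(alpha_ins - 0.5, n)
    eps, corrected = [], []
    for t in range(n):
        # Memory implied by the reconstruction's own (overestimated) LTP.
        m_rec = sum(eps[u] * w_rec[t - u - 1] for u in range(t))
        eps.append(x[t] - m_rec)
        # Memory rebuilt with the reference LTP from the instrumental data.
        m_ins = sum(eps[u] * w_ins[t - u - 1] for u in range(t))
        corrected.append(m_ins + eps[t])
    return eps, corrected
```

Two limiting cases help to check the logic: if \(\alpha _{ins} = \alpha _{rec}\), the corrected series equals the original one, and if \(\alpha _{ins} = 0.5\), the corrected series is the short-term forcing itself.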

5 Correction results and discussion

Using this new approach, we corrected the persistence in the precipitation reconstructions. In Sect. 3, we find that LTPs in the historical reconstructions are stronger than those from the instrumental precipitation records over many regions of the country, especially in western China (see Fig. 2c). Taking the LTPs in the instrumental data as the references, the corrected reconstructions are found to have nearly the same LTPs as the observations. As shown in Fig. 5, our target is to remove the redundant LTP originated from the TRW chronologies, but the correction approach is applied to all the reconstructions (including the areas where the reconstructions are purely determined by DWI records). After the correction, the spatial distribution of the DFA exponent \(\alpha\) from the corrected precipitation reconstructions is highly consistent with that from the instrumental precipitation records (Fig. 5b, please see Fig. S6 for the differences of the DFA exponents between the original and the corrected reconstructions). This result is reasonable as we corrected the reconstructions using the same fractional integral orders q as the instrumental records (see the third step in the new approach). However, besides the corrected \(\alpha\) values, more important improvements are in the temporal variations of the reconstructions. For example, considering the spatial average over western China, after applying the approach, the differences before and after the correction are shown in Fig. 6. Apparently, the original reconstruction averaged in western China has strong LTP, which induces strong low-frequency variability on decadal to centennial scale. As the gray bars and the blue shaded areas shown in Fig. 6, several historical long-term drought events (e.g., 1500s–1570s, 1590s–1750s, etc.) are found in the original reconstructions, and more than that the dry conditions last for a long time from around the sixteenth to the nineteenth century. 
These findings, however, may not be realistic because of the overestimated LTP in western China. After addressing the LTP, the corrected precipitation reconstruction fluctuates on a much shorter time scale, and the duration of drought events is reduced significantly to a few years or decades (similar results are also found in the TRW chronologies, not shown). Moreover, the magnitude of the drought events also weakens remarkably, by about 90% (shown by the green bars and the orange curve in Fig. 6). Assuming the LTP in precipitation has remained unchanged during the past few hundred years, the corrected precipitation reconstruction indicates no centennial-scale severe drought events in western China during the past several centuries (Fig. 6). Compared to the original reconstructions of Shi et al. (2017), where a persistent long-term dry condition was found from the sixteenth to the nineteenth century, the corrected reconstructions are more realistic.

Fig. 5

Geographical distribution of the LTP exponents calculated from the corrected precipitation reconstructions (a). Compared with Fig. 2b, one can see that the \(\alpha\) values from the corrected precipitation reconstructions have a spatial pattern similar to that obtained from the instrumental precipitation records. Their differences (the \(\alpha\) values from the corrected reconstructions minus the \(\alpha\) values from the instrumental records), which may mainly arise from the uncertainties of the calculation, are not statistically significant at nearly all the grid points (b). There are only two very small regions/points with significant differences (see the short black lines in b).

It is worth noting that the correction of the overestimated LTP relies heavily on the selection of the reference, for which a reliable LTP estimation is required. In this work, we used the instrumental monthly records as the reference. The main reason for not using instrumental annual records is that the current instrumental annual records are not long enough to support a statistically solid calculation of the LTP. As reported by Ludescher et al. (2020), a reliable DFA analysis requires a data length of at least 400, which means that 400 years of instrumental annual records would be needed for a reliable DFA calculation. Moreover, the insensitivity of the DFA results to the temporal resolution of the data also makes the monthly records good substitutes in the analysis (Fig. S3). To correct the overestimated LTP in the TRW related reconstructions, we also assumed that precipitation has a universal scaling behavior that ranges from monthly to centennial scales. This assumption follows the principle of “uniformitarianism”, which is also commonly assumed in paleoclimatology studies. In fact, the similar DFA exponents between the instrumental records and the purely DWI based reconstructions (see the southeast of China in Fig. 2c and Fig. S1) indicate that this assumption appears reasonable. Since the local chronicles may reliably record the past dry/wet conditions, the similar LTPs between the DWI based reconstructions and the instrumental records suggest that the scaling behavior in precipitation may indeed span several scales from months to centuries. In other words, the LTPs measured from monthly to decadal scales (e.g., results from the instrumental data) may be the same as the LTPs measured from decadal to centennial scales (e.g., results from the precipitation reconstructions). Of course, it should be noted that we cannot completely exclude the possibility that the scaling behavior in precipitation might change over time (Markonis and Koutsoyiannis 2016).
Accordingly, the use of the instrumental records as references is only one possible way to correct the overestimated persistence. Another possibility is the use of paleo-model simulations or other independent proxies (i.e., proxies not used for the reconstruction under investigation) that cover a time scale comparable to that of the reconstructions. If reliable LTPs are confirmed in these potential references, the approach can be applied to perform a reasonable correction.
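To make the reference estimation concrete, the following is a minimal sketch of how a DFA exponent \(\alpha\) could be computed from a record. It is an illustration only, not the implementation used in this work: the function name `dfa_exponent`, the default window sizes, and the choice of second-order detrending are all assumptions made for the example.

```python
import numpy as np

def dfa_exponent(x, scales=None, order=2):
    """Estimate the DFA exponent alpha of a 1-D series (illustrative sketch).

    `order` is the degree of the polynomial removed in each window;
    order=2 is an assumption here, not a setting taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    profile = np.cumsum(x - x.mean())          # integrated anomaly series
    if scales is None:
        # hypothetical default: ~20 log-spaced window sizes from 10 to n/4
        scales = np.unique(
            np.logspace(np.log10(10), np.log10(n // 4), 20).astype(int))
    fluct = []
    t_cache = {}
    for s in scales:
        n_win = n // s
        # non-overlapping windows taken from the start and from the end
        segs = np.concatenate([
            profile[:n_win * s].reshape(n_win, s),
            profile[n - n_win * s:].reshape(n_win, s),
        ]).T                                    # shape (s, 2 * n_win)
        t = t_cache.setdefault(s, np.arange(s))
        coeffs = np.polyfit(t, segs, order)     # one polynomial fit per window
        trend = np.vander(t, order + 1) @ coeffs
        fluct.append(np.sqrt(np.mean((segs - trend) ** 2)))
    # alpha is the slope of log F(s) against log s
    slope, _ = np.polyfit(np.log(scales), np.log(fluct), 1)
    return slope
```

Under these assumptions, uncorrelated noise should give \(\alpha\) close to 0.5, and larger values indicate stronger persistence. For short records only a few window sizes are available, so the fitted slope becomes noisy, which is consistent with the length requirement discussed above.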

One feature of this new correction approach is that it is not just a series of mathematical calculations, but also has a physical basis. Using this approach, the high frequency climate variability recorded in the year-to-year changes of tree-ring widths is first extracted in terms of \(\varepsilon (t)\). By applying a proper fractional integration to these high frequency changes, their long-lasting impacts are then simulated in terms of LTP signals. The resultant time series thus consists of both the short-term “fast” changes and the long-term “slow” persistence. Compared to the concise method based on the Fourier filtering technique (Ludescher et al. 2020), the correction approach proposed in this work shows more clearly how the LTP signals are accumulated from the past climates. Accordingly, if more details about the underlying processes are required, it is more appropriate to use this new approach.
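The two stages described above (extracting \(\varepsilon (t)\), then re-integrating it with a different order) can be illustrated with a short sketch. Note that it does not reproduce the FISM kernel of this work: as a stand-in, it uses the discrete fractional-integration filter of an ARFIMA(0, q, 0) process, and the names `frac_weights`, `frac_filter`, `correct_persistence`, `q_original`, and `q_reference` are hypothetical.

```python
import numpy as np

def frac_weights(d, n):
    """First n power-series coefficients of (1 - B)**(-d), B = backshift."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 + d) / k
    return w

def frac_filter(x, d):
    """Causal fractional integration of order d (differencing if d < 0)."""
    x = np.asarray(x, dtype=float)
    # keep only the first len(x) terms, i.e. each value depends on its past
    return np.convolve(x, frac_weights(d, x.size))[:x.size]

def correct_persistence(x, q_original, q_reference):
    """Replace the memory of order q_original by one of order q_reference.

    Assumption of this sketch: x behaves like a fractionally integrated
    series of order q_original, so differencing recovers the forcing.
    """
    eps = frac_filter(x, -q_original)       # short-term forcing, eps(t)
    return frac_filter(eps, q_reference)    # forcing plus corrected memory
```

In this simplified setting the year-to-year forcing is left untouched and only the weights given to past forcing change, so a smaller `q_reference` lets past anomalies decay faster. How the orders are obtained from the DFA exponents is left outside the sketch.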

6 Conclusion

Due to various environmental processes (e.g., the integrative behavior of soil moisture), the TRW based reconstructions have been well recognized to have overestimated long-term persistence (LTP), which may further induce unrealistic low frequency variability. In this study, a new approach to correct the overestimated persistence is proposed. Assuming that the year-to-year changes of the tree-ring widths capture the high frequency interannual climate variability well, we focus on correcting the accumulated long-lasting historical impacts of these high frequency changes. In this approach, one first diagnoses the LTP in the considered reconstruction to see whether it is overestimated. If so, the reconstruction is decomposed into the short-term forcing signal \(\varepsilon (t)\) and the LTP signal M(t) (Eq. 1), and the LTP signal is then corrected by running the fractional integral statistical model (FISM) with an improved fractional integral order.

In this study, we applied this new approach to a recently published dataset of precipitation field reconstructions over China covering the past half millennium. This dataset is reconstructed using multiple proxies, i.e., the TRW chronologies that are mainly located in western China, and the dryness/wetness index (DWI) derived from historical documents that are mainly from eastern China. While the LTPs of many reconstructions in eastern China do not deviate significantly from those of the instrumental data, we found that the LTPs of the precipitation reconstructions are remarkably overestimated in western China. As a result, the long-term drought events and dry conditions in western China last for several centuries (Fig. 6). After correcting the persistence, however, the historical wet-dry changes become less severe and the drought events last only for a few years or decades. By comparison, the new reconstructions with the persistence corrected seem to provide a more realistic estimation of the past climates.

Fig. 6

Similar to Fig. 1, but showing the results of the regionally averaged reconstruction over western China (see the black box in Fig. 2c). The gray bars represent the original reconstruction (Western Rec.), while the green bars represent the corrected reconstruction (Western Cor.). From the 31-year moving averages, remarkable historical long-term drought events are identified in the original reconstruction (the blue curve). After correcting the persistence properties, however, the drought events become much shorter and weaker (the orange curve).

To obtain a reliable correction, one important precondition is to find a good reference in which reliable LTPs can be estimated. In this study, from the similar LTP exponents between the instrumental records and the purely DWI based precipitation reconstructions (Fig. S1), we argue that the LTPs measured from the instrumental records are reasonable references for correction. If this finding also holds for precipitation over other regions, we could easily apply this correction approach to other precipitation reconstructions (e.g., the HEPI reconstruction in Greece) and use the LTPs of observational precipitation records as references. Even if the LTPs from observational precipitation records cannot be used as references, this correction approach can still be applied as long as other potential references (e.g., paleo-model simulations or other independent proxies) prove to be useful.

Here we focused on the overestimated persistence in the precipitation reconstructions. In fact, the overestimation of the LTP does not exist only in precipitation reconstructions. According to previous studies (Zhang et al. 2015), the redundant LTPs in TRW chronologies should be the main contributor to the overestimated LTP. Therefore, this new approach can also be applied to TRW based reconstructions of other variables (e.g., temperature reconstructions). In view of the non-climatic/mixed climatic signals in TRW records, we strongly suggest revisiting the previous TRW based reconstructions, i.e., evaluating the performance of different detrending/reconstruction methods with the LTP as a test bed and, if necessary, applying the correction approach. In addition to correcting reconstructions as a post-processing method, an even better application of this approach might be to combine it with the current detrending methods, e.g., to upgrade the current pre-whitening methods (e.g., the AR/ARMA model) by introducing the correction of LTPs.

Finally, we would like to emphasize that this new approach assumes that high frequency interannual climate variability is well captured by year-to-year tree-ring changes, and that it focuses mainly on the correction of the long-term effects of these high frequency changes. Following this assumption, the short-term forcing signals \(\varepsilon (t)\) (see Eq. 1) extracted from the reconstructions are considered to be reliable. Strictly speaking, however, the annual changes in TRW may also include non-climatic signals such as variations due to the effect of competition. For example, for a tree in a closed canopy forest, the death of a neighboring tree releases it from competition and may induce a sudden increase in its growth rate. Current detrending methods, such as the smoothing spline, usually remove the low-frequency fluctuations caused by competition effects (Blasing et al. 1983). To better filter out the non-climatic signals on shorter time scales, further efforts are needed in the future.