Climate change impact on extreme value and their frequency distribution function in a karst basin, Southwest China


Studying extreme meteorological events and their frequency under climate change helps to guide flood and drought control. The objective of this study is to contribute to the literature on how to analyze the impact of climate change on extreme rainfall and extreme temperature more reasonably and comprehensively for a karst basin. The Mann–Kendall method, the Heuristic segmentation method, the cross-wavelet analysis method, the generalized extreme value (GEV) model, and the generalized Pareto distribution (GPD) model were applied to 55 years (1963–2017) of extreme rainfall and temperature data recorded in the Chengbi River Basin. The results show that extreme rainfall followed a downward trend (−0.169 and −8.735 mm/10a), while the trends of extreme temperatures were not obvious (Sen's slope estimate is 0). The mutation points range from 1981 to 2002, and the mutation points of the extreme rainfall series are earlier than those of the extreme temperature series. Compared with the GEV model, the parameters of the GPD model vary less before and after climate change, and the extreme values corresponding to the same recurrence period decrease after climate change. The GEV and GPD models generally fit the post-climate change series better than the pre-climate change series.

function from the perspective of mathematical statistics to study the change characteristics of extreme events, making up for the shortcomings of physical and dynamic methods to a certain extent (Bella et al. 2020). The frequency analysis method is based on asymptotic extreme value theory: a probability distribution is fitted to the empirical distribution of extreme temperature and precipitation, and the temperature and precipitation thresholds for a specified recurrence period are then derived statistically (Mo et al. 2019). Frequency analysis can also reflect the uncertainty of hydrologic events objectively (Zhou et al. 2011). Li et al. (2019) found that the generalized extreme value (GEV), Pearson-III, and three- and two-parameter log-normal distributions generally gave the best cumulative distribution functions for nine extreme precipitation indices in China. Furthermore, eight theoretical probability distribution models were selected to fit the extreme high-temperature index in the Heihe River Basin (Wang et al. 2017).
In addition, the formation and development of ENSO (El Niño-Southern Oscillation) events cause anomalies in the atmospheric circulation of the northern hemisphere and in the climate of China, generating global-scale climate oscillations that have a certain impact on regional rainfall and temperature (Gao et al. 2020). Solar activity also affects the regional distribution of rainfall and temperature by altering regional radiation and thus the hydrothermal equilibrium. Studying the correlation between climate indices and extreme meteorology therefore helps to explain the causes of extreme rainfall and extreme temperatures in basins under climate change. In southwest China (e.g., Guangxi), karst topography and landforms are widely distributed, with a continuous distribution area of 540,000 km². In recent years, extreme rainfall and extreme temperature events have occurred frequently in the Guangxi karst basin. Compared with non-karst areas, the dual structure of the surface and underground water systems in karst areas often leads to strong leakage of rainwater, shallow and discontinuous soil layers, low soil volume, and low water storage capacity (Gao et al. 2015), which makes these areas more prone to flooding, soil erosion, and landslides under extreme meteorological conditions.
In summary, although many studies have obtained reasonable results for extreme precipitation and extreme temperatures, some problems remain to be solved. The major difficulties and challenges are as follows: (1) It is well known that climate indices are closely related to extreme weather, but how does their relationship change before and after climate change, and how can the contribution of climate change to these changes be quantified? (2) How does climate change affect the frequency analysis of extreme rainfall and extreme temperature? Therefore, the objective of this study is to help fill the gap in knowledge of how to analyze the impact of climate change on extreme rainfall and extreme temperature more reasonably and comprehensively for a karst basin.
method was applied to calculate the areal value of the used data. In addition, the Multivariate ENSO Index (MEI) and the Oceanic Niño Index (ONI) are provided by the Climate Prediction Center of the US National Oceanic and Atmospheric Administration (https://www.noaa.gov/), and sunspot data are provided by the Royal Observatory of Belgium (http://sidc.oma.be/silso/datafiles).
The Chengbi River Basin was selected for this study for three reasons. First, the basin is significant for its hydropower potential and for its agricultural, urban, and industrial water supply. Second, the basin has typical karst characteristics. Finally, the basin has sufficient historical hydrometeorological observations to meet the needs of this study.

METHODOLOGY
The WMO (World Meteorological Organization) proposed a set of extreme climate indices, which became the unified standard for climate change research. For the purpose of this study and to reduce the computational burden, four indices were used: R25 (daily precipitation of >25 mm), RX1D (maximum 1-day precipitation amount), SU (daily maximum temperature of >25°C), and TXX (annual maximum (AM) temperature). The modified over-whitening Mann-Kendall (MM-K) trend test (Xu et al. 2006), Sen's slope estimation (Liu et al. 2016), and linear regression were used for trend analysis to explore whether the extreme weather indicators show trends as a result of climate change. Then, the Heuristic segmentation method was used to diagnose the mutation point of each extreme meteorology series; the mutation point marks the time at which climate change starts to affect extreme meteorology. The time series before the mutation point is considered unaffected by climate change and is named the pre-climate change series, whereas the time series after the mutation point is considered affected by climate change and is named the post-climate change series. The cross-wavelet method was then used to explore the differences in the relationship between the climate indices (the MEI, the ONI, and sunspots) and extreme meteorology before and after climate change. Finally, the GEV (Fisher & Tippett 1928), the generalized Pareto distribution (GPD), and the Kolmogorov-Smirnov (K-S) test (Greene & Wellner 2016) were selected, since they are well established and widely accepted in hydrological science, to study the influence of climate change on the frequency distribution of extreme weather elements from three perspectives: frequency distribution model parameters, return period, and model fit.
Due to space constraints, only the Heuristic segmentation method is briefly introduced in this paper. Finally, the framework of this study is shown in Figure 2.

Heuristic segmentation method
The Heuristic segmentation method was proposed by Bernaola-Galvan et al. (2001). It overcomes the disadvantages of earlier detection methods for non-stationary time series and reduces the computational burden of segmentation (Wang et al. 2009). Suppose a sliding pointer is moved from left to right along the time series. At each position of the pointer, the average values of the subsets to the left and right of the pointer are computed, denoted $\mu_1$ and $\mu_2$, respectively. The statistical significance of the difference between $\mu_1$ and $\mu_2$ is estimated by Student's t-test statistic as follows (Huang et al. 2014):

$$t = \left| \frac{\mu_1 - \mu_2}{S_D} \right|, \qquad S_D = \left[ \frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2} \right]^{1/2} \left( \frac{1}{N_1} + \frac{1}{N_2} \right)^{1/2}$$

where $N_1$ and $N_2$ denote the numbers of points in the two series to the left and right of point $i$, respectively, $s_1$ and $s_2$ denote the standard deviations of the two series, and $S_D$ is the pooled standard error. As the pointer moves along the time series, the statistic $t$ is calculated at each position to estimate the difference between the averages of the left and right sub-series; the larger the t-value, the more pronounced the difference between the two sub-series. The point at which $t$ reaches its maximum is the candidate segmentation point, and the statistical significance $P(t_{\max})$ corresponding to $t_{\max}$ is then calculated as follows:

$$P(t_{\max}) \approx \left[ 1 - I_{\nu/(\nu + t_{\max}^2)}(\delta\nu, \delta) \right]^{\eta}$$

where $\eta = 4.19 \ln N - 11.54$ and $\delta = 0.40$ were obtained by Monte Carlo simulation, $N$ denotes the length of the series under study, $\nu = N - 2$, and $I_x(a, b)$ is the incomplete beta function. A threshold value $P_0$ is pre-set. When $P(t_{\max}) \geq P_0$, the series is split at that point into two sub-series with clearly different mean values; otherwise, no segmentation is performed. The operation is then repeated on each new sub-series until the length of a sub-series is less than $l_0$ (the minimum segmentation scale), at which point the segmentation stops.
As a result, the original series is split into several sub-series with different mean values, and each split point is a mutation point. Typically, $P_0$ ranges from 0.5 to 0.95, and $l_0$ should not be less than 25 (Gong 2006). The criteria selected for this study are $P_0 = 0.95$ and $l_0 = 25$.
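The segmentation procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: NumPy and SciPy are assumed to be available, the function names `best_split` and `segment` are hypothetical, the t statistic is assumed to take the standard pooled-variance form, and the demonstration series is synthetic. Any additional consistency checks between neighbouring segments are omitted.

```python
import numpy as np
from scipy.special import betainc  # regularized incomplete beta I_x(a, b)


def best_split(x):
    """Return (index, t_max, P(t_max)) of the most significant mean shift.

    `index` is the position of the first point of the right-hand segment.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    i_best, t_best = None, 0.0
    for i in range(2, n - 1):  # keep at least two points on each side
        left, right = x[:i], x[i:]
        n1, n2 = len(left), len(right)
        s1, s2 = left.std(ddof=1), right.std(ddof=1)
        pooled = np.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
        s_d = pooled * np.sqrt(1.0 / n1 + 1.0 / n2)
        if s_d == 0.0:
            continue
        t = abs(left.mean() - right.mean()) / s_d
        if t > t_best:
            i_best, t_best = i, t
    if i_best is None:
        return None, 0.0, 0.0
    eta = 4.19 * np.log(n) - 11.54   # Monte Carlo constants quoted in the text
    delta = 0.40
    nu = n - 2
    p = (1.0 - betainc(delta * nu, delta, nu / (nu + t_best**2))) ** eta
    return i_best, t_best, p


def segment(x, p0=0.95, l0=25, offset=0):
    """Recursively split x; return change-point indices into the original series."""
    x = np.asarray(x, dtype=float)
    if len(x) < l0:                  # shorter than the minimum segmentation scale
        return []
    i, _, p = best_split(x)
    if i is None or p < p0:          # not significant: no segmentation
        return []
    return (segment(x[:i], p0, l0, offset)
            + [offset + i]
            + segment(x[i:], p0, l0, offset + i))


# Synthetic demonstration (not the basin data): mean shift at index 30.
k = np.arange(60)
series = 0.5 * np.sin(1.7 * k) + np.where(k >= 30, 3.0, 0.0)
change_points = segment(series)
years = 1963 + k
# The mutation year is taken as the last year of the left-hand segment.
print([int(years[i - 1]) for i in change_points])
```

With $P_0 = 0.95$ and $l_0 = 25$ as defaults, the sketch mirrors the criteria selected for this study; the convention of reporting the last year of the left-hand segment follows the way the pre- and post-climate change series are delimited in the text.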

Frequency analysis method
In this study, the AM sampling method (Dodangeh et al. 2019) was used to construct the RX1D and TXX series, and the peak over threshold (POT) sampling method (Marty & Blanchet 2012) was used to construct the R25 and SU series. The GEV model describes the probability distribution of maximum or minimum values; it avoids the deficiency of relying on a single distribution type, which makes it more reliable than the traditional method (Chen 2008). The GPD is specifically used to describe the probability distribution of POT series, and its simulation results are closer to reality (Jiang et al. 2009). Consequently, the GEV model was applied for frequency analysis of the series obtained by AM sampling, and the GPD model was applied to the series obtained by POT sampling.
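The two sampling schemes can be illustrated with a short Python sketch. It is only an illustration: the function names and the dictionary layout (year mapped to a list of daily values) are hypothetical, the rainfall values are toy numbers rather than observations, and no declustering of consecutive exceedances is attempted.

```python
from typing import Dict, List


def annual_maxima(daily_by_year: Dict[int, List[float]]) -> List[float]:
    """AM sampling: keep the largest daily value of each year (RX1D, TXX)."""
    return [max(daily_by_year[year]) for year in sorted(daily_by_year)]


def peaks_over_threshold(daily_by_year: Dict[int, List[float]],
                         threshold: float) -> List[float]:
    """POT sampling: keep every daily value above the threshold (R25, SU)."""
    return [v for year in sorted(daily_by_year)
            for v in daily_by_year[year] if v > threshold]


# Toy daily rainfall in mm (not observed data).
rain = {1963: [0.0, 12.5, 31.0, 80.2], 1964: [5.0, 26.1, 0.0, 24.9]}
rx1d = annual_maxima(rain)               # one value per year
r25 = peaks_over_threshold(rain, 25.0)   # all days with more than 25 mm
rate = len(r25) / len(rain)              # mean number of exceedances per year
print(rx1d, r25, rate)
```

AM sampling yields exactly one value per year, whereas POT sampling can yield several values or none in a given year, which is why the mean exceedance rate is needed later when return periods are derived from a GPD fit.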

Trends of extreme precipitation and extreme temperature
Compared with the well-established M-K method, the MM-K trend test can remove the influence of autocorrelation (Zhang et al. 2018). Sen's slope estimation was applied to calculate the magnitude of the trend, as it has been used extensively for meteorological time series and is equally applicable where data gaps exist (Atta & Dawood 2017). Linear regression is an important and commonly used parametric method for identifying a monotonic trend in a time series (Tabari & Marofi 2011).
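As a rough illustration of the trend statistics, the sketch below computes the classical Mann-Kendall Z statistic and Sen's slope using only the Python standard library. It deliberately implements the classical M-K test without tie correction, not the modified MM-K variant used in this study, and the function names and demonstration series are hypothetical.

```python
import math
from statistics import median


def mann_kendall_z(x):
    """Classical Mann-Kendall Z statistic (no tie or autocorrelation correction)."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


def sen_slope(x):
    """Sen's slope: median of all pairwise slopes, in units per time step."""
    n = len(x)
    slopes = [(x[j] - x[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


# Synthetic annual series (not the basin data): downward drift plus oscillation.
demo = [120.0 - 0.9 * k + 8.0 * math.sin(k) for k in range(55)]
z = mann_kendall_z(demo)
slope = sen_slope(demo)
print(f"Z = {z:.2f}, |Z| > 1.96: {abs(z) > 1.96}")
print(f"Sen's slope = {slope:.3f} per year = {10 * slope:.2f} per decade")
```

Multiplying a yearly slope by 10 gives the per-decade rate, which is the same conversion that turns a regression slope of −0.8735 mm per year into 8.735 mm/10a.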
The results of the extreme precipitation and extreme temperature trend analysis are shown in Figures 3 and 4, and Table 1. From Figure 3, we can infer that the regression slopes of R25 and RX1D are −0.0169 and −0.8735, respectively, indicating that extreme rainfall in the Chengbi River Basin decreases at rates of 0.169 and 8.735 mm/10a. In addition, the RX1D series is highly variable (Figure 3(b)), with values ranging from 31.34 mm (2005) to 238.95 mm (1967). The SU series (Figure 4(a)) is relatively stable. However, the TXX series (Figure 4(b)) shows a decreasing trend with a regression slope of −0.0679 and a mean of 37.5°C, with a maximum value of 44.5°C (1991). Table 1 shows the results of the trend significance analysis, from which it can be seen that all the series show a significant downward trend except for the SU series. Thus, climate change has a great influence on the RX1D, R25, and TXX series (significant downward trend), while SU is more resilient to climate change.

Mutation point determination
The results of the mutation point analysis for extreme rainfall and extreme temperatures are shown in Figure 5. As can be seen from Figure 5(a), in the first iteration of the segmentation process for the R25 series, $P(t_{\max}) = 1 > P_0$, so a mutation point is identified at 1983. In the second and third iterations, $P(t_{\max}) = 0.468 < P_0$, so no mutation point is identified. At the end of the third iteration, the segment length is less than $l_0$ and the segmentation process ends; it can therefore be concluded that the mutation year of the R25 series is 1983. Figure 5(b) shows the segmentation of the RX1D series. During the first iteration, a mutation point (1981) is identified with $P(t_{\max}) = 0.997 > P_0$. In the second iteration, $P(t_{\max}) = 0.540 < P_0$, so no mutation point is identified. Because the segment length is greater than $l_0$, a third iteration is carried out, but it also identifies no mutation point, with $P(t_{\max}) = 0.495 < P_0$. At this point, the segment length is still greater than $l_0$, so a fourth iteration is performed, but again no mutation point is identified ($P(t_{\max}) < P_0$). Two mutation points (1964 and 2002) are identified in the SU series (Figure 5(c)) and one mutation point (1997) in the TXX series (Figure 5(d)).
To sum up, for the R25 series, the mutation point is 1983, so 1963-1983 is regarded as the pre-climate change series and 1984-2017 as the post-climate change series; for the RX1D series, the mutation point is 1981, so 1963-1981 is regarded as the pre-climate change series and 1982-2017 as the post-climate change series; for the TXX series, the mutation point is 1997, so 1963-1997 is regarded as the pre-climate change series and 1998-2017 as the post-climate change series. It is worth noting that mutation points too close to the start year of the series are relatively unreliable and should be discarded, as suggested by Li et al. (2014); therefore, for the SU series, the mutation point is 2002, 1963-2002 is regarded as the pre-climate change series, and 2003-2017 is regarded as the post-climate change series.

Impact of climate change on the correlation between climate indices and extreme meteorology
The cross-wavelet transform can reveal detailed correlations between two time series in both the time and frequency domains (Torrence & Compo 1998). It is important to choose representative climate indices before the analysis. The ENSO is the Earth's strongest climate fluctuation on inter-annual timescales. The MEI is considered the most representative index, as it links six different meteorological parameters measured over the tropical Pacific, and the ENSO has also been gauged using the ONI (Mazzarella et al. 2013). In addition, solar activity is a major factor in global climate change, and sunspots are the most basic and obvious manifestation of solar activity (Gray et al. 2009).
The impacts of climate change on the correlation between climate indices and extreme meteorology are shown in Figures 6-9. As can be seen in Figure 6(a), before climate change, the MEI has a significant negative correlation with R25, with a significant main period of 2.4-4.2 years during 1963-1974; the mean phase of the significant regions is about 135°, implying that the MEI leads R25 by nearly 1.5 years. In contrast, after climate change (after 1983), the MEI has a significant positive correlation with R25; during 1984-2000 the significant main period is 3.1-5.8 years, and the mean phase of the significant regions is about 45°, indicating that the MEI lags R25 by nearly 2.1 years. The correlation between R25 and the ONI (Figure 6(b)) is similar to that with the MEI, except that during 1982-1990 there is a significant main period of about 12.8-13.1 years with a positive correlation. Figure 6(c) shows the effect of sunspots on R25. Before climate change, there is a significant main period of 8.5-14 years during 1966-1983, and the average phase of the significant regions is about 90°, indicating that sunspots lag R25 by nearly 2.5 years. After climate change, there is also a positive correlation between R25 and sunspots, with a significant period of about 8.3-14 years during 1984-2010. The average phase of the significant region is about 25°, indicating that the sunspot index leads R25 by nearly 1.9 years.

Impact of climate change on frequency distribution parameters
To explore the influence of climate change on the parameters of the GEV and GPD models, the parameters for the RX1D, TXX, R25, and SU series before and after climate change were estimated by the maximum likelihood estimation (MLE) method, and the results are shown in Tables 2 and 3. A reasonable choice of threshold is important for parameter estimation. For the sake of brevity, more details can be found in Zhao et al. (2019) and Huo et al. (2014). For the GEV model (Table 2), after climate change, all parameters show a decreasing trend except for the k parameter of the TXX series, which shows an increasing trend. Table 3 shows the changes in the GPD model parameters before and after climate change. From Table 3, it can be seen that the parameters of the SU series have not changed significantly, but the two parameters (k and α) of the R25 series after climate change are smaller than those before climate change. It is noteworthy that the k parameter of the R25 series becomes negative after climate change. Therefore, the GPD model parameters change less than the GEV model parameters, which suggests that the GPD model is more robust to such changes. In summary, climate change affects the parameters of the probability distribution models, and the effect on the GEV model parameters is more significant than that on the GPD model parameters.
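The before/after parameter estimation can be sketched as follows, assuming NumPy and SciPy are available. This is an illustrative sketch rather than the procedure used for Tables 2 and 3: the helper names are hypothetical, the data are synthetic, and SciPy's parameterization is an assumption about convention (for `genextreme` the shape `c` has the opposite sign to the ξ commonly used in extreme value theory, while for `genpareto` the shape `c` equals ξ), so fitted values need not match the k and α reported here.

```python
import numpy as np
from scipy.stats import genextreme, genpareto


def split_at(years, values, mutation_year):
    """Split a series into pre- (<= mutation year) and post-change parts."""
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    return values[years <= mutation_year], values[years > mutation_year]


def fit_gev(annual_maxima):
    """Maximum likelihood GEV fit to an annual-maximum series."""
    c, loc, scale = genextreme.fit(annual_maxima)
    return {"shape": c, "loc": loc, "scale": scale}


def fit_gpd(values, threshold):
    """Maximum likelihood GPD fit to the excesses over a fixed threshold."""
    excess = np.asarray(values, dtype=float) - threshold
    excess = excess[excess > 0]
    c, _, scale = genpareto.fit(excess, floc=0.0)  # location fixed at zero
    return {"shape": c, "scale": scale}


# Synthetic annual maxima for 1963-2017 (not the basin data).
years = np.arange(1963, 2018)
rx1d = genextreme.rvs(-0.1, loc=100.0, scale=30.0, size=years.size,
                      random_state=0)
pre, post = split_at(years, rx1d, mutation_year=1981)
print("pre :", fit_gev(pre))
print("post:", fit_gev(post))
```

Because the pre-change series here has only 19 values, its estimates are far more uncertain than those of the longer post-change series, which is worth keeping in mind when parameter changes are compared.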

Impact of climate change on the return period
The effects of climate change on the return period are shown in Figures 10 and 11, and Tables 4 and 5. Figure 10 shows the fit of the GEV model to the RX1D and TXX series before and after climate change, and Table 4 shows the return period of the RX1D and TXX series fitted by the GEV model before and after climate change. As shown in Table 4, there is a downward trend in the RX1D and TXX values corresponding to the same return period after climate change, and the RX1D value changes more significantly than the TXX value. Taking the 100-year return period RX1D as an example, the RX1D value before climate change is 294.6 mm, and after climate change, the RX1D value decreased to 146.6 mm. Figure 11 shows the fitting results of the GPD model before and after climate change for R25 and SU series, while Table 5 shows the results of the return periods for R25 and SU series before and after climate change. From Table 5, it can be clearly seen that the R25 values corresponding to the same return period show a significant downward trend after climate change. Taking the 100-year return period R25 value as an example, the R25 value before climate change is 195.7 mm, while it decreases to 100.6 mm after climate change. Compared with extreme rainfall, the extreme temperature index (SU) corresponding to the same return period has little change before and after climate change, with the maximum change of 0.5°C (10-year return period) and the minimum change of 0.1°C (100-year return period).
Figure 11 | Fitting results of GPD distribution before and after climate change for the R25 sequence (a,b) and the SU sequence (c,d).
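For reference, the value associated with a given return period can be computed from fitted parameters with the standard quantile formulas, as in the sketch below. The function names and the parameter values in the demonstration are hypothetical, and the shape parameter is written in the ξ convention (positive for a heavy upper tail), which may differ in sign from the k reported in Tables 2 and 3.

```python
import math


def gev_return_level(T, loc, scale, xi):
    """Annual-maximum value exceeded on average once every T years (GEV)."""
    y = -math.log(1.0 - 1.0 / T)
    if xi == 0.0:                      # Gumbel limit
        return loc - scale * math.log(y)
    return loc + scale / xi * (y ** (-xi) - 1.0)


def gpd_return_level(T, threshold, scale, xi, rate):
    """T-year return level for a POT series fitted with a GPD.

    `rate` is the mean number of threshold exceedances per year.
    """
    m = rate * T                       # expected number of exceedances in T years
    if xi == 0.0:                      # exponential limit
        return threshold + scale * math.log(m)
    return threshold + scale / xi * (m ** xi - 1.0)


# Hypothetical parameters, for illustration only.
for T in (10, 50, 100):
    gev = gev_return_level(T, loc=90.0, scale=30.0, xi=0.1)
    gpd = gpd_return_level(T, threshold=25.0, scale=15.0, xi=0.05, rate=8.0)
    print(f"T = {T:3d} a: GEV {gev:6.1f} mm, GPD {gpd:6.1f} mm")
```

A decrease in the location, scale, or shape parameter lowers the return level at every return period, which is how a change in fitted parameters translates into a change such as the one reported for the 100-year RX1D value.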

Impact of climate change on the model performance
To further explore the impact of climate change on the goodness of fit of the GEV and GPD models, the K-S test was used to evaluate the fit of the two models before and after climate change. Figure 12 shows the K-S test results before and after climate change. The results show that the GEV and GPD fits both before and after climate change pass the K-S test at α = 0.05. Except for the RX1D series, the GEV and GPD models fit the series after climate change better than the series before climate change. Note that the K-S test statistic for the GPD ranges from 0.0377 to 0.0712, whereas the K-S test statistic for the GEV distribution ranges from 0.0762 to 0.1506, which indicates that the GPD is the optimal distribution for the extreme rainfall and extreme temperature series in this study.
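The K-S statistic itself is simple to compute, as the following standard-library sketch shows. The function names and the GEV parameters in the demonstration are hypothetical, the shape parameter uses the ξ convention, and the critical value is the usual large-sample approximation, which is only approximate for short series and is not strictly valid when the distribution parameters are estimated from the same data.

```python
import math


def gev_cdf(x, loc, scale, xi):
    """GEV cumulative distribution function (xi convention)."""
    z = (x - loc) / scale
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:                       # outside the support
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical(n, alpha=0.05):
    """Asymptotic critical value, about 1.36 / sqrt(n) for alpha = 0.05."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)


# Hypothetical sample and parameters, for illustration only.
sample = [62.0, 75.5, 81.0, 88.2, 93.7, 101.4, 110.9, 124.3, 140.8, 171.6]
d = ks_statistic(sample, lambda x: gev_cdf(x, loc=85.0, scale=25.0, xi=0.05))
print(f"D = {d:.4f}, critical value = {ks_critical(len(sample)):.4f}, "
      f"pass = {d < ks_critical(len(sample))}")
```

A smaller statistic means the fitted curve follows the empirical distribution more closely, which is the sense in which the GPD statistics quoted above indicate a better fit than the GEV statistics.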

SUMMARY AND CONCLUSIONS
The main results are as follows: (1) the extreme rainfall shows a significant downward trend, while for extreme temperatures, the TXX values show a slight downward trend, and the SU value shows a more stable trend.
(2) The mutation points in the extreme rainfall series (1981 and 1983) are earlier than those in the extreme temperature series (1997 and 2002). (3) Climate change has an impact on the relationship between climate indices and extreme meteorology. The impact is mainly seen in the change between positive and negative correlation, the change in the magnitude of the main cycle, and the change in the phase angle before and after climate change. (4) With the change in time and climate, the parameters of the GPD model show a smaller change than those of the GEV model. After being affected by climate change, the extreme meteorology index values corresponding to the same return period tend to decrease. In addition, except for the RX1D series, the GEV and GPD models fit the series after climate change better than the series before.
Figure 12 | Goodness-of-fit by the K-S test.
Trend analysis shows that extreme rainfall and extreme temperature in the Chengbi River Basin have a significant downward trend. The result is consistent with Sun et al. (2018) and Wang et al. (2013a, 2013b), who found that the northwestern part of Guangxi (Baise) is a low-value area for extreme rainfall and that extreme rainfall and extreme temperature show a decreasing trend. However, it is worth noting that our results are quite different from those for other basins in China. For example, some studies found that extreme temperature and extreme precipitation in most parts of China show an overall upward trend (Ji & Duan 2020; Kong 2020). In the Qinghai-Tibet Plateau, the general trend of temperature has been rising, with an especially significant warming trend since the late 1990s (Wang et al. 2015). The reason may be that, because of the country's vast territory and complex climate, the trends of extreme precipitation and extreme temperature in China have obvious regional characteristics in the context of global warming (Gao & Xie 2014).
Besides that, the reason for this difference may lie in the characteristics of karst land. The rock in karst land is soluble, which makes it hard to retain moisture; accordingly, the evaporation characteristics differ from those of other places. Evaporation in turn changes the atmosphere, thereby changing local hydrological processes (Wang et al. 2013a, 2013b). The mutation points identified in this study are mainly concentrated in the 1980s-1990s; at the same time, many studies have indicated that the 1980s-1990s was a period in which the climate in China was more significantly affected by global warming (Chen et al. 2004; Wang et al. 2004). In addition, modeling techniques for hydrological prediction have been widely applied in hydrological sciences in recent years (Wu & Chau 2013; Zhao et al. 2018; Shamshirband et al. 2020). However, due to the influence of climate change, the relevant hydrological variables will likely show increasing or decreasing trends. Therefore, it is recommended that local policy-makers and hydrological managers fully consider the unique effects of climate change in the karst area when using modeling techniques for hydrological forecasting or water hazard management.
It is also interesting to find that the model fit is affected to some extent when the amount of sequence data is reduced. The GPD model is less affected by climate change, which may be due to its smaller number of parameters and a sufficient amount of sequence data. Besides that, the GPD belongs to the short-tailed distributions, and Chen & Ren (2019) concluded that the GPD model is generally superior to the GEV model. There are two outliers in the fitting result graph. Outliers in distribution fitting are inevitable: further comparison shows that the distribution fitting diagrams of other authors also contain outliers. Moreover, the outliers do not affect the interpretation of the results.
The novelty of this paper lies in the use of the Heuristic segmentation method, which can reduce the deviation arising from the non-stationarity of the time series and make the results of the mutation analysis more reasonable and accurate. A cross-wavelet method is used to analyze the influence of climate change on the correlation between climate indices and extreme meteorology, which shows whether the correlation varies before and after climate change. In addition, this study provides a comprehensive analysis of the effects of climate change on the model parameters, return periods, and model fit in extreme frequency analysis.
The key assumption of this study is that the series before the mutation point is regarded as the pre-climate change series, for which the impact of climate change on extreme meteorology is negligible, and the series after the mutation point is regarded as the post-climate change series, for which the impact of climate change on extreme meteorology becomes significant. Undeniably, this assumption influences the interpretation of the results. In this study, the plausibility of the assumption is addressed in two ways. First, the mutation point test is widely applied to analyze the effects of climate change and has generated fruitful results in the field of hydrometeorological science. Second, the assumption can also be supported by checking whether the variation before and after the mutation point is significant. Taking the 100-year return period RX1D as an example, as shown in Table 4, the RX1D value decreased from 294.6 to 146.6 mm after climate change.
However, some shortcomings need to be addressed. In this study, we concentrate on only one basin in Guangxi Province due to the lack of data, and it is not realistic to extrapolate the findings to larger spatial scales until sufficient data from other basins have been collected. Moreover, the series lengths before and after climate change were not kept consistent, which may introduce some errors in the results. At present, there is no authoritative method to resolve the potential interference caused by the different lengths of the series before and after climate change. Therefore, these problems are worth exploring and solving in future research.