ABSTRACT
Climate change and human activities significantly affect hydrological time series. Because these factors have mixed impacts on runoff, identifying the exact time at which the statistical regime of a runoff series begins to change is usually complicated. The regional or spatial relationship among hydrologic time series, as well as the temporal correlation within multivariate time series, can provide valuable information for analyzing change points. In this paper, a spatio-temporal multivariate method based on copula joint probability, namely the copula-based sliding window method, is developed for detecting change points in hydrological time series. The developed method is especially suited to watersheds that are subjected to intense human-induced changes. It uses the copula-based likelihood ratio (CLR) for analyzing nonstationarity and detecting change points in multivariate time series. To evaluate its applicability and effectiveness, the method is applied to detect change points in multivariate runoff time series in the Zayandehrud basin, Iran. The results indicate that the proposed method could locate three change points in the multivariate runoff time series (years 1985, 1996, and 2003), while the Cramer–von Mises (CvM) criterion method identifies only one of these change points (year 1985).
HIGHLIGHTS
A multivariate approach is proposed for detecting change points in runoff time series.
Copula-based likelihood ratio (CLR) is used for analyzing runoff nonstationarity.
The copula joint probability and sliding window methods are used in the approach.
The results of CLR are compared with those of the Cramer–von Mises (CvM) method.
The approach can be applied to watersheds subjected to intense human interventions.
INTRODUCTION
Climate change and human activities impose nonstationarity on hydrological variables. This can happen due to increases in temperature and evaporation, land use changes, the implementation of water transfer projects, and other urban and industrial developments. A nonstationary time series no longer has the statistical properties or moments that it had before the nonstationarity occurred. When a hydrological variable such as runoff discharge is nonstationary, predicting its variations, especially during extreme flood and drought events, can become very challenging. In addition, most hydrological models assume that the time series is stationary. Thus, detecting and analyzing nonstationarities in hydrologic variables is necessary. Due to the mixed impact of climate change and human activities on hydrologic variables, especially runoff discharge, identifying the exact time at which a statistical change in a runoff time series begins is usually complicated. A change point (breakpoint) in a time series is a point in time at which statistical characteristics, such as the mean and standard deviation, change abruptly. Many studies have addressed the identification of change points in time series. In an early study, McGilchrist & Woodyer (1975) addressed nonstationarity in an 88-year time series of annual rainfall in New South Wales, Australia, by developing a formal statistical test. Perreault et al. (2000) applied the univariate Bayesian approach for change point detection in the mean of multivariate series. They demonstrated the performance of this method by applying it to six runoff time series in Quebec, Canada. More recently, Ray et al. (2019) analyzed the homogeneity and the presence of a change point in the time series of average temperature for the period 1941–2012 using three tests: cumulative deviation (Buishand 1982), standard normal homogeneity (Alexandersson 1984), and Wilcoxon rank-sum (Johnson & Bhattacharyya 1977). 
They showed that urbanization trends affected temperature changes. Xie et al. (2019) evaluated the performances of 12 parametric and nonparametric methods of change point detection in hydrologic time series. They stated that their results could be useful for the selection of a method for detecting change points and hydroclimatic variability. Wang et al. (2020) studied two types of estimators for detecting change points and concluded that penalization methods on complex data types, which involve fewer tuning parameters, are a better choice. Most of the previous studies on nonstationarity were based on univariate analyses (e.g. Villarini et al. 2009; Li et al. 2014a, 2014b; Zhang et al. 2016; Minaei & Irannezhad 2018; Zhang et al. 2018; Farsi & Mahjouri 2019; Farsi et al. 2020). In recent years, more studies have focused on multivariate analyses of nonstationary time series (Xiong et al. 2015). Some of these studies are Holmes et al. (2013), Cabrieto et al. (2017), Sundararajan & Pourahmadi (2018), and Akbari & Reddy (2019). Holmes et al. (2013) developed a method based on the Cramer-von Mises (CvM) criterion (Cramér 1928; von Mises 1928; Anderson 1962) test statistic for detecting change points in a multivariate time series. Most of the developed methods were intended to determine whether all observations of a time series were from a specified probability distribution function (PDF). However, those methods were applicable when there was only one change point in a time series. Xiong et al. (2015) presented a framework for detecting change points in a multivariate time series by applying CvM and copula-based likelihood ratio (CLR) methods. Their results showed that the CLR method performed better than the CvM method. Zhou et al. (2019) conducted a comparative analysis for detecting change points using nonparametric methods such as Pettitt, CvM, and Cumulative Sum (CUSUM). Their results showed that the CvM method did better at detecting change points. Kovács et al. 
(2020) proposed a seeded binary segmentation approach for identifying a single change point in a time series. This approach relied on a deterministic construction of background intervals, called seeded intervals. Alhathloul et al. (2021) investigated the annual and seasonal horizontal visibility rates at 23 locations in Saudi Arabia, based on data from 1985 to 2018, by applying multiple trend analysis methods. They concluded that degradation in visibility usually corresponded to high rates of warming. Morabbi et al. (2022) proposed a measure of similarity based on the location of change points. Their measure was intended especially for detecting regions where hydrological stations were not linearly correlated or where the relation between their data changed with time. Previous research has investigated the application of different statistics for detecting change points in a time series. In addition, temporal multivariate analyses of hydrologic nonstationarities have been carried out. To the best of the authors' knowledge, an approach for detecting change points in hydrologic time series that considers the regional or spatial information contained in a multivariate time series has not yet been proposed. Considering the spatial relationship among hydrologic time series and the temporal relationship within multivariate time series simultaneously can provide valuable information for analyzing nonstationarities in hydrologic time series. In this paper, we develop a new approach based on spatio-temporal multivariate analyses for detecting change points in hydrological time series, which is especially suited to watersheds that are subjected to intense human-induced changes. The proposed approach is based on the CLR and CvM methods for analyzing nonstationarities and detecting change points in multivariate time series. 
To validate the suggested approach, human-induced changes in the watershed, including land use changes (such as variation in the area of agricultural lands) and water transfer projects, are carefully studied. To evaluate its applicability and effectiveness, we apply the developed methodology to detect change points in the runoff time series of the Zayandehrud river basin, Iran. The analyses are carried out for two cases. In the first case, water transfer to the study area upstream of the hydrometric stations is included in the runoff series. In the second case, the time series of the water transfer discharge is subtracted from the observed time series of the river discharge at the hydrometric stations. The results of these two cases are compared.
Thus, the main questions of the research are: How does the copula-based sliding window (CSW) method perform at determining breakpoints in multivariate hydrological time series, compared to other methods? Does the proposed approach have acceptable performance in basins that are affected by significant anthropogenic changes? In the following sections, we describe the developed methodology (Section 2). Then, we present the main characteristics of the study area (Section 3). Next, in Section 4, we present the results of applying the proposed approach to the study area. Finally, concluding remarks are provided in the conclusion section.
METHODOLOGY
Investigating stationarity in runoff time series
Stationarity in a time series means that the statistical properties of that series remain constant over time. Based on this definition, the mean value function of a stationary time series should be constant and should not depend on time t (Aminikhanghahi & Cook 2017). There are several parametric and nonparametric (distribution-free) methods of investigating stationarity, and different types of nonstationarity can be observed in a runoff time series. In this section of the methodology, five common methods of detecting nonstationarity are used. These methods include both parametric and nonparametric tests, and each of them is based on a different concept. Thus, nonstationarity in the runoff time series can be identified more reliably. These methods are the standard normal homogeneity test (SNHT) (Alexandersson 1984), the Mann–Kendall (MK) test (Kendall 1938; Mann 1945), the Buishand range (BR) test (Buishand 1982), the Augmented Dickey–Fuller (ADF) test (Dickey & Fuller 1979), and the Pettitt test (Pettitt 1979). Details on these tests are briefly discussed in the Appendix.
Identifying change point(s) in runoff time series based on univariate analysis
In the univariate analysis, change point locations in a time series can be identified either by using statistical tests or by analyzing the PDF of the data.
Change point detection in runoff using statistical tests
In this paper, four different methods, namely the SNHT, BR, and Pettitt tests and the method of variable fuzzy sets (Li et al. 2014a, 2014b), are applied to detect a change point in runoff data. The null hypothesis of the SNHT, BR, and Pettitt tests is that the series is homogeneous and contains no change point. Under the alternative hypothesis, the series is nonhomogeneous and there is a change in the mean of the series. The SNHT is sensitive to changes near the beginning and the end of the series, whereas the BR and Pettitt tests are more likely to identify a break in the middle of the series. Moreover, the SNHT and BR tests assume that the series is normally distributed, whereas the Pettitt test, being a nonparametric rank test, does not require this assumption (Kang & Yusof 2012). In contrast, the method of variable fuzzy sets considers a reference period in the time series and compares it with other periods to detect change points (Li et al. 2014a, 2014b). All of these methods use only one statistical parameter, which captures only a portion of the properties of the time series (the reader is referred to the Appendix). In the current methodology, we try to use as much of the information contained in the time series as possible for detecting change points.
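As an illustration of one of the rank-based tests named above, the following is a minimal Python sketch of the Pettitt test in its common textbook form. It is not the authors' code: the function name `pettitt_test`, the brute-force loop, and the approximate p-value formula p ≈ 2 exp(−6K²/(n³ + n²)) are assumptions made for this example.

```python
import math


def pettitt_test(x):
    """Pettitt change point test (illustrative sketch).

    Returns (t, k, p): the split index t such that x[:t] and x[t:] are the
    two segments, the statistic K = max|U_t|, and an approximate p-value.
    """
    n = len(x)

    def sign(d):
        return (d > 0) - (d < 0)

    best_t, best_abs_u = 0, -1
    for t in range(1, n):
        # U_t compares every value before the split with every value after it.
        u = sum(sign(x[i] - x[j]) for i in range(t) for j in range(t, n))
        if abs(u) > best_abs_u:
            best_t, best_abs_u = t, abs(u)

    k = best_abs_u
    # Commonly used approximation of the significance probability.
    p = min(1.0, 2.0 * math.exp(-6.0 * k * k / (n ** 3 + n ** 2)))
    return best_t, k, p


# Synthetic series (not data from the study) with a shift in the mean.
series = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.1, 4.9, 5.2,
          8.0, 8.2, 7.9, 8.1, 8.3, 7.8, 8.0, 8.2, 8.1, 7.9]
t, k, p = pettitt_test(series)
print(t, k, p)
```

The triple loop is cubic in the series length, which is acceptable for annual series of a few decades; a rank-based formulation would be preferable for long monthly records.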
Change point detection in runoff time series by analyzing changes in the probability distribution function
θi denotes the vector of parameters of the selected PDF in window i. If θi is statistically constant for all windows, then it is inferred that there is no change point in the runoff time series.
The window in which θi changes indicates the location of a change point in the runoff series; a time series can contain more than one change point. Change points in the runoff series can be located more easily by plotting the parameters against the window index i.
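The window-by-window parameter idea can be sketched as follows. This is only an illustration: the choice of a normal distribution (so that the parameter vector of each window is its mean and standard deviation), the function name `window_parameters`, and the `step` argument are assumptions, whereas the paper fits a selected PDF that is not specified in this section.

```python
import statistics


def window_parameters(series, window, step=1):
    """Estimate (mean, standard deviation) in each sliding window.

    Returns a list of (start_index, mean, std) tuples, one per window.
    A normal distribution is assumed purely for illustration.
    """
    params = []
    for start in range(0, len(series) - window + 1, step):
        chunk = series[start:start + window]
        params.append((start, statistics.fmean(chunk), statistics.stdev(chunk)))
    return params


# Synthetic example (not data from the study): the mean jumps at index 4.
runoff = [10, 12, 11, 13, 30, 32, 31, 33]
for start, mean, std in window_parameters(runoff, window=4, step=4):
    print(start, round(mean, 2), round(std, 2))
```

Plotting the returned parameters against the window start index gives the kind of parameter-versus-window plot described in the text.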
Spatio-temporal multivariate analysis of change point(s) in runoff time series
To conduct the spatio-temporal multivariate analysis, we consider different hydrometric stations. Rainfall and runoff time series from existing stations are collected. The proposed CSW method and the spatio-temporal method based on the Cramér–von Mises (CvM) criterion are discussed in detail in the following sections.
Change point(s) detection in runoff using the CvM method
Spatio-temporal analysis of change point(s) in runoff using CSW method
In this paper, Archimedean copula functions, which are well known in hydrological applications, are used to model the dependence structure between the runoff time series at two hydrometric stations. To fit a copula to the data, the parameters of each copula type are estimated. This class of copulas can be fitted to a variety of dependence structures and does not require the marginal distribution functions to be the same (Embrechts et al. 2003). Table 1 shows the four kinds of Archimedean copulas used in this paper.
Name . | Function . | Limit . |
---|---|---|
Galambos | ||
Gumbel–Hougaard | ||
Clayton | ||
Frank |
θ, vector of copula parameters; u, marginal distribution function 1; v, marginal distribution function 2; c, copula joint density function.
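For reference, the four copula families named in Table 1 can be written in their standard textbook bivariate forms. The Python sketch below gives the joint distribution functions C(u, v; θ); these are the usual parameterizations and may differ from the exact forms and limits the authors listed in the table. The function names are hypothetical, and the inputs are assumed to lie strictly inside (0, 1) for the Galambos form and in (0, 1] for the others.

```python
import math


def clayton_cdf(u, v, theta):
    """Clayton copula, theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_hougaard_cdf(u, v, theta):
    """Gumbel-Hougaard copula, theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def frank_cdf(u, v, theta):
    """Frank copula, theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def galambos_cdf(u, v, theta):
    """Galambos copula, theta > 0, with u and v strictly inside (0, 1)."""
    s = (-math.log(u)) ** -theta + (-math.log(v)) ** -theta
    return u * v * math.exp(s ** (-1.0 / theta))


# Every copula must satisfy C(u, 1) = u; checked here for three families.
print(clayton_cdf(0.3, 1.0, 2.0))
print(gumbel_hougaard_cdf(0.3, 1.0, 2.0))
print(frank_cdf(0.3, 1.0, 2.0))
```

The boundary condition C(u, 1) = u is a quick sanity check on any implementation of these formulas.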
– The best copula function is selected for multivariate series, which is constructed based on marginal runoff time series.
– Windows of a predetermined size (say 10 years or 120 months) are moved along the multivariate time series.
– For each window, the values of the parameters of the best-fitted distribution (θi) are computed.
– Finally, the difference between the copula PDF parameters of each window and those of the whole time series is determined. If the difference is significant, the null hypothesis (H0) is rejected and a change point is detected. The null hypothesis states that the dependence structure, as described by the copula parameters, does not change over time (Equation (1)).
If H0 is rejected, then the alternative hypothesis (H1) is true, which means there is at least one change point in the time series. At this point, the values of the PDF parameters change abruptly, taking one value before the change point in the dependence structure and another after it.
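The sliding-window steps above can be sketched in Python under simplifying assumptions. Instead of the likelihood-ratio test used in the paper, this example estimates the dependence in each window with Kendall's tau and converts it to a Clayton parameter through the standard relation θ = 2τ/(1 − τ). The Clayton family, the function names, and the `step` argument are all assumptions made for illustration, and no significance test is performed.

```python
import math


def kendall_tau(x, y):
    """Kendall's tau (tau-a) computed by direct pair counting."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def clayton_theta(tau):
    """Clayton parameter from Kendall's tau; infinite for perfect dependence."""
    if tau >= 1.0:
        return math.inf
    return 2.0 * tau / (1.0 - tau)


def sliding_copula_parameters(x, y, window, step=1):
    """Return (start_index, tau, theta) for each window of the paired series."""
    out = []
    for start in range(0, len(x) - window + 1, step):
        tau = kendall_tau(x[start:start + window], y[start:start + window])
        out.append((start, tau, clayton_theta(tau)))
    return out


# Synthetic paired series (not data from the study): the dependence
# between the two stations reverses halfway through the record.
q1 = list(range(10))
q2 = [0, 1, 2, 3, 4, 9, 8, 7, 6, 5]
for row in sliding_copula_parameters(q1, q2, window=5, step=5):
    print(row)
```

Comparing each window's estimate with the estimate for the whole record, and testing whether the difference is significant, is the part that the likelihood-ratio statistic handles in the method described above.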
THE STUDY AREA AND DATASETS
RESULTS AND DISCUSSION
Values of autocorrelation (lag-1 to lag-4) for the monthly time series of the Ghaleh-Shahrokh and Eskandari stations, which are used in the analysis, are shown in Table 2. The lag-1 autocorrelation of each individual series is significant at the 5% level. The long-term persistence of each series is investigated using the Hurst exponent (H) (Hurst 1951). The results show that short-term dependence exists in all series, while no significant long-term dependence is seen: all calculated values of the Hurst exponent are close to 0.5, which indicates the absence of long-term persistence.
Variable . | Lag-1 autocorrelation . | Lag-2 autocorrelation . | Lag-3 autocorrelation . | Lag-4 autocorrelation . | Hurst exponent . |
---|---|---|---|---|---|
Rainfall in Ghaleh-Shahrokh station | 0.189 | −0.081 | 0.00002 | 0.023 | 0.456 |
Runoff in Ghaleh-Shahrokh station | 0.405 | 0.249 | 0.045 | −0.087 | 0.524 |
Evaporation in Ghaleh-Shahrokh station | 0.618 | 0.312 | 0.191 | 0.079 | 0.467 |
Runoff in Eskandari station | 0.518 | 0.254 | 0.026 | −0.088 | 0.465 |
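The lag-k autocorrelation values of the kind reported in Table 2 can be computed as in the following sketch. The approximate 5% significance bound of ±1.96/√n is the usual large-sample rule for an uncorrelated series and is an assumption here, since the text does not state which test was applied; the series length of 400 used below is hypothetical.

```python
import math


def autocorrelation(x, lag):
    """Sample autocorrelation coefficient of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def is_significant(r, n, z=1.96):
    """Approximate two-sided 5% test against the bound z / sqrt(n)."""
    return abs(r) > z / math.sqrt(n)


# Small synthetic series (not data from the study).
print(autocorrelation([1, 2, 3, 4, 5], 1))

# Two coefficients from Table 2, tested with a hypothetical length n = 400.
print(is_significant(0.405, 400), is_significant(0.023, 400))
```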
Table 3 shows the results of change point detection using statistical tests for the Ghaleh-Shahrokh runoff series. Inter-basin water transfer projects have been implemented upstream of Ghaleh-Shahrokh station and a large amount of water has been transferred to the studied basin. The impact of these projects could significantly alter the location of change points in the runoff time series. Moreover, such an alteration could have occurred due to other human-induced changes, such as agricultural developments. Therefore, in this paper, we investigate the stationarity of the runoff time series considering two cases: case 1, by considering water transfer to the study area at the upstream of the hydrometric stations (i.e. Ghaleh-Shahrokh and Eskandari stations), case 2, by subtracting the time series of the water transfer discharge from the observed time series of the river discharge at these hydrometric stations.
Statistical tests . | SNHT . | Pettitt . | Variable fuzzy sets . | Buishand . |
---|---|---|---|---|
Detected change point in case 1 | 2006 | 2006 | 2007 | 2006 |
Detected change point in case 2 | 1985 | 2006 | 2007 | 2006 |
As seen in Table 3, each statistical test detects the same change point in case 1 and case 2, except for the SNHT. In addition, most of the tests point to the year 2006 as a very likely change point, while the variable fuzzy sets method gives 2007.
For detecting change points in the multivariate hydrological series, two approaches are taken to create a multivariate time series: first, selecting time series of different variables at a specific station and, second, selecting time series of the same variable at adjacent stations. In the first approach, a trivariate time series including precipitation (Pr), evaporation (Er), and runoff (Q1) at Ghaleh-Shahrokh station is selected. In the second approach, a bivariate series including the time series of runoff at the Ghaleh-Shahrokh and Eskandari stations (Q1, Q2) is used. The results of applying the CvM method are shown in Table 4.
. | Variables . | D(k,X) . | Sk . | Sn . | Λ . | Year . |
---|---|---|---|---|---|---|
1 | Runoff, rainfall and evaporation at Ghaleh-Shahrokh | −1,631 | 31,459.29 | 31,459.29 | 18 | 1991 |
2 | Runoff of Ghaleh-Shahrokh and Eskandari | 17.9424 | 7.154 | 7.154 | 14 | 1986 |
Sk is a form of Cramer–von Mises statistic (Genest & Favre 2007). Sn and λ are, respectively, a critical value and a function that determine the maximum for test statistic (Sk).
Time series . | Autocorrelation coefficient . | Tau–Kendall coefficient . |
---|---|---|
Average runoff in case 1 | 0.53 | 0.0242 |
Average runoff in case 2 | 0.71 | 0.0263 |
The resulting change points in the runoff based on different approaches are shown in Table 6. In this step, to further analyze the detected change points, we investigate the observed changes in the hydrologic data with regard to the human-induced changes in the study area during the last decades. Based on the observations, human activities play a dominant role in reducing the runoff of the Zayandehrud River. Human activities that occurred during recent decades included agricultural development, changes in land use and cropping patterns, and inter-basin water transfer to the study area. Generally, variations in groundwater level could be another factor influencing the runoff. However, due to the mountainous nature of the majority of the study area, there are very few groundwater withdrawal wells and a large portion of the water demand in this region is supplied from surface water resources.
Analysis . | Methods . | Change point (year) (for case 1) . | Change point (year) (for case 2) . |
---|---|---|---|
Univariate analysis | SNHT | 1985 | 2006 |
Univariate analysis | Buishand | 2006 | 2006 |
Univariate analysis | Pettitt | 2006 | 2006 |
Univariate analysis | Variable fuzzy sets | 2007 | 2007 |
Univariate analysis | Best distribution parameter | – | 1985, 1996, 2006 |
Multivariate analysis | CvM | – | 1985, 1991 |
Multivariate analysis | CSW method | 1985, 1996, 2006 | 1996, 2003 |
Inter-basin water transfer began in 1975 through three different tunnels, which were launched in 1975, 1987, and 2007, respectively (Figure 6). According to Figure 6 and the observed runoff data, the impact of inter-basin water transfer projects on runoff change is obvious in some years. However, agricultural development resulting from inter-basin water transfer and a drastic increase in water withdrawal have offset this impact in some years. To study agricultural development over time, we derived the time series of the area of cultivated lands using Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images with a resolution of 250 m × 250 m. Analysis of the variations in agricultural land area based on the normalized difference vegetation index (NDVI) shows that, in addition to an overall trend, this variable had two stages of increase, in 1985 and 2006 (Figure 12). Analyzing the time series of inter-basin water transfer discharge, the area of agricultural lands, and runoff data (after eliminating the water transfer discharge), we can infer that the change point detected in 1985 mainly occurred because of sudden agricultural development stemming from the increased discharge of inter-basin water transfer.
After an increasing trend in the area of agricultural lands, a change point occurred in 1996, followed by a downward trend in the area of agricultural lands. This can be partially attributed to the decline in precipitation as well as in the water transfer discharge. Also, based on the time series of the area of agricultural lands obtained from satellite data, the area of orchards has increased since 2006. In contrast, crop cultivation has decreased since that year.
SUMMARY AND CONCLUSION
Climate change and anthropogenic factors are two main factors contributing to the nonstationarity of the time series of hydrological variables. These factors mostly introduce deterministic terms to the stochastic time series. Change points in the time series are one of these deterministic terms. In this paper, change point(s) in the time series of runoff were detected by developing a method based on multivariate analysis along with applying univariate methods. In multivariate analysis, a CSW method was developed and applied to bivariate runoff time series in two hydrometric stations and the results were compared to those of the CvM method. Moreover, the results were evaluated by analyzing the time series of land use changes caused by agricultural activities as well as the time series of precipitation, temperature, and evaporation. Three change points in 1985, 1996, and 2006 were discovered. Analyzing the time series of inter-basin water transfer discharge, the area of agricultural lands, and runoff data, the change point detected in 1985 was attributed to sudden agricultural development resulting from the increase in the discharge of inter-basin water transfer. The change point in 1996 occurred after an increasing trend in the area of agricultural lands, followed by a downward trend. This was partially associated with the decline in the precipitation and the water transfer discharge. Also, the third change point could be caused by the increase in the area of orchards since 2006. The results show that the proposed CSW method could detect change points with high accuracy in comparison to other change point detection methods. One of the limitations of this method is that the copula analysis for change point detection is based on the correlation of marginal series. Thus, for more reliability of the results of this method, variables in multivariate series must have an acceptable correlation. 
In the regional multivariate analysis, using adjacent hydrometric stations results in a more reliable outcome.
There was no significant groundwater resource in our study area. In future studies, we suggest investigating the variation of groundwater level along with surface runoff discharge and considering the interactions between surface water and groundwater in the multivariate analysis for detecting change points. Moreover, we suggest using remote sensing data to improve the accuracy of water balance components, such as evapotranspiration.
AVAILABILITY OF DATA AND MATERIALS
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
ETHICAL APPROVAL
The authors declare that this manuscript and the results obtained from this research have not been published or submitted elsewhere. Also, the authors declare that all three participants of this work have contributed to preparing the manuscript, and their names are mentioned as the authors of the paper.
CONSENT TO PARTICIPATE
All three authors consent to participate in this work.
CONSENT TO PUBLISH
All three authors give their consent to publish this work.
AUTHORS CONTRIBUTIONS
The contributions of the authors are detailed as follows: M. O. developed the methodology, arranged the software, prepared codes, and prepared the original draft. N. M. conceptualized the whole article, supervised the work, validated the results, reviewed, and edited the text. S. H. validated the data, prepared some figures, and wrote the original text.
FUNDING
The authors declare that no funds, grants, or other types of financial support were received during the research or preparation of this manuscript.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.