Abstract

To reveal structural change in the long-run co-integration relationship between the rainfall and runoff time series at Tangnaihai Hydrological Station in the source area of the Yellow River, and to improve the accuracy of annual runoff prediction, co-integration theory and structure change co-integration theory are applied in turn, and an error correction model of rainfall and runoff is constructed in each case. The results show that reservoir construction and climate change can cause structural change in the long-run relationship between rainfall and runoff in the source area of the Yellow River. Breakpoints appear in 1989 and 2002: the 1989 breakpoint is mainly attributed to reservoir construction, while the 2002 breakpoint is attributed to rainfall changes. The error correction model with structural change shows that the impact of rainfall on runoff decreases from 1989 but increases from 2002. Finally, for runoff prediction over the five-year prediction period, the mean absolute percentage errors of the models without and with breakpoints are 11.04% and 7.08% respectively, which shows that the error correction model with structural change has the higher prediction accuracy.

INTRODUCTION

Runoff is an important hydrological variable, and runoff prediction has long been an active topic in hydrology (Zhao et al. 2017; Zhang et al. 2018). Common runoff prediction models include the Autoregressive Moving Average model (Ab Razak et al. 2018; Wang et al. 2019), the Artificial Neural Network model (Thirumalaiah & Deo 2000; Meng et al. 2015; Piotrowski et al. 2016) and the Support Vector Regression model (Hong et al. 2012; Gizaw & Gan 2016). However, these models are limited to single-factor analysis, and it is difficult to achieve the desired prediction results with them. With the development of computer technology, many researchers have used soft computing techniques to study rainfall and runoff with high accuracy (Aytek & Alp 2008; Rezaie-Balf et al. 2017; Guru & Jha 2019). However, under the influence of many uncertain factors such as human activities, climate and underlying surface change (Zhao et al. 2009; Wang et al. 2013; Shen et al. 2017), most hydrological time series are non-stationary, which can lead to spurious regression (Lee & Yu 2009; Zhang & Sun 2015; Jin et al. 2017). Simulation and prediction of hydrological elements based on a spurious regression will therefore distort the calculation results.

Co-integration theory proposed by Engle & Granger (1987) can deal with the non-stationarity of time series. The main idea of co-integration theory is that if the linear combination of these non-stationary time series is stationary, then these time series have a co-integration relationship (Engle & Granger 1987). The greatest advantage of co-integration theory is that it can deal with the problem of the non-stationarity of variables, and it can also reveal the long-term balance relationships and short-term fluctuations between variables. Therefore, co-integration theory has been developed in the hydrological field in recent years (Cole 2004; Seung-Hoon 2007). Some scholars have used co-integration theory to predict river runoff and have achieved some important results (Zhang et al. 2006, 2013, 2017b). In addition, in order to improve prediction accuracy, the combination of other analysis methods and the co-integration method has also achieved good results (Zhang et al. 2015, 2017a).

However, co-integration theory requires that the structure of the time series is stable, that is, that there are no breakpoints; otherwise the unit root test statistics may shift. In practice, climate change, human activities and underlying surface changes may cause sudden structural changes in hydrological time series, in which case the basis of the co-integration test no longer holds. Structural changes must therefore be considered in order to capture the co-integration relationship between hydrological variables more accurately. Structure change co-integration theory is an effective method for studying time series with structural breakpoints (Campos et al. 1996; Ploberger & Krämer 1996; Johansen et al. 2000). At present, few studies have examined structure change co-integration relationships among hydrological variables; an exception is Guo et al. (2018), who discuss the change mechanism of water supply and demand in an irrigation area using structure change co-integration theory.

The objective of this paper is threefold. First, the relationship between rainfall and runoff in the source area of the Yellow River is analyzed using co-integration theory, and an error correction model is established to predict runoff. Secondly, the endogenous breakpoint test method is used to determine the breakpoints of the runoff series. Finally, structure change co-integration theory is used to analyze the long-run relationship between the rainfall and runoff time series, and a structure change co-integration error correction model is established to predict runoff. The novelty of this paper is that an error correction model between rainfall and runoff based on structure change co-integration theory is established for the source area of the Yellow River, which not only improves runoff prediction but also reveals the profound influence of human activities and climate change on the relationship between rainfall and runoff.

RESEARCH METHODS

Co-integration theory

Co-integration describes the long-term balance relationships between time series. If a time series is not stationary but becomes stationary after one difference, it is integrated of order one, denoted by I(1). If a time series becomes stationary after d-order differencing, it is integrated of order d, denoted by I(d). A time series that is itself stationary is denoted by I(0). If two time series Xt and Yt are both I(d), that is Xt ~ I(d) and Yt ~ I(d), and there exists a β such that Yt − βXt is an I(0) process, then Xt and Yt are said to have a co-integration relationship.
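The definition above can be illustrated with a minimal sketch. The toy series, the value of `beta` and the helper `difference` are hypothetical and are not taken from the paper's data; the point is only that differencing an I(1) series recovers its shocks, and that a suitable linear combination of two series sharing a trend removes that trend.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical toy data: x accumulates shocks (random-walk-like, so I(1)).
shocks = [1.0, -0.5, 2.0, 0.5, -1.0, 1.5]
x, level = [], 0.0
for s in shocks:
    level += s
    x.append(level)

# y shares the trend of x, plus a small bounded disturbance.
noise = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1]
beta = 2.0  # assumed co-integration coefficient for this toy example
y = [beta * xi + ni for xi, ni in zip(x, noise)]

dx = difference(x)                                 # differencing recovers the shocks
combo = [yi - beta * xi for xi, yi in zip(x, y)]   # Y - beta*X leaves only the noise
```

Here `combo` contains only the bounded disturbance, which is the sense in which Yt − βXt is I(0) even though Xt and Yt are not.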

Structure change co-integration method

According to structural stability, structure change co-integration can be divided into three types: co-integration with coefficients shift, partial co-integration and co-integration with mechanistic change (Yang & Zhang 2002).

Definition: Suppose an n-dimensional time series , , is a sequential set, if there exists a subset , , , , , , , is the empty set.

  • (1)

    When , ; , , Xt is called co-integration with coefficients shift.

  • (2)

    When , ; , , Xt is called partial co-integration.

  • (3)

    When , ; , s-dimensional time series , , , Xt is called co-integration with mechanistic change.

Co-integration with coefficients shift means that the co-integration coefficients change at some points but the co-integration relationship still exists; this is an abrupt structural change. Partial co-integration means that the co-integration relationship exists only in some sub-periods, before or after certain points in the time series, and does not exist in the other sub-periods. Co-integration with mechanistic change means that the balance of the original system is destroyed and a new balance is formed through the addition of new variables.

The hydrological data has the characteristics of co-integration with coefficients shift in practical application, so it is very suitable for runoff prediction. Co-integration with coefficients shift can be divided into three forms: level shift, level shift with trend, and state shift (Gregory & Hansen 1996).

Suppose a standard co-integration regression model is , ; when , there is a co-integration relationship between yt and xt. In order to construct a structure change co-integration model, virtual variables are introduced, , , where is the time point of a break and n is the number of breakpoints. Therefore, the forms of structural break in co-integration with the coefficients shift model are as follows.

① Level shift: 
formula
(1)
where represents a constant term before a shift occurs, and represents the variation of shift.
② Level shift with trend: 
formula
(2)
where represents the coefficients before the time trend term.
③ State shift: 
formula
(3)
where this form has not only a constant term shift, but also trend term shift and slope change.
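For orientation, the single-break forms given by Gregory & Hansen (1996) can be written as below. This is a sketch in that reference's standard notation, not a recovery of the paper's own Equations (1)–(3): the symbols $\mu_1$, $\mu_2$, $\alpha$, $\beta$ and the dummy $\varphi_t$ (equal to 0 before the break and 1 after it) are assumptions here, and with several breakpoints one dummy per break would be needed.

```latex
% Level shift
y_t = \mu_1 + \mu_2 \varphi_t + \alpha x_t + e_t
% Level shift with trend
y_t = \mu_1 + \mu_2 \varphi_t + \beta t + \alpha x_t + e_t
% Regime (state) shift
y_t = \mu_1 + \mu_2 \varphi_t + \alpha_1 x_t + \alpha_2 x_t \varphi_t + e_t
```

In this notation $\mu_1$ is the intercept before the break and $\mu_2$ is the change in the intercept, which matches the description of the level shift form above. The state shift form described in the text additionally allows the trend term to shift, which would add $\beta_1 t + \beta_2 t \varphi_t$ to the last equation.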

Breakpoint test

Breakpoint tests for time series can be divided into exogenous structural breakpoint estimation and endogenous structural breakpoint tests. The former sets the breakpoint manually on the basis of experience and historical events, which is more subjective and takes little account of lagged effects. The latter identifies the breakpoint from the data themselves, which is more objective. Therefore, this paper uses unit root test methods, namely the recursive test, the rolling test and the sequential test, to examine whether there are structural breakpoints in the sequence (Banerjee et al. 1992).

Recursive test: select the first sub-sample (usually one-quarter of the original sample size), then expand the range of the sub-sample year by year. Each sub-sample is tested by the Augmented Dickey–Fuller (ADF) test with intercept and trend terms. The ADF unit root test, proposed by Dickey & Fuller (1979), is the usual method for testing the stationarity of a time series. The resulting ADF values are plotted against time to judge whether any of them fall below the critical value. If an ADF value is less than the critical value, the original sequence is a trend stationary process with a structural break at that point. The test formula is as follows: 
formula
(4)

Rolling test: select a sub-sample (usually one-third of the original sample size), keep the sub-sample size fixed while moving the window forward, and carry out the ADF test with intercept and trend terms for each sub-sample. Finally, the ADF values are compared with the critical value to determine the structural breakpoint.

Sequential test: select the test range of structural breaks as k = [0.15 T, 0.85 T], where T is the sample size. Within this range, dummy variables are used to vary the year in which a hypothetical structural break occurs, and the possibility of a break is examined for each year. The minimum value of the resulting ADF sequence is then compared with the corresponding critical value to test the unit root hypothesis and determine the structural breakpoint. The test formula is as follows: 
formula
(5)

In this formula Dt is defined in two cases:

Case 1: mean breaks model, ;

Case 2: trend breaks model, , (k is the year of break).
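The sub-samples used by the three tests can be sketched as follows. This only enumerates the windows and candidate break years; the ADF statistic itself is not computed. The function names, the integer rounding of the window sizes and the rounding of the 0.15 T and 0.85 T bounds are assumptions made for illustration.

```python
import math

def recursive_windows(years, first_fraction=0.25):
    """Expanding sub-samples: start with about a quarter of the sample, add one year at a time."""
    n = len(years)
    first = max(1, int(n * first_fraction))
    return [(years[0], years[end - 1]) for end in range(first, n + 1)]

def rolling_windows(years, fraction=1 / 3):
    """Fixed-width sub-samples (about a third of the sample) moved forward one year at a time."""
    n = len(years)
    width = max(1, int(n * fraction))
    return [(years[s], years[s + width - 1]) for s in range(0, n - width + 1)]

def sequential_candidates(years, lower=0.15, upper=0.85):
    """Candidate break years k in [lower*T, upper*T], with k counted from 1."""
    n = len(years)
    k_min = math.ceil(lower * n)
    k_max = math.floor(upper * n)
    return [years[k - 1] for k in range(k_min, k_max + 1)]

years = list(range(1966, 2010))        # the 44-year modelling period used in the paper
rec = recursive_windows(years)
rol = rolling_windows(years)
seq = sequential_candidates(years)
```

Under these rounding assumptions the 44-year sample gives 34 recursive windows, 31 rolling windows of 14 years each, and candidate break years from 1972 to 2002.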

Co-integration test

Prior to testing the co-integration relationship among variables, the stationarity of the data is checked. The ADF unit root test is usually used as follows (Dickey & Fuller 1979): 
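$$\Delta y_t = \alpha + \beta t + \delta y_{t-1} + \sum_{i=1}^{p} \zeta_i \Delta y_{t-i} + \varepsilon_t$$
(6)
where Δyt is the first-order difference of the variable yt; α, β, δ and ζi are parameters; t is time; p is the lag order; and εt is a white-noise sequence.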

If the two time series are integrated of the same order, there may be a co-integration relationship between them. The most commonly used method to test the co-integration relationship of two time series is the Engle–Granger (EG) two-step method. The specific steps are as follows (Engle & Granger 1987).

Step 1: ordinary least squares (OLS) is used to carry out a static regression of one series on the other, where the two series are integrated of the same order. This yields the co-integration equation and its residual series.

Step 2: the ADF unit root test is applied to the residual series of the co-integration equation. If the residual series is stationary, the co-integration relationship exists between the two time series.
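The two steps can be sketched in plain Python as below. This is a simplified illustration under stated assumptions: the data are a hypothetical toy series, the helper names are invented for this example, and the second step uses the simplest Dickey–Fuller regression (no constant, no lagged differences) rather than the full ADF form. The resulting statistic would still have to be compared with critical values appropriate for residual-based tests, which differ from the standard ADF tables.

```python
import math

def ols_simple(x, y):
    """Step 1: static OLS regression y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def df_tstat(e):
    """Step 2: t-statistic of rho in  de_t = rho * e_{t-1} + u_t  (no constant, no lags)."""
    lag = e[:-1]
    de = [b - a for a, b in zip(e, e[1:])]
    sll = sum(l * l for l in lag)
    rho = sum(l * d for l, d in zip(lag, de)) / sll
    u = [d - rho * l for l, d in zip(lag, de)]
    s2 = sum(ui * ui for ui in u) / (len(de) - 1)   # residual variance, one parameter estimated
    return rho / math.sqrt(s2 / sll)

# Hypothetical toy data: x is trend-like, y = 1 + 2*x plus a bounded disturbance.
x = [1, 3, 2, 5, 6, 4, 6, 7, 10, 9, 11, 12]
noise = [0.3, -0.3] * 6
y = [1 + 2 * xi + ni for xi, ni in zip(x, noise)]

intercept, slope, resid = ols_simple(x, y)
t_stat = df_tstat(resid)   # strongly negative here, because the residuals revert quickly
```

A strongly negative statistic indicates mean-reverting residuals, which is the evidence for co-integration that Step 2 looks for.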

Error correction model (ECM)

According to co-integration theory, if there is a co-integration relationship between time series, an error correction model (ECM) can be established (Zivot & Wang 2006). ECM expresses a long-term balance relationship between variables, which may fluctuate or deviate from this balance relationship in the short term. Therefore, in order to reduce the model error and improve prediction accuracy, it is necessary to incorporate the residual of the regression equation into the model as a non-balance error term. The model of the long-term balance and the short-term fluctuation characteristics of the response variables is called the ECM, and its formula is as follows: 
$$\Delta y_t = c + \lambda \Delta x_t + \varphi\, ecm_{t-1} + \varepsilon_t$$
(7)
where λ is the coefficient of the difference term of the explanatory variable and reflects the short-term dynamic changes of the model; ecmt−1 is the error correction term, that is, the first-order lag of the residual sequence of the regression equation, and it reflects how far the dependent variable deviated from the long-term balance in the previous period; φ is the correction coefficient, also known as the adjustment speed, and is usually negative; c is the constant term of the model; and ɛt is a white-noise sequence.
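The role of the correction coefficient can be shown with a small sketch. It assumes the single-lag form Δy = c + λΔx + φ·ecm, and all coefficient values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def ecm_step(y_prev, dx, ecm_prev, c, lam, phi):
    """One-step ECM update: y_t = y_{t-1} + c + lam*dx_t + phi*ecm_{t-1}."""
    dy = c + lam * dx + phi * ecm_prev
    return y_prev + dy

def deviation_path(ecm0, phi, steps):
    """Deviation from the long-run balance when x is constant and c = 0.

    In that case each period's deviation is (1 + phi) times the previous one.
    """
    path = [ecm0]
    for _ in range(steps):
        path.append(path[-1] * (1 + phi))
    return path

# Hypothetical values: y sits 10 units above its long-run level and x does not change.
y_next = ecm_step(y_prev=100.0, dx=0.0, ecm_prev=10.0, c=0.0, lam=0.5, phi=-0.5)
path = deviation_path(10.0, phi=-0.5, steps=3)
```

With φ = −0.5, half of the previous deviation is removed each period, so a negative φ pulls the dependent variable back toward the long-term balance.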

RESEARCH STEPS

Based on the rainfall and runoff time series, two types of ECM are constructed, one with co-integration theory and one with structure change co-integration theory. The ECM based on co-integration theory is denoted CECM and the ECM based on structure change co-integration theory is denoted SCECM. Both ECMs are used to predict the measured sample data, and conclusions are drawn by comparing the accuracy of the CECM with that of the SCECM. The flow chart is shown in Figure 1.

Figure 1

Flow chart of research steps.


RESULTS AND DISCUSSIONS

Data sources

The source area of the Yellow River is located in the northeastern part of the Qinghai-Tibet Plateau, and the geographic coordinates are 95°50′ ∼ 103°30′ E and 32°10′ ∼ 36°05′ N (as shown in Figure 2). It refers to the area above the Tangnaihai Hydrological Station in the main stream of the Yellow River which belongs to the alpine and semi-humid climate area. The area of the basin is 122,000 km2, and it accounts for 13% of the catchment area of the Yellow River Basin and contributes 33% of the annual runoff of the Yellow River. It is the most important runoff producing area in the Yellow River Basin. The change of runoff in this region has a vital influence and controlling role on the change of water resources in the whole Yellow River Basin.

Figure 2

The location map of the source area of the Yellow River.


This paper collected the rainfall and runoff data series of Tangnaihai Hydrological Station in the source area of the Yellow River from 1966 to 2014, as shown in Figure 3. Among them, 44 years of data from 1966 to 2009 are selected to analyze and construct the model, and five years of data from 2010 to 2014 are used to carry out prediction.

Figure 3

Time series of rainfall and runoff at Tangnaihai Station in the source area of the Yellow River.


As shown in Figure 3, the rainfall and runoff time series at Tangnaihai Station in the source area of the Yellow River from 1966 to 2014 follow a similar trend, and their correlation coefficient is 0.832, which indicates a strong correlation. Table 1 shows the statistical characteristics of the rainfall and runoff time series. The mean value is 557.22 mm for the rainfall time series and 203.76 × 10^8 m3 for the runoff time series. Although the standard deviation of rainfall is slightly larger than that of runoff, the coefficient of variation and the skewness coefficient of runoff are both larger than those of rainfall.

Table 1

The statistical parameters of rainfall and runoff time series

Time series   Mean      Standard deviation   Coefficient of variation   Skewness coefficient
Rainfall      557.225   58.064               0.104                      0.173
Runoff        203.760   54.437               0.267                      0.672
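As a quick check on Table 1, the coefficient of variation is the standard deviation divided by the mean. The short sketch below recomputes it from the tabulated means and standard deviations; the variable names are chosen for this example only.

```python
# (mean, standard deviation) as listed in Table 1
table1 = {
    "Rainfall": (557.225, 58.064),
    "Runoff": (203.760, 54.437),
}

# Coefficient of variation = standard deviation / mean
cv = {name: sd / mean for name, (mean, sd) in table1.items()}
cv_rounded = {name: round(value, 3) for name, value in cv.items()}
```

The recomputed values agree with the table: runoff varies much more relative to its mean than rainfall does, even though its standard deviation is slightly smaller.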

Stationarity test

The ADF unit root test is used to test the stationarity of time series (Dickey & Fuller 1979). Suppose rainfall is expressed as P and runoff as W. The test results are shown in Table 2.

Table 2

Unit root test results of rainfall and runoff time series

Variable   ADF test value   1% critical value   5% critical value   10% critical value   Prob    Stationary
P          0.06367          −2.62406            −1.94932            −1.61171             0.697   no
ΔP         −5.01584         −2.62724            −1.94986            −1.61147             0.000   yes
W          −0.18467         −2.62724            −1.94986            −1.61147             0.613   no
ΔW         −5.40633         −2.62724            −1.94986            −1.61147             0.000   yes

Note: Δ denotes first-order difference; the critical values are those of the t-statistic at each significance level; Prob is the p-value, that is, the lowest significance level at which the null hypothesis can be rejected.

It can be seen from Table 2 that the ADF test values of the original rainfall and runoff time series are larger than the critical values at the 1%, 5% and 10% significance levels, so both series are non-stationary. After first-order differencing, the ADF test values of rainfall and runoff are less than the critical values at the 1% significance level, so the differenced series are stationary. Therefore, both rainfall and runoff are first-order difference stationary, that is, P ~ I(1) and W ~ I(1).

Construction of model without considering structural change

Co-integration test

The EG two-step method is used to test the co-integration relationship of the two variables.

Step 1: with the OLS method, the co-integration regression equation of the two time series is obtained as follows: 
formula
(8)
 
formula
where R2 is the determination coefficient with the range of values of 0 to 1, such that the closer the R2 value is to 1, the better the reliability of the model is. DW is the Durbin–Watson statistic, which indicates whether there is autocorrelation in the sequence of equation residuals. If the value of DW is close to 2, there is almost no autocorrelation and the accuracy of the model is good; is a white-noise sequence.
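The two diagnostics described above can be computed from their standard definitions, as in the sketch below. The function names and the toy inputs are illustrative and are not the paper's data.

```python
def durbin_watson(resid):
    """DW = sum of squared successive residual differences / sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(resid, resid[1:]))
    den = sum(e * e for e in resid)
    return num / den

def r_squared(y, fitted):
    """Determination coefficient: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Toy residuals: alternating signs (negative autocorrelation) push DW above 2,
# identical residuals (strong positive autocorrelation) push it to 0.
dw_alternating = durbin_watson([1, -1, 1, -1])
dw_constant = durbin_watson([1, 1, 1, 1])
```

DW is approximately 2(1 − ρ), where ρ is the first-order autocorrelation of the residuals, which is why values near 2 indicate little autocorrelation.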

Step 2: the residual sequence is tested by the ADF test method as shown in Table 3.

From Table 3, it can be seen that the residual sequence is stationary by the ADF test. Therefore, there is a co-integration relationship between the time series of P and W, which indicates that there is a long-term balance relationship between the rainfall and runoff time series.

Table 3

ADF test results for residual sequences

Test type (c,t)   ADF test value   1% critical value   5% critical value   10% critical value   Prob    Stationary
(0,0)             −5.26083         −2.61985            −1.94869            −1.61204             0.000   yes
(c,0)             −5.20544         −3.59246            −2.93140            −2.60394             0.000   yes
(c,t)             −4.95034         −4.19234            −3.52079            −3.19128             0.001   yes

Note: In test type (c,t), c denotes having a constant term and t denotes having a trend term (the other meanings are the same).

Construction of CECM

The CECM of P and W is constructed as follows: 
formula
(9)
 
formula
where ecmt−1 is the error correction term, the coefficient of ecmt−1 is the short-term adjustment coefficient, and the coefficient of ΔP represents the short-term dynamic change.

Equation (9) shows that runoff is affected by rainfall and its deviation from the balance level in the previous year. The coefficient before ΔP is 0.6709, which shows that rainfall has a significant effect on runoff. The coefficient of ecmt−1 is −0.7486, which confirms the reverse correction mechanism. It shows that the deviation of runoff from balance this year will be adjusted by 77.86% in the next year. R2 is 0.837008, so the goodness-of-fit of the model is good and the explanatory value is strong. DW is 1.801638, which shows that there is no autocorrelation in the residual sequence.

Runoff prediction with CECM

The CECM was used to simulate the annual runoff in the source area of the Yellow River from 1966 to 2009, and the annual runoff from 2010 to 2014 was predicted for verification. The fitting results are shown in Figure 4 below.

Figure 4

The fitting results of the CECM.


Figure 4 shows that the CECM fits the data well. In the prediction period 2010–2014 the relative errors are all less than 20%; the overall MAPE of the model is 10.45%, and the MAPE of the prediction period is 11.04%.
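For reference, the relative error and the MAPE used here follow the usual definitions, sketched below. The observed and predicted values are hypothetical and are not the Tangnaihai data.

```python
def relative_errors(observed, predicted):
    """Absolute relative error of each prediction, in percent."""
    return [abs(p - o) / abs(o) * 100 for o, p in zip(observed, predicted)]

def mape(observed, predicted):
    """Mean absolute percentage error: the average of the relative errors."""
    errs = relative_errors(observed, predicted)
    return sum(errs) / len(errs)

# Hypothetical annual runoff values and predictions
observed = [200.0, 250.0, 160.0]
predicted = [220.0, 225.0, 168.0]

errs = relative_errors(observed, predicted)   # 10%, 10% and 5%
score = mape(observed, predicted)             # their average
```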

Construction of model considering structural change

Breakpoint test

Using the recursive test, the rolling test and the sequential test method, the runoff series of Tangnaihai Station in the source area of the Yellow River from 1966 to 2009 are tested. The test results are shown in Figure 5.

Figure 5

The breakpoint test results. Note: The horizontal dashed line is the critical value at the 5% significance level. The critical value is approximated by interpolating between the critical values for T = 100 and T = 250 in Tables 1 and 2 of Banerjee et al. (1992).

The final test results are shown in Table 4.

Table 4

Results of recursive test, rolling test and sequential test

Variable   Recursive test   Rolling test   Sequential test of mean breaks   Sequential test of trend breaks   Year of structural breaks
W          2002             none           1989                             none                              1989, 2002

According to the test results, the recursive test and the sequential test of mean breaks detect breakpoints in 2002 and 1989 respectively, while the rolling test and the sequential test of trend breaks detect none. It is found that 1989 was the year with the largest runoff and the largest rainfall, and 2002 was the year with the lowest runoff and the lowest rainfall.

By investigation, the Longyang Gorges Reservoir in the source area of the Yellow River was put into operation in 1987 (Li & Yang 2004). The Longyang Gorges Reservoir is a large reservoir with multi-year regulation, and the total storage capacity is 24.7 billion m3 and the regulating storage capacity is 19.36 billion m3. The construction and operation of the reservoir will have a significant influence on the river runoff. This influence lags behind the construction and operation of water conservancy projects. Therefore, the breakpoint in 1989 is reasonable.

In fact, 2002 was a dry year in the source area of the Yellow River (Tang et al. 2004). According to many studies on the Yellow River Basin, climate change is expressed through changes in rainfall, and the change in rainfall was the main reason for the change in runoff (Wang et al. 2013). Low rainfall makes runoff decrease sharply in dry years, resulting in structural change, and thus a breakpoint appeared in 2002.

According to the above analysis, the breakpoints of the runoff time series can be determined as 1989 and 2002, so the structure change co-integration between rainfall and runoff can be constructed.

Construction of structure change co-integration model

Three types of structure change co-integration model are considered: level shift, level shift with trend and state shift. Introducing dummy variables as follows, 
formula
the level shift, level shift with trend and state shift structure change co-integration equations are constructed respectively.
Equation (1): 
formula
(10)
 
formula
Equation (2): 
formula
(11)
 
formula
Equation (3): 
formula
(12)
 
formula

With breakpoints introduced, the overall goodness-of-fit R2 values of the long-run Equations (1)–(3) are 0.7767, 0.7767 and 0.7879 respectively, which are higher than that of the co-integration equation without breakpoints.

Equation (3) shows that the coefficient of elasticity between W and P from 1966 to 1989 is 0.8126, which shows that P has a great influence on W. That is, for each 1% increase in P, W increases by about 0.8126%. From 1989 to 2002, the coefficient of elasticity between W and P is 0.5893, which is 0.2233 lower than before, so the effect of P on W is reduced. That is, for each 1% increase in P, W increases by about 0.5893%. After 2002, the coefficient of elasticity between W and P is 0.9866, which is 0.3973 higher than before, so the influence of P on W increases. That is, for each 1% increase in P, W increases by about 0.9866%.

These results show that there is a positive correlation between runoff and rainfall in the source area of the Yellow River. After the first break in 1989, the impact of rainfall on runoff decreased, and after the second break in 2002, the impact of rainfall on runoff increased.
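The regime-specific elasticities reported above can be reproduced from a base coefficient plus one shift per breakpoint, as in the sketch below. Several things here are assumptions for illustration: the dummy convention (equal to 1 strictly after the break year), the function names, and the reading of the coefficient as an elasticity in a log-log relationship.

```python
BREAKS = (1989, 2002)

def dummies(year):
    """One dummy per breakpoint: 1 after the break year, 0 otherwise (assumed convention)."""
    return tuple(1 if year > b else 0 for b in BREAKS)

# Base elasticity and the shifts at each breakpoint, as reported in the text
BASE, SHIFT_1989, SHIFT_2002 = 0.8126, -0.2233, 0.3973

def elasticity(year):
    """Elasticity of runoff with respect to rainfall in the regime containing `year`."""
    d1, d2 = dummies(year)
    return BASE + d1 * SHIFT_1989 + d2 * SHIFT_2002

def pct_response(e, pct_change_in_p=1.0):
    """Percentage change in W for a given percentage change in P under elasticity e."""
    return ((1 + pct_change_in_p / 100) ** e - 1) * 100

regimes = {year: elasticity(year) for year in (1980, 1995, 2005)}
response_before_1989 = pct_response(regimes[1980])   # slightly below 0.8126
```

Under the log-log reading, a 1% rise in rainfall corresponds to a runoff rise of a little under 1% in every regime, with the weakest response between the two breakpoints.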

Structure change co-integration test

The ADF test method is used to test the residual sequences of the above-mentioned structure change co-integration Equations (1)–(3). The results are shown in Tables 5–7.

Table 5

Unit root test results for residual sequences of Equation (1)

Test type (c,t)   ADF test value   1% critical value   5% critical value   10% critical value   Prob    Stationary
(0,0)             −5.29770         −2.62119            −1.94889            −1.61193             0.000   yes
(c,0)             −5.23107         −3.59662            −2.93316            −2.60487             0.000   yes
(c,t)             −5.14066         −4.19234            −3.52079            −3.19128             0.000   yes

Table 6

Unit root test results for residual sequences of Equation (2)

Test type (c,t)   ADF test value   1% critical value   5% critical value   10% critical value   Prob    Stationary
(0,0)             −5.29796         −2.62119            −1.94889            −1.61193             0.000   yes
(c,0)             −5.23092         −3.59662            −2.93316            −2.60487             0.000   yes
(c,t)             −5.14208         −4.19234            −3.52079            −3.19128             0.000   yes

Table 7

Unit root test results for residual sequences of Equation (3)

Test type (c,t)   ADF test value   1% critical value   5% critical value   10% critical value   Prob    Stationary
(0,0)             −5.21142         −2.62119            −1.94889            −1.61193             0.000   yes
(c,0)             −5.14203         −3.59662            −2.93316            −2.60487             0.000   yes
(c,t)             −5.06006         −4.19234            −3.52079            −3.19128             0.000   yes

From Tables 5–7, we can see that the residual sequences of the three equations are stationary according to the unit root test. Therefore, the structure change co-integration relationship exists.

Construction of SCECM

The residual sequences of the above three structure change co-integration equations are stationary. Therefore, the SCECM can be constructed to understand the short-term dynamic relationships between rainfall and runoff, and thus the prediction accuracy is improved. In order to compare the SCECM and CECM, the SCECMs constructed by Equations (1)–(3) above are named model 1, model 2 and model 3 respectively. The structure change ECMs are constructed as follows.

Model 1: 
formula
(13)
 
formula
Model 2: 
formula
(14)
 
formula
Model 3: 
formula
(15)
 
formula

The negative coefficients of ecmt−1 in models 1, 2 and 3 confirm that these models are in accord with the error correction mechanism. The R2 values are all greater than 0.8, which indicates that the models have strong explanatory power. The DW values are all close to 2, so there is no autocorrelation in the residual sequences. Therefore, after the breakpoints are introduced, the models can reflect the long-term balance and short-term fluctuation relationships between rainfall and runoff in the source area of the Yellow River.

Runoff prediction with SCECM

The structure change models 1, 2 and 3 were used to simulate the annual runoff in the source area of the Yellow River from 1966 to 2009, and the annual runoff from 2010 to 2014 was predicted for verification. The fitting results are shown in Figures 6–8.

Figure 6

The fitting result of model 1.


Figure 7

The fitting result of model 2.


Figure 8

The fitting result of model 3.


From Figures 6–8, it can be seen that the SCECMs fit the data better. The prediction period is 2010–2014. For model 1, the relative errors in the prediction period are less than 20% except for the error of 20.28% in 2012, and the overall MAPE of model 1 is 9.58%. For model 2, the relative errors in the prediction period are less than 20% except for the error of 20.33% in 2012, and the overall MAPE of model 2 is 9.58%. For model 3, the relative errors in the prediction period are all less than 20%, and the overall MAPE of model 3 is 10.06%. The overall MAPEs of models 1, 2 and 3 are all lower than the 10.45% of the CECM. Therefore, the fitting of the SCECM is better than that of the CECM.

Model comparison

In order to compare the models, the mean square error (MSE) of each model is calculated, as shown in Table 8. It can be seen from Table 8 that the MSEs of the SCECMs are all smaller than that of the CECM, and among the SCECMs the MSE of model 3 is the smallest. This means that model 3 performs best among these models.

Table 8

The performance evaluation parameters of the models

Type   CECM      SCECM Model 1   SCECM Model 2   SCECM Model 3
MSE    26.4712   25.2139         24.6478         23.8456

Moreover, the prediction errors of models 1, 2, 3 and CECM are compared. Table 9 shows the comparison of the relative errors of the SCECM and CECM.

Table 9

The relative errors of models in the prediction period

Year   Relative error of CECM (%)   Relative error of model 1 (%)   Relative error of model 2 (%)   Relative error of model 3 (%)
2010   11.42                        5.13                            5.19                            0.78
2011   18.10                        3.36                            3.30                            2.65
2012   10.12                        20.28                           20.33                           17.82
2013   1.01                         14.03                           14.12                           9.61
2014   14.54                        0.91                            1.03                            4.57
Mean   11.04                        8.74                            8.79                            7.08

Table 9 shows that the average relative errors of the SCECMs with breakpoints are all less than that of the CECM. The average relative error of model 3 is the smallest at 7.08%, which is 3.96 percentage points less than that of the CECM. This shows that the state-switched structure change co-integration model 3 has the highest prediction accuracy.

Some previous studies have also used co-integration theory to predict runoff and achieved good results (Chang & Liu 2005; Zhang et al. 2006, 2013). The innovation of this research is the introduction of structure change co-integration theory, with which the prediction accuracy is better than theirs.
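The means in Table 9 can be recomputed directly from the yearly relative errors, as sketched below; the variable names are chosen for this example. Because the tabulated yearly errors are already rounded, the recomputed mean for model 3 (7.086) differs slightly from the reported 7.08, which was presumably obtained from unrounded errors.

```python
# Yearly relative errors (%) for 2010-2014, copied from Table 9
errors = {
    "CECM":    [11.42, 18.10, 10.12, 1.01, 14.54],
    "Model 1": [5.13, 3.36, 20.28, 14.03, 0.91],
    "Model 2": [5.19, 3.30, 20.33, 14.12, 1.03],
    "Model 3": [0.78, 2.65, 17.82, 9.61, 4.57],
}

# MAPE over the prediction period = mean of the yearly relative errors
mape = {name: sum(vals) / len(vals) for name, vals in errors.items()}

best = min(mape, key=mape.get)                       # model with the lowest MAPE
gap_points = mape["CECM"] - mape["Model 3"]          # difference in percentage points
relative_reduction = gap_points / mape["CECM"] * 100 # reduction relative to the CECM, in percent
```

The difference of about four percentage points corresponds to a relative reduction in MAPE of roughly 36%, which is why it matters whether the improvement is stated in percentage points or in percent.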

CONCLUSIONS

  • (1)

    Using co-integration theory, the co-integration relationship between rainfall and runoff in the source area of the Yellow River is revealed. By constructing the CECM the long-term balance and short-term fluctuation relationships between rainfall and runoff are revealed, and it shows that runoff is affected by rainfall and the deviation of runoff from balance this year will be adjusted by 77.86% in the next year. The runoff in the source area of the Yellow River is predicted with the MAPE in the prediction period being 11.04%.

  • (2)

    Using the endogenous structure breakpoint test method, breakpoints in the runoff time series are detected in 1989 and 2002. The test results reflect the profound impact of reservoir construction, a typical human activity, and of climate change on runoff in the source area of the Yellow River: the 1989 breakpoint is mainly caused by reservoir construction, while the 2002 breakpoint is attributed to rainfall changes.

  • (3)

    Using structure change co-integration theory, the change in the relationship between rainfall and runoff before and after the breakpoints is revealed. The impact of rainfall on runoff decreased after the first breakpoint in 1989 but increased after 2002. The SCECM predicts runoff in the source area of the Yellow River more accurately: the prediction MAPE of the state-switched SCECM is 7.08%, which is 3.96 percentage points lower than that of the CECM, a marked improvement in prediction accuracy.

Therefore, the structure change co-integration approach can provide a technical reference for researchers in the field of hydrology.

ACKNOWLEDGEMENTS

This research is supported by the National Key R&D Program of China (Grant No. 2018YFC0406501), Program for Innovative Talents (in Science and Technology) at University of Henan Province (Grant No. 18HASTIT014), and Foundation for University Youth Key Teacher of Henan Province (Grant No. 2017GGJS006).

REFERENCES

Ab Razak N. H., Aris A. Z., Ramli M. F., Looi L. J. & Juahir H. 2018 Temporal flood incidence forecasting for Segamat River (Malaysia) using autoregressive integrated moving average modelling. Journal of Flood Risk Management 11 (S2), S794–S804.

Aytek A. & Alp M. 2008 An application of artificial intelligence for rainfall–runoff modeling. Journal of Earth System Science 117, 145–155.

Banerjee A., Lumsdaine R. L. & Stock J. H. 1992 Recursive and sequential tests of the unit-root and trend-break hypotheses: theory and international evidence. Journal of Business & Economic Statistics 10 (3), 271–287.

Campos J., Ericsson N. R. & Hendry D. F. 1996 Cointegration tests in the presence of structural breaks. Journal of Econometrics 70 (1), 187–220.

Chang M. Q. & Liu J. P. 2005 A study on the cointegration forecast of runoff. Journal of Applied Sciences 23 (6), 654–657.

Cole M. A. 2004 Economic growth and water use. Applied Economics Letters 11, 1–4.

Dickey D. A. & Fuller W. A. 1979 Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74, 427–431.

Engle R. F. & Granger C. W. J. 1987 Co-integration and error correction: representation, estimation and testing. Econometrica 55 (2), 251–276.

Gregory A. W. & Hansen B. E. 1996 Residual-based tests for cointegration in models with regime shifts. Journal of Econometrics 70, 99–126.

Guo B. T., Sun S. Y., Zhang J. P. & Li J. Y. 2018 Study on the relationship of variable structure cointegration of water supply and demand in Luhun irrigation district. Water Saving Irrigation 23 (10), 68–73 + 77.

Guru N. & Jha R. 2019 Application of soft computing techniques for river flow prediction in the downstream catchment of Mahanadi River Basin using partial duration series, India. Iranian Journal of Science and Technology, Transactions of Civil Engineering 44, 279–297.

Hong J. H., Goyal M. K., Chiew Y. M. & Chua L. H. C. 2012 Predicting time-dependent pier scour depth with support vector regression. Journal of Hydrology 468–469, 241–248.

Jin H., Zhang S. & Zhang J. S. 2017 Spurious regression due to neglected of non-stationary volatility. Computation Statistics 32 (3), 1065–1081.

Johansen S., Mosconi R. & Nielsen B. 2000 Cointegration analysis in the presence of structural breaks in the deterministic trend. Econometrics Journal 3 (2), 216–249.

Li C. H. & Yang Z. F. 2004 Influence of operation of main reservoirs on the Yellow River on runoff. Yellow River 26 (7), 15–16 + 46.

Meng X. M., Yin M. S., Ning L. B., Liu D. F. & Xue X. W. 2015 A threshold artificial neural network model for improving runoff prediction in a karst watershed. Environmental Earth Sciences 74 (6), 5039–5048.

Piotrowski A. P., Napiorkowski J. J., Osuch M. & Napiorkowski M. J. 2016 On the importance of training methods and ensemble aggregation for runoff prediction by means of artificial neural networks. Hydrological Sciences Journal 61 (10), 1903–1925.

Ploberger W. & Krämer W. 1996 A trend-resistant test for structural change based on OLS residuals. Journal of Econometrics 70 (1), 175–185.

Rezaie-Balf M., Zahmatkesh Z. & Kim S. 2017 Soft computing techniques for rainfall-runoff simulation: local non-parametric paradigm vs. model classification methods. Water Resources Management 31, 3843–3865.

Seung-Hoon Y. 2007 Urban water consumption and regional economic growth: the case of Taejeon, Korea. Water Resources Management 21, 1353–1361.

Tang H. Y., Tang M. C. & Zhao Y. N. 2004 Underground causes of water-flow variation of the Longyang Gorges reservoir in Yellow River and its prediction method. Plateau Meteorology 23 (4), 472–475.

Thirumalaiah K. & Deo M. C. 2000 Hydrological forecasting using neural networks. Journal of Hydrologic Engineering 5 (2), 180–189.

Wang Y., Ding Y. J., Ye B. S., Liu F. J., Wang J. & Wang J. 2013 Contributions of climate and human activities to changes in runoff of the Yellow and Yangtze rivers from 1950 to 2008. Science China Earth Sciences 56 (8), 1398–1412.

Yang B. C. & Zhang S. Y. 2002 Study on cointegration with structural changes. Journal of Systems Engineering 17 (1), 26–31.

Zhang S. & Sun R. 2015 The spurious regression of fractionally integrated processes with change points. International Business and Management 11 (2), 69–73.

Zhang L. Y., Zhang L. P., Cao F. L. & Song X. Y. 2006 Annual runoff forecasting research based on the theory of cointegration and error correction model. Engineering Journal of Wuhan University 39 (6), 6–9.

Zhang J. P., Yuan W. L. & Guo B. T. 2013 Study on prediction of stream flow based on cointegration theory. International Journal Hydroelectric Energy 31 (5), 18–20.

Zhang J. P., Zhao Y. & Xiao W. H. 2015 Multi-resolution cointegraton prediction for runoff and sediment load. Water Resources Management 29, 3601–3613.

Zhang J. P., Li Y. Y., Zhao Y. & Hong Y. 2017a Wavelet–cointegration prediction of irrigation water in the irrigation district. Journal of Hydrology 544, 343–351.

Zhang J. P., Zhao Y. & Lin X. M. 2017b Uncertainty analysis and prediction of river runoff with multi-time scales. Water Science and Technology: Water Supply 17 (3), 897–906.

Zhang Y. Q., Chiew F. H. S., Li M. & Post D. 2018 Predicting runoff signatures using regression and hydrological modeling approaches. Water Resources Research 54 (10), 7859–7878.

Zhao F. F., Xu Z. X., Zhang L. & Zuo D. P. 2009 Streamflow response to climate variability and human activities in the upper catchment of the Yellow River Basin. Science in China Series E: Technological Sciences 52, 3249.

Zhao X. H., Chen X., Xu Y. X., Xi D. J., Zhang Y. B. & Zheng X. Q. 2017 An EMD-based chaotic least squares support vector machine hybrid model for annual runoff forecasting. Water 9 (3), 153.

Zivot E. & Wang J. H. 2006 Modeling Financial Time Series with S-PLUS. Springer, New York, USA.