## Abstract

To reveal co-integration under structural change in the long-run relationship between the rainfall and runoff time series at Tangnaihai Hydrological Station in the source area of the Yellow River, and to improve the accuracy of annual runoff prediction, co-integration theory and structure change co-integration theory are introduced in turn, and the error correction models of rainfall and runoff are constructed for both cases. The results show that reservoir construction and climate change can cause structural change in the long-run relationship between rainfall and runoff in the source area of the Yellow River. Breakpoints appeared in 1989 and 2002: the 1989 breakpoint is mainly affected by reservoir construction, while the 2002 breakpoint is affected by rainfall changes. The error correction model with structural change shows that the impact of rainfall on runoff decreased from 1989 but increased from 2002. Finally, for runoff over the five-year prediction period (2010–2014), the mean absolute percentage errors of the prediction models without and with breakpoints are 11.04% and 7.08% respectively, which shows that the error correction model with structural change has higher runoff prediction accuracy.

## INTRODUCTION

Runoff is an important hydrological variable, and runoff prediction has always been a hot topic in the field of hydrology (Zhao *et al.* 2017; Zhang *et al.* 2018). Common runoff prediction models include the Autoregressive Moving Average model (Ab Razak *et al.* 2018; Wang *et al.* 2019), the Artificial Neural Network model (Thirumalaiah & Deo 2000; Meng *et al.* 2015; Piotrowski *et al.* 2016) and the Support Vector Regression model (Hong *et al.* 2012; Gizaw & Gan 2016). However, these models are limited to single-factor analysis, and it is difficult for them to achieve the desired prediction results. With the development of computer technology, many researchers have used soft computing techniques to study rainfall and runoff with high accuracy (Aytek & Alp 2008; Rezaie-Balf *et al.* 2017; Guru & Jha 2019). Nevertheless, because they are influenced by many uncertain factors such as human activities, climate and underlying surface change (Zhao *et al.* 2009; Wang *et al.* 2013; Shen *et al.* 2017), most hydrological time series are non-stationary, which can lead to spurious regression (Lee & Yu 2009; Zhang & Sun 2015; Jin *et al.* 2017). The simulation and prediction of hydrological elements based on such a spurious regression will therefore distort the calculation results.

Co-integration theory proposed by Engle & Granger (1987) can deal with the non-stationarity of time series. The main idea of co-integration theory is that if the linear combination of these non-stationary time series is stationary, then these time series have a co-integration relationship (Engle & Granger 1987). The greatest advantage of co-integration theory is that it can deal with the problem of the non-stationarity of variables, and it can also reveal the long-term balance relationships and short-term fluctuations between variables. Therefore, co-integration theory has been developed in the hydrological field in recent years (Cole 2004; Seung-Hoon 2007). Some scholars have used co-integration theory to predict river runoff and have achieved some important results (Zhang *et al.* 2006, 2013, 2017b). In addition, in order to improve prediction accuracy, the combination of other analysis methods and the co-integration method has also achieved good results (Zhang *et al.* 2015, 2017a).

However, co-integration theory requires that the structure of the time series is stable, that is, that there are no breakpoints; otherwise the unit root test statistics may be biased. In fact, climate change, human activities and underlying surface changes may lead to sudden structural changes in hydrological time series, which means that the basis of a co-integration test no longer exists. Therefore, structural changes must be considered in order to capture the co-integration relationship between hydrological variables more accurately. Structure change co-integration theory is an effective method for studying time series with structural breakpoints (Campos *et al.* 1996; Ploberger & Krämer 1996; Johansen *et al.* 2000). At present, there is little literature on structure change co-integration relationships among hydrological variables; an exception is Guo *et al.* (2018), who discuss the change mechanism of water supply and demand in an irrigation area using structure change co-integration theory.

The objectives of this paper are as follows. First, the relationship between rainfall and runoff in the source area of the Yellow River is analyzed using co-integration theory, and an error correction model is established to predict runoff. Secondly, the endogenous breakpoint test method is used to determine the breakpoints of the runoff series. Finally, structure change co-integration theory is used to analyze the long-run relationship between the rainfall and runoff time series, and the structure change co-integration error correction model is established to predict runoff. The novelty of this paper is that an error correction model between rainfall and runoff based on structure change co-integration theory is established for the source area of the Yellow River. With this model, runoff prediction is improved and the profound influence of human activities and climate change on the relationship between rainfall and runoff is revealed.

## RESEARCH METHODS

### Co-integration theory

Co-integration describes the long-term balance relationships between time series. If a time series is not stationary and becomes stationary after one difference, it is integrated of order one, denoted by *I*(1). If a time series becomes stationary after *d*-order differencing, it is integrated of order *d*, denoted by *I*(*d*). A time series that is itself stationary is denoted by *I*(0). Suppose two time series *X*_{t} and *Y*_{t} are both *I*(*d*), that is, *X*_{t} ∼ *I*(*d*) and *Y*_{t} ∼ *I*(*d*). If there exists a *β* such that *Y*_{t} − *βX*_{t} is an *I*(0) process, then *X*_{t} and *Y*_{t} are said to have a co-integration relationship.
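The definition can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the series are synthetic, and `beta = 0.8`, the noise scales and the helper names `difference` and `combination` are hypothetical choices, not values or methods from this study. It builds an *I*(1) random walk *X*, a series *Y* tied to it, and the linear combination *Y* − *βX*, which reduces to the stationary noise term.

```python
import random

def difference(series, d=1):
    """Apply first differencing d times (the operation behind I(d))."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def combination(y, x, beta):
    """Linear combination Y_t - beta * X_t from the definition."""
    return [yi - beta * xi for yi, xi in zip(y, x)]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
beta = 0.8                      # hypothetical co-integration coefficient

# X is a random walk, hence I(1): its first difference is white noise.
x = [0.0]
for _ in range(199):
    x.append(x[-1] + rng.gauss(0.0, 1.0))

# Y follows X up to stationary noise, so Y is also I(1).
noise = [rng.gauss(0.0, 0.5) for _ in x]
y = [beta * xi + ei for xi, ei in zip(x, noise)]

# The combination removes the common stochastic trend and leaves the noise.
z = combination(y, x, beta)
dx = difference(x)              # stationary increments of X
```

Here `z` coincides with the stationary noise term even though `x` and `y` both wander, which is the sense in which the two series are co-integrated.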

### Structure change co-integration method

According to structural stability, structure change co-integration can be divided into three types: co-integration with coefficients shift, partial co-integration and co-integration with mechanistic change (Yang & Zhang 2002).

Definition: Suppose an *n*-dimensional time series , , is a sequential set, if there exists a subset , , , , , , , is the empty set.

- (1) When , ; , , *X*_{t} is called co-integration with coefficients shift.
- (2) When , ; , , *X*_{t} is called partial co-integration.
- (3) When , ; , *s*-dimensional time series , , , *X*_{t} is called co-integration with mechanistic change.

Co-integration with coefficients shift means that the co-integration coefficients change at some points while the co-integration relationship still exists, which is an abrupt structural change. Partial co-integration means that the co-integration relationship exists only over part of the sample, before or after certain points, and does not exist over the rest. Co-integration with mechanistic change means that the balance of the original system is destroyed and a new balance is formed because new variables are added.

The hydrological data has the characteristics of co-integration with coefficients shift in practical application, so it is very suitable for runoff prediction. Co-integration with coefficients shift can be divided into three forms: level shift, level shift with trend, and state shift (Gregory & Hansen 1996).

Suppose a standard co-integration regression model is , ; when , there is a co-integration relationship between *y*_{t} and *x*_{t}. In order to construct a structure change co-integration model, dummy variables are introduced, , , where is the time point of a break and *n* is the number of breakpoints. Therefore, the forms of structural break in the co-integration with coefficients shift model are as follows.
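For orientation, the three forms are commonly written as below for a single break, following Gregory & Hansen (1996). This is a sketch in assumed notation: the symbols $\mu$, $\alpha$, $\gamma$, $\varphi_{t,k}$, the break position $k$ and the sample size $T$ are chosen here for illustration and need not match the notation of this paper. With several breakpoints, one dummy variable per breakpoint would be added in the same way.

$$
\begin{aligned}
\text{standard model:}\quad & y_t = \mu + \alpha^{\mathsf T} x_t + e_t, \qquad t = 1, \dots, T \\
\text{dummy variable:}\quad & \varphi_{t,k} = \begin{cases} 0, & t \le k \\ 1, & t > k \end{cases} \\
\text{level shift:}\quad & y_t = \mu_1 + \mu_2 \varphi_{t,k} + \alpha^{\mathsf T} x_t + e_t \\
\text{level shift with trend:}\quad & y_t = \mu_1 + \mu_2 \varphi_{t,k} + \gamma t + \alpha^{\mathsf T} x_t + e_t \\
\text{state shift:}\quad & y_t = \mu_1 + \mu_2 \varphi_{t,k} + \alpha_1^{\mathsf T} x_t + \alpha_2^{\mathsf T} x_t \varphi_{t,k} + e_t
\end{aligned}
$$

In the level shift forms only the intercept changes at the break, whereas in the state shift form the slope on $x_t$ changes as well, from $\alpha_1$ before the break to $\alpha_1 + \alpha_2$ after it.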

### Breakpoint test

The time series breakpoint test can be divided into exogenous structure breakpoint estimation and endogenous structure breakpoint test. The former method estimates the breakpoint artificially based on experience and historical events, which is more subjective and has less consideration for lagging. The latter method tests the breakpoint by data-mining technology, which is more objective. Therefore, this paper uses the unit root test methods such as the recursive test, rolling test and sequential test to examine whether there are structural breakpoints in the sequence (Banerjee *et al.* 1992).

Rolling test: select a sub-sample (usually one-third of the original sample size), keep the sub-sample size unchanged while moving it through the sample, and carry out the ADF test with intercept and trend terms for each sub-sample. Finally, the ADF value is compared with the critical value to determine the structural breakpoint.

Sequential test: the candidate break year *k* is restricted to the range [0.15*T*, 0.85*T*], where *T* is the sample size. Within this range, dummy variables are used to vary the year in which a hypothetical structural break occurs, and the possibility of a break in each year is examined. The minimum value of the resulting sequence of ADF statistics is then compared with the corresponding critical value to test the unit root hypothesis and determine the structural breakpoint. The test formula is as follows:

In this formula *D*_{t} is defined in two cases:

Case 1: mean breaks model, ;

Case 2: trend breaks model, , (*k* is the year of break).
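The two cases can be sketched in code. The definitions below follow the form usually attributed to Banerjee *et al.* (1992), namely a step dummy for a mean break and a ramp dummy for a trend break; because the formulas are not reproduced above, treat them as an assumption. The function names and the use of sample positions 1..*T* instead of calendar years are illustrative choices.

```python
import math

def mean_break_dummy(T, k):
    """Case 1 (assumed form): D_t = 1 if t > k, else 0, for t = 1..T."""
    return [1 if t > k else 0 for t in range(1, T + 1)]

def trend_break_dummy(T, k):
    """Case 2 (assumed form): D_t = t - k if t > k, else 0, for t = 1..T."""
    return [t - k if t > k else 0 for t in range(1, T + 1)]

def candidate_breaks(T, trim=0.15):
    """Candidate break positions k inside [trim * T, (1 - trim) * T]."""
    lo = math.ceil(trim * T)
    hi = math.floor((1 - trim) * T)
    return list(range(lo, hi + 1))

# With the 44 modelling years used in this paper (1966-2009), T = 44.
positions = candidate_breaks(44)
```

In a sequential test, one regression would be estimated for every position in `positions`, each with its own dummy series, and the smallest ADF statistic would be retained.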

### Co-integration test

In the ADF test regression, Δ*y*_{t} is the first-order difference of the variable *y*_{t}; *α*, *β*, *δ* and *ζ*_{i} are all parameters; *t* is time; *p* is the lag order; and the error term is a white-noise sequence.
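The symbols listed here match the standard ADF regression with intercept and trend, which can be written as follows. The assignment of *δ* to the lagged level and the symbol $\varepsilon_t$ for the white-noise term are assumptions made for this sketch.

$$
\Delta y_t = \alpha + \beta t + \delta\, y_{t-1} + \sum_{i=1}^{p} \zeta_i\, \Delta y_{t-i} + \varepsilon_t
$$

Under this form the null hypothesis of a unit root is $\delta = 0$, and the series is judged stationary when the *t*-statistic of $\delta$ falls below the chosen critical value.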

If two time series are both stationary, or both become stationary after differencing of the same order, there may be a co-integration relationship between them. The most commonly used method to test the co-integration relationship of two time series is the EG two-step method. The specific steps are as follows (Engle & Granger 1987).

Step 1: the ordinary least squares (OLS) method is used to carry out a static regression of the two time series, which are stationary or integrated of the same order, and the co-integration equation and its residual series are obtained.

Step 2: the ADF unit root test is applied to the residual series of the co-integration equation. If the residual series is stationary, the co-integration relationship exists between the two time series.
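The two steps can be sketched with the standard library only. This is a simplified illustration, not the procedure used to produce the results in this paper: the data are synthetic, the second step uses a Dickey–Fuller regression without constant or lagged differences instead of a full ADF test, and the names `ols_simple` and `df_statistic` are hypothetical.

```python
import math
import random

def ols_simple(x, y):
    """Step 1: OLS of y on a constant and x; returns (intercept, slope, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid

def df_statistic(e):
    """Step 2: t-statistic of rho in  de_t = rho * e_{t-1} + u_t."""
    lagged = e[:-1]
    delta = [b - a for a, b in zip(e, e[1:])]
    sll = sum(l * l for l in lagged)
    rho = sum(l * d for l, d in zip(lagged, delta)) / sll
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, delta))
    s2 = ssr / (len(delta) - 1)          # residual variance, one parameter
    return rho / math.sqrt(s2 / sll)

# Synthetic co-integrated pair: x is a random walk, y = 2 + 0.8 x + noise.
rng = random.Random(1)
x = [0.0]
for _ in range(299):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [2.0 + 0.8 * xi + rng.gauss(0.0, 0.5) for xi in x]

intercept, slope, resid = ols_simple(x, y)
stat = df_statistic(resid)   # strongly negative when the residuals are stationary
```

A strongly negative `stat` indicates stationary residuals. In practice the statistic should be compared with critical values appropriate for residual-based tests.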

### Error correction model (ECM)

In the ECM, *λ* is the coefficient of the difference term of the explanatory variable and reflects the short-term dynamic changes of the model; *ecm*_{t−1} is the error correction term, that is, the first-order lag of the residual sequence of the regression equation, and it reflects the extent to which the dependent variable deviated from the long-term balance in the previous period; *φ* is the correction coefficient, also known as the adjustment speed, and is usually negative; *c* is the constant term of the model; and *ɛ*_{t} is a white-noise sequence.
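With the symbols defined above, an ECM for two co-integrated series is usually written in the following form. This is a sketch of the standard specification; the exact arrangement of terms is assumed here.

$$
\Delta Y_t = c + \lambda\, \Delta X_t + \varphi\, ecm_{t-1} + \varepsilon_t
$$

A negative $\varphi$ means that a positive deviation from the long-term balance in year $t-1$ pulls $\Delta Y_t$ downwards in year $t$, which is the reverse correction mechanism.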

## RESEARCH STEPS

The research steps of this paper are as follows. Based on the time series data of rainfall and runoff, two types of ECM are constructed, one with co-integration theory and one with structure change co-integration theory. The ECM based on co-integration theory is denoted CECM, and the ECM based on structure change co-integration theory is denoted SCECM. Both ECMs are used to predict the measured sample data, and a conclusion is drawn by comparing the accuracy of the CECM with that of the SCECM. The flow chart is shown in Figure 1.

## RESULTS AND DISCUSSIONS

### Data sources

The source area of the Yellow River is located in the northeastern part of the Qinghai-Tibet Plateau, and the geographic coordinates are 95°50′ ∼ 103°30′ E and 32°10′ ∼ 36°05′ N (as shown in Figure 2). It refers to the area above the Tangnaihai Hydrological Station in the main stream of the Yellow River which belongs to the alpine and semi-humid climate area. The area of the basin is 122,000 km^{2}, and it accounts for 13% of the catchment area of the Yellow River Basin and contributes 33% of the annual runoff of the Yellow River. It is the most important runoff producing area in the Yellow River Basin. The change of runoff in this region has a vital influence and controlling role on the change of water resources in the whole Yellow River Basin.

This paper collected the rainfall and runoff data series of Tangnaihai Hydrological Station in the source area of the Yellow River from 1966 to 2014, as shown in Figure 3. Among them, 44 years of data from 1966 to 2009 are selected to analyze and construct the model, and five years of data from 2010 to 2014 are used to carry out prediction.

As shown in Figure 3, the rainfall and runoff time series at Tangnaihai Station in the source area of the Yellow River from 1966 to 2014 have a similar trend, and their correlation coefficient is 0.832, which indicates a strong correlation. Table 1 shows the statistical characteristics of the rainfall and runoff time series. The mean value is 557.22 mm for the rainfall time series and 203.76 × 10^{8} m^{3} for the runoff time series. Although the standard deviation of rainfall is slightly larger than that of runoff, the coefficient of variation and skewness coefficient of runoff are both larger than those of rainfall.

Table 1. Statistical characteristics of the rainfall and runoff time series

Time series | Mean | Standard deviation | Coefficient of variation | Skewness coefficient
---|---|---|---|---
Rainfall (mm) | 557.225 | 58.064 | 0.104 | 0.173
Runoff (10^{8} m^{3}) | 203.760 | 54.437 | 0.267 | 0.672

### Stationarity test

The ADF unit root test is used to test the stationarity of time series (Dickey & Fuller 1979). Suppose rainfall is expressed as *P* and runoff as *W*. The test results are shown in Table 2.

Table 2. ADF unit root test results for the rainfall and runoff time series

Variable | ADF test value | 1% critical value | 5% critical value | 10% critical value | Prob | Stationary
---|---|---|---|---|---|---
P | 0.06367 | −2.62406 | −1.94932 | −1.61171 | 0.697 | no
ΔP | −5.01584 | −2.62724 | −1.94986 | −1.61147 | 0.000 | yes
W | −0.18467 | −2.62724 | −1.94986 | −1.61147 | 0.613 | no
ΔW | −5.40633 | −2.62724 | −1.94986 | −1.61147 | 0.000 | yes

*Note*: Δ denotes the first-order difference; the critical values are those of the *t*-statistic used in the significance test; Prob is the lowest significance level at which the null hypothesis can be rejected.

It can be seen from Table 2 that the ADF test values of the original rainfall and runoff time series are larger than the critical values at the 1%, 5% and 10% significance levels, so both series are non-stationary. After first-order differencing, the ADF test values of rainfall and runoff are smaller than the critical values at the 1% significance level, so the differenced series are stationary. Therefore, both rainfall and runoff are first-order difference stationary, that is, *P* ∼ *I*(1), *W* ∼ *I*(1).
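The decision rule behind this reading of the ADF results can be written out in a few lines. The numbers are copied from the ADF test table for *P*, Δ*P*, *W* and Δ*W*; the dictionary layout and the function name `rejects_unit_root` are illustrative assumptions.

```python
# (ADF test value, 1%, 5%, 10% critical values) for each series.
adf_results = {
    "P":  (0.06367, -2.62406, -1.94932, -1.61171),
    "dP": (-5.01584, -2.62724, -1.94986, -1.61147),
    "W":  (-0.18467, -2.62724, -1.94986, -1.61147),
    "dW": (-5.40633, -2.62724, -1.94986, -1.61147),
}

def rejects_unit_root(adf_value, critical_value):
    """The unit root hypothesis is rejected when the statistic is below the critical value."""
    return adf_value < critical_value

# Decision at the 1% level (index 1 of each tuple).
stationary_at_1pct = {
    name: rejects_unit_root(values[0], values[1])
    for name, values in adf_results.items()
}

# A series is I(1) if its level is non-stationary and its first difference is stationary.
p_is_i1 = (not stationary_at_1pct["P"]) and stationary_at_1pct["dP"]
w_is_i1 = (not stationary_at_1pct["W"]) and stationary_at_1pct["dW"]
```

Both flags evaluate to true for these values, in line with *P* ∼ *I*(1) and *W* ∼ *I*(1).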

### Construction of model without considering structural change

#### Co-integration test

The EG two-step method is used to test the co-integration relationship of the two variables.

*R*^{2} is the coefficient of determination, which ranges from 0 to 1; the closer the *R*^{2} value is to 1, the better the reliability of the model. *DW* is the Durbin–Watson statistic, which indicates whether there is autocorrelation in the residual sequence of the equation. If the value of *DW* is close to 2, there is almost no autocorrelation and the accuracy of the model is good. The error term of the equation is a white-noise sequence.

Step 2: the residual sequence is tested by the ADF test method as shown in Table 3.

From Table 3, it can be seen that the residual sequence is stationary by the ADF test. Therefore, there is a co-integration relationship between the time series of *P* and *W*, which indicates that there is a long-term balance relationship between the rainfall and runoff time series.

Table 3. ADF test of the residual sequence of the co-integration equation

Test type (c,t) | ADF test value | 1% critical value | 5% critical value | 10% critical value | Prob | Stationary
---|---|---|---|---|---|---
(0,0) | −5.26083 | −2.61985 | −1.94869 | −1.61204 | 0.000 | yes
(c,0) | −5.20544 | −3.59246 | −2.93140 | −2.60394 | 0.000 | yes
(c,t) | −4.95034 | −4.19234 | −3.52079 | −3.19128 | 0.001 | yes

*Note*: In test type (*c*,*t*), *c* denotes having a constant term and *t* denotes having a trend term (the other meanings are the same).

#### Construction of CECM

Equation (9) shows that runoff is affected by rainfall and by its deviation from the balance level in the previous year. The coefficient of Δ*P* is 0.6709, which shows that rainfall has a significant effect on runoff. The coefficient of *ecm*_{t−1} is −0.7486, which confirms the reverse correction mechanism. It shows that the deviation of runoff from balance this year will be adjusted by 77.86% in the next year. *R*^{2} is 0.837008, so the goodness-of-fit of the model is good and the explanatory power is strong. *DW* is 1.801638, which shows that there is no autocorrelation in the residual sequence.

#### Runoff prediction with CECM

The CECM was used to simulate the annual runoff in the source area of the Yellow River from 1966 to 2009, and the annual runoff from 2010 to 2014 was predicted for verification. The fitting results are shown in Figure 4 below.

Figure 4 shows that the CECM fits the observed runoff well. In the prediction period, 2010–2014, the relative errors are all less than 20%. The overall mean absolute percentage error (MAPE) of the model is 10.45%, and the MAPE of the prediction period is 11.04%.
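For reference, the relative error of a single year and the MAPE over *n* years are usually defined as below. The symbols $W_t$ for observed runoff and $\hat{W}_t$ for modelled runoff are notation assumed for this sketch.

$$
RE_t = \left| \frac{W_t - \hat{W}_t}{W_t} \right| \times 100\%, \qquad
MAPE = \frac{1}{n} \sum_{t=1}^{n} RE_t
$$

Under this definition, the MAPE of the prediction period is the mean of the five yearly relative errors for 2010–2014.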

### Construction of model considering structural change

#### Breakpoint test

Using the recursive test, the rolling test and the sequential test method, the runoff series of Tangnaihai Station in the source area of the Yellow River from 1966 to 2009 are tested. The test results are shown in Figure 5.

The final test results are shown in Table 4.

Table 4. Breakpoint test results for the runoff series

Variable | Recursive test | Rolling test | Sequential test of mean breaks | Sequential test of trend breaks | Year of structural breaks
---|---|---|---|---|---
W | 2002 | none | 1989 | none | 1989, 2002

According to the test results, only the recursive test and the sequential test of mean breaks detect breakpoints, in 2002 and 1989 respectively. It is found that 1989 was the year with the largest runoff and the largest rainfall, and 2002 was the year with the lowest runoff and the lowest rainfall.

By investigation, the Longyang Gorges Reservoir in the source area of the Yellow River was put into operation in 1987 (Li & Yang 2004). The Longyang Gorges Reservoir is a large reservoir with multi-year regulation, and the total storage capacity is 24.7 billion m^{3} and the regulating storage capacity is 19.36 billion m^{3}. The construction and operation of the reservoir will have a significant influence on the river runoff. This influence lags behind the construction and operation of water conservancy projects. Therefore, the breakpoint in 1989 is reasonable.

In fact, 2002 was a dry year in the source area of the Yellow River (Tang *et al.* 2004). According to many studies on the Yellow River Basin, climate change is reflected in changes in rainfall, and the change of rainfall was the main reason for the change of runoff (Wang *et al.* 2013). Low rainfall makes runoff decrease sharply in dry years, resulting in structural changes, and thus a breakpoint appeared in 2002.

According to the above analysis, the breakpoints of the runoff time series can be determined as 1989 and 2002, so the structure change co-integration between rainfall and runoff can be constructed.

#### Construction of structure change co-integration model

The structure change co-integration equations with breakpoints are introduced to describe the long-run relationship. The overall goodness-of-fit *R*^{2} values of Equations (1)–(3) are 0.7767, 0.7767 and 0.7879 respectively, which are higher than that of the co-integration equation without breakpoints.

Equation (3) shows that the coefficient of elasticity between *W* and *P* from 1966 to 1989 is 0.8126, which shows that *P* has a great influence on *W*. That is, for each 1% increase in *P*, *W* increases by about 0.81% in the same period. From 1989 to 2002, the coefficient of elasticity between *W* and *P* is 0.5893, which is 0.2233 lower than before, so the effect of *P* on *W* is reduced. That is, for each 1% increase in *P*, *W* increases by about 0.59%. After 2002, the coefficient of elasticity between *W* and *P* is 0.9866, which is 0.3973 higher than before, so the influence of *P* on *W* increases. That is, for each 1% increase in *P*, *W* increases by about 0.99%.
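The arithmetic behind these comparisons can be checked with a short sketch. The three coefficients are taken from the text; reading them as constant elasticities, so that a small percentage change in *P* is scaled by the coefficient, is an assumption, and the helper name `runoff_change_pct` is hypothetical.

```python
# Elasticity of runoff W with respect to rainfall P in each regime.
elasticity = {
    "1966-1989": 0.8126,
    "1989-2002": 0.5893,
    "after 2002": 0.9866,
}

def runoff_change_pct(rain_change_pct, e):
    """Approximate % change in W for a small % change in P under elasticity e."""
    return e * rain_change_pct

# Change of the elasticity at each breakpoint.
drop_1989 = round(elasticity["1966-1989"] - elasticity["1989-2002"], 4)
rise_2002 = round(elasticity["after 2002"] - elasticity["1989-2002"], 4)

# Response of W to a 1% increase in P in each regime.
response = {period: runoff_change_pct(1.0, e) for period, e in elasticity.items()}
```

The two differences reproduce the 0.2233 decrease after 1989 and the 0.3973 increase after 2002 quoted in the text.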

These results show that there is a positive correlation between runoff and rainfall in the source area of the Yellow River. After the first break in 1989, the impact of rainfall on runoff decreased, and after the second break in 2002, the impact of rainfall on runoff increased.

#### Structure change co-integration test

The ADF test method is used to test the residual sequence of the above-mentioned structure change co-integration Equations (1)–(3). The results are shown in Tables 5–7.

Table 5. ADF test of the residual sequence of structure change co-integration Equation (1)

Test type (c,t) | ADF test value | 1% critical value | 5% critical value | 10% critical value | Prob | Stationary
---|---|---|---|---|---|---
(0,0) | −5.29770 | −2.62119 | −1.94889 | −1.61193 | 0.000 | yes
(c,0) | −5.23107 | −3.59662 | −2.93316 | −2.60487 | 0.000 | yes
(c,t) | −5.14066 | −4.19234 | −3.52079 | −3.19128 | 0.000 | yes

Table 6. ADF test of the residual sequence of structure change co-integration Equation (2)

Test type (c,t) | ADF test value | 1% critical value | 5% critical value | 10% critical value | Prob | Stationary
---|---|---|---|---|---|---
(0,0) | −5.29796 | −2.62119 | −1.94889 | −1.61193 | 0.000 | yes
(c,0) | −5.23092 | −3.59662 | −2.93316 | −2.60487 | 0.000 | yes
(c,t) | −5.14208 | −4.19234 | −3.52079 | −3.19128 | 0.000 | yes

Table 7. ADF test of the residual sequence of structure change co-integration Equation (3)

Test type (c,t) | ADF test value | 1% critical value | 5% critical value | 10% critical value | Prob | Stationary
---|---|---|---|---|---|---
(0,0) | −5.21142 | −2.62119 | −1.94889 | −1.61193 | 0.000 | yes
(c,0) | −5.14203 | −3.59662 | −2.93316 | −2.60487 | 0.000 | yes
(c,t) | −5.06006 | −4.19234 | −3.52079 | −3.19128 | 0.000 | yes

#### Construction of SCECM

The residual sequences of the above three structure change co-integration equations are stationary. Therefore, the SCECM can be constructed to understand the short-term dynamic relationships between rainfall and runoff, and thus the prediction accuracy is improved. In order to compare the SCECM and CECM, the SCECMs constructed by Equations (1)–(3) above are named model 1, model 2 and model 3 respectively. The structure change ECMs are constructed as follows.

The negative coefficients of *ecm*_{t−1} in models 1, 2 and 3 confirm that these models are in accord with the error correction mechanism. The *R*^{2} values are all greater than 0.8, which makes the models more explanatory. The *DW* values are all close to 2, so there is no autocorrelation in the residual sequences. Therefore, after the breakpoints are introduced, the models can reflect the long-term balance and short-term fluctuation relationships between rainfall and runoff in the source area of the Yellow River.

#### Runoff prediction with SCECM

The structure change models 1, 2 and 3 are used to simulate the annual runoff in the source area of the Yellow River from 1966 to 2009, and the annual runoff from 2010 to 2014 was predicted for verification. The fitting results are shown in Figures 6–8.

From Figures 6–8, it can be seen that the SCECMs fit the observed runoff better. The prediction period is 2010–2014. For model 1, the relative errors in the prediction period are less than 20% except for an error of 20.28% in 2012, and the overall MAPE of model 1 is 9.58%. For model 2, the relative errors in the prediction period are less than 20% except for an error of 20.33% in 2012, and the overall MAPE of model 2 is 9.58%. For model 3, the relative errors in the prediction period are all less than 20%, and the overall MAPE of model 3 is 10.06%. The overall MAPEs of models 1, 2 and 3 are all lower than the 10.45% of the CECM. Therefore, the fitting of the SCECM is better than that of the CECM.

### Model comparison

In order to compare the models, the mean square error (MSE) of each model is calculated, as shown in Table 8. It can be seen from Table 8 that the MSEs of the SCECMs are all smaller than that of the CECM, and that among the SCECMs the MSE of model 3 is the smallest. This means that model 3 is the most suitable of these models.

Table 8. MSE of each model

Type | CECM | SCECM model 1 | SCECM model 2 | SCECM model 3
---|---|---|---|---
MSE | 26.4712 | 25.2139 | 24.6478 | 23.8456

Moreover, the prediction errors of models 1, 2, 3 and CECM are compared. Table 9 shows the comparison of the relative errors of the SCECM and CECM.

Table 9. Relative errors of the CECM and SCECMs in the prediction period

Year | Relative error of CECM (%) | Relative error of model 1 (%) | Relative error of model 2 (%) | Relative error of model 3 (%)
---|---|---|---|---
2010 | 11.42 | 5.13 | 5.19 | 0.78
2011 | 18.10 | 3.36 | 3.30 | 2.65
2012 | 10.12 | 20.28 | 20.33 | 17.82
2013 | 1.01 | 14.03 | 14.12 | 9.61
2014 | 14.54 | 0.91 | 1.03 | 4.57
Mean | 11.04 | 8.74 | 8.79 | 7.08

Table 9 shows that the average relative errors of the SCECMs with breakpoints are all less than that of the CECM. The average relative error of model 3 is the smallest at 7.08%, which is 3.96 percentage points lower than that of the CECM. This shows that model 3, the state-switched structure change co-integration model, has the highest prediction accuracy of the four models.
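The means in Table 9 can be recomputed from the yearly entries. The sketch below copies the relative errors from the table; because those entries are already rounded to two decimals, a recomputed mean may differ from the published one in the last digit. The variable names are illustrative.

```python
# Yearly relative errors (%) for 2010-2014, copied from Table 9.
relative_errors = {
    "CECM":    [11.42, 18.10, 10.12, 1.01, 14.54],
    "model 1": [5.13, 3.36, 20.28, 14.03, 0.91],
    "model 2": [5.19, 3.30, 20.33, 14.12, 1.03],
    "model 3": [0.78, 2.65, 17.82, 9.61, 4.57],
}

# MAPE of the prediction period = mean of the yearly relative errors.
mean_error = {name: sum(v) / len(v) for name, v in relative_errors.items()}

# Model with the smallest mean relative error.
best = min(mean_error, key=mean_error.get)

# Gap between the published means of the CECM and model 3, in percentage points.
gap = round(11.04 - 7.08, 2)
```

The gap of 3.96 is a difference between two percentages, so it is expressed in percentage points and not as a relative reduction.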

Previous studies have also applied co-integration theory to runoff prediction and achieved good results (Chang & Liu 2005; Zhang *et al.* 2006, 2013). The innovation of this study is the introduction of structure change co-integration theory, which gives better prediction accuracy than the co-integration model without breakpoints.

## CONCLUSIONS

- (1) Using co-integration theory, the co-integration relationship between rainfall and runoff in the source area of the Yellow River is revealed. By constructing the CECM, the long-term balance and short-term fluctuation relationships between rainfall and runoff are revealed. The model shows that runoff is affected by rainfall and that the deviation of runoff from balance this year will be adjusted by 77.86% in the next year. The runoff in the source area of the Yellow River is predicted with a MAPE of 11.04% in the prediction period.

- (2) Using the endogenous structure breakpoint test method, the breakpoints of the runoff time series are determined to be 1989 and 2002. The test results reflect the profound impact of reservoir construction, a typical human activity, and of climate change on runoff in the source area of the Yellow River. The breakpoint in 1989 is mainly affected by reservoir construction, while the breakpoint in 2002 is affected by rainfall changes.

- (3) Using structure change co-integration theory, the change of the relationship between rainfall and runoff before and after the breakpoints is revealed. From the first break in 1989 the impact of rainfall on runoff decreased, but from 2002 it increased. By constructing the SCECM, the runoff in the source area of the Yellow River is predicted more accurately. The prediction MAPE of the state-switched SCECM is 7.08%, which is 3.96 percentage points lower than that of the CECM, so the prediction accuracy is clearly improved.

Therefore, the structure change co-integration approach can provide a technical reference for researchers in the field of hydrology.

## ACKNOWLEDGEMENTS

This research is supported by the National Key R&D Program of China (Grant No. 2018YFC0406501), Program for Innovative Talents (in Science and Technology) at University of Henan Province (Grant No. 18HASTIT014), and Foundation for University Youth Key Teacher of Henan Province (Grant No. 2017GGJS006).