Growing water problems require water resources managers to understand and predict the uncertain characteristics of river runoff. In this paper, the fluctuation periods and local features of runoff at multiple time scales are analyzed by the empirical mode decomposition (EMD) method. With the set pair analysis (SPA) method, the uncertainty of the runoff series at these time scales is expressed. Meanwhile, cointegration theory is introduced to describe the long-term equilibrium relationships between runoff series, and a runoff prediction model is then proposed based on the error correction model (ECM). The results show that the runoff series of the Heihe River in northwest China exhibit complex relations with different periodic fluctuations and changing laws. The identity degree is the main relation between the two runoff series, especially in the short period. Both the original series and the decomposed components are cointegrated, and the runoff prediction model based on the ECM can simulate and predict river runoff well.

## INTRODUCTION

Influenced by the meteorological system, underlying surface and human activities, the river runoff series presents uncertain characteristics. For river water management and regulation, mathematical statistics (including the stochastic analysis method, fuzzy analysis method, chaos theory, fractal theory, etc.) are widely used to predict river runoff based on statistical data series. In recent years, in order to explore the non-linear characteristics of hydrological phenomena, non-linear time series analysis methods such as wavelet analysis and artificial neural networks have become prevalent in the hydrological field and have produced many valuable results (Bazartseren *et al.* 2003; El-Shafie *et al.* 2007; Lee *et al.* 2010; Abiyev 2011; Inoussa *et al.* 2012; Kulwinder & Rashmi 2013; Longqin & Shuangyin 2013). Wavelet analysis can reveal the local features and fluctuations of random variables, but it is limited by false harmonics and by the choice of wavelet basis function, so some reported results are questionable. Huang *et al.* (1998) proposed the empirical mode decomposition (EMD) method to extract fluctuations and trends from the original time series. Applications of EMD have shown that it can obtain spectra with a higher resolution in both the time and frequency domains (Qu & Cheng 2006; Zhang *et al.* 2014).

There is a close hydraulic connection between the upstream and downstream runoff time series of a river. Although each series has its own changing laws and characteristics, the relationship between them is uncertain and ambiguous. This prevents the actual correlations of the runoff series from being revealed and poses a great challenge to river water management. Zhao & Xuan (1989) divided this uncertainty into three states: the identity, the discrepancy and the contrary, and proposed the set pair analysis (SPA) method to measure this uncertainty. Because SPA can evaluate the uncertainty of two random variables quantitatively, it has quickly gained widespread use. To account for both complex fluctuations and uncertainty, SPA and EMD have been combined to reveal the quantitative relations of hydrological variables at multiple time scales (Feng *et al.* 2009a, 2009b; Zhang *et al.* 2013a, 2013b).

Usually, a river runoff series is a non-stationary time series. However, it has often been assumed to be stationary, which produces ‘spurious regression’. To overcome this drawback, cointegration theory and the error correction model (ECM) were developed (Engle & Granger 1987). Cointegration theory was first used to describe the long-term equilibrium relationships between variables in an economic system (Han *et al.* 2004; Seung-Hoon 2007), and was then applied in the hydrological field to study the non-stationarity of hydrological time series (Chang & Liu 2005; Zhang *et al.* 2006, 2008). The ECM is an important manifestation of cointegration theory and effectively combines a static expression with dynamic characteristics. Moreover, some scholars (Xu *et al.* 2007; He *et al.* 2008) introduced the wavelet analysis method into cointegration theory and the ECM to establish a multi-resolution cointegration prediction model. But these studies have been confined to econometrics, while applications in the field of hydrology are rare. The current study therefore has the following objectives: (1) to reveal the periodic fluctuations and changing laws of two runoff series of the Heihe River in northwest China at multiple time scales; (2) to analyze the uncertain relations between these two runoff series at multiple time scales; (3) to establish a prediction model of river runoff according to the cointegration equilibrium relationships.

## MATHEMATICAL METHODS

### Outline of the mathematical methods

### EMD method

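The sifting process follows Huang *et al.* (1998) and is stopped by the criterion *SD*. Eventually, the original series is rewritten as (Huang *et al.* 1998):

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$

where $c_i(t)$ is the $i$-th IMF component and $r_n(t)$ is the residue. The criterion *SD* is defined as (Huang *et al.* 1998):

$$SD = \sum_{t=0}^{T} \left[ \frac{\left| h_{k-1}(t) - h_{k}(t) \right|^{2}}{h_{k-1}^{2}(t)} \right]$$

where $h_{k-1}(t)$ and $h_{k}(t)$ are two consecutive sifting results, *T* is the length of the time series, and the value of *SD* is between 0.2 and 0.3.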

By the EMD method, a given time series can be decomposed into several intrinsic mode function (IMF) components and one residue. Each IMF satisfies the following two conditions: (1) over the whole data range, the number of local extrema and the number of zero-crossings must be equal, or differ by at most 1; (2) at any point, the mean value of the upper envelope formed by all local maxima and the lower envelope formed by all local minima is zero.
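IMF condition (1) and the *SD* stopping criterion can be checked numerically. The following is a minimal sketch, not a full EMD implementation: it omits the envelope construction, the function names are hypothetical, and skipping zero-valued samples in the *SD* sum is an assumption made here to avoid division by zero.

```python
def count_extrema_and_zero_crossings(series):
    """Return (number of strict local extrema, number of zero-crossings)."""
    extrema = 0
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        # An interior point is an extremum if it is strictly above or below both neighbours.
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            extrema += 1
    crossings = 0
    for left, right in zip(series, series[1:]):
        # A sign change between consecutive samples counts as one zero-crossing.
        if left * right < 0:
            crossings += 1
    return extrema, crossings


def satisfies_imf_count_condition(series):
    """IMF condition (1): the two counts are equal or differ by at most 1."""
    extrema, crossings = count_extrema_and_zero_crossings(series)
    return abs(extrema - crossings) <= 1


def sifting_sd(h_prev, h_curr):
    """SD between two consecutive sifting results (Huang et al. 1998 form).

    Assumption: samples where h_prev is exactly zero are skipped.
    """
    return sum((a - b) ** 2 / (a * a) for a, b in zip(h_prev, h_curr) if a != 0)
```

A candidate component would be accepted as an IMF once `satisfies_imf_count_condition` holds and `sifting_sd` falls in the 0.2 to 0.3 range quoted above.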

### SPA method

A set pair is formed by two sets *A* and *B*. The properties of these two sets include identity degree, discrepancy degree, and contrary degree and are given as follows (Huang *et al.* 1998):

$$\mu = \frac{S}{N} + \frac{F}{N}\,i + \frac{P}{N}\,j = a + b\,i + c\,j \tag{5}$$

where $\mu$ is the connection degree of the set pair, *S* is the number of identity characteristics, *F* is the number of discrepancy characteristics, *P* is the number of contrary characteristics, *N* denotes the total number of characteristics of the set pair, $a = S/N$, $b = F/N$ and $c = P/N$ are the identity degree, the discrepancy degree and the contrary degree, *i* and *j* are the coefficients of the discrepancy degree and the contrary degree respectively, *i* is an uncertain value between −1 and 1, i.e. $i \in [-1, 1]$, and *j* is specified as −1.
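The connection degree can be evaluated directly from the counts *S*, *F* and *P*, using μ = *a* + *b·i* + *c·j* with *a* = *S*/*N*, *b* = *F*/*N*, *c* = *P*/*N*. The sketch below assumes *N* = *S* + *F* + *P*; the function names are hypothetical, and the default *i* = 0.5 is the value adopted later in this paper.

```python
def degrees_from_counts(S, F, P):
    """Identity, discrepancy and contrary degrees (a, b, c) from characteristic counts."""
    N = S + F + P  # assumption: every characteristic falls in exactly one class
    if N == 0:
        raise ValueError("the set pair has no characteristics")
    return S / N, F / N, P / N


def connection_degree(a, b, c, i=0.5, j=-1.0):
    """Connection degree mu = a + b*i + c*j, with i in [-1, 1] and j = -1."""
    if not -1.0 <= i <= 1.0:
        raise ValueError("i must lie in [-1, 1]")
    return a + b * i + c * j


# Example with a = 0.84, b = 0.13, c = 0.03 and i = 0.5:
mu = connection_degree(0.84, 0.13, 0.03)  # 0.84 + 0.065 - 0.03
```

A larger *i* gives more weight to the discrepancy term, so μ rises with *i* whenever *b* > 0.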

### ECM method

#### Cointegration theory

Cointegration describes the long-term equilibrium relationships among non-stationary time series. If an original time series is stationary, it is $I(0)$. If an original time series is not stationary but becomes stationary after the first differencing, it is said to be integrated of order 1 and denoted as $I(1)$. In the standard formulation, suppose $x_t$ and $y_t$ are two $I(1)$ time series and a linear combination of them is stationary, that is $y_t - \beta x_t \sim I(0)$. Then $x_t$ and $y_t$ are cointegrated, and $(1, -\beta)$ is called the cointegrated vector.

The stationarity of variables needs to be tested before the cointegration analysis. The widely used stationarity test is the unit root test based on the Augmented Dickey–Fuller (ADF) test (Dickey & Fuller 1979). Then, with the ordinary linear regression method, the residual series for the two variables is obtained. If the residual is stationary, i.e. $I(0)$, the two variables are cointegrated.
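The two steps above (unit root test, then a residual test after linear regression) can be sketched in plain Python. This is a simplified illustration, not the EVIEWS procedure used in this study: it computes the basic Dickey–Fuller statistic without a constant and without the lagged difference terms that the ADF test adds, and the function names are hypothetical. Residual-based cointegration tests normally use their own, more negative, critical values, so the threshold in the usage note is only indicative.

```python
import math


def ols_line(x, y):
    """Ordinary least squares fit y = alpha + beta * x; returns (alpha, beta, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return alpha, beta, residuals


def dickey_fuller_stat(series):
    """t-statistic of rho in dy_t = rho * y_{t-1} + e_t (no constant, no lagged differences)."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series, series[1:])]
    syy = sum(v * v for v in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / syy
    ssr = sum((d - rho * l) ** 2 for l, d in zip(lagged, diffs))
    dof = len(diffs) - 1  # one estimated parameter
    std_err = math.sqrt(ssr / dof / syy)
    return rho / std_err


def engle_granger_stat(x, y):
    """Step 2: Dickey-Fuller statistic of the residuals of y regressed on x."""
    _, _, residuals = ols_line(x, y)
    return dickey_fuller_stat(residuals)
```

A series would be judged stationary when its statistic is below the chosen critical value, for example `dickey_fuller_stat(series) < -1.9459` for the 5% level of the no-constant case.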

#### ECM method
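A two-variable ECM in the usual Engle & Granger (1987) two-step form can be sketched as follows. This is a minimal illustration under stated assumptions, not the model fitted in this study: the long-run relation is taken as y_t = α + β·x_t + u_t, the short-run equation as Δy_t = γ·Δx_t + λ·u_{t−1} + e_t with no intercept or lagged differences, and all names (`fit_ecm`, `predict_next`, `lam`) are hypothetical.

```python
def fit_ecm(x, y):
    """Two-step estimate for paired series x (e.g. upstream) and y (e.g. downstream)."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("x and y must have equal length of at least 4")
    # Step 1: long-run relation y_t = alpha + beta * x_t + u_t by OLS.
    mean_x, mean_y = sum(x) / n, sum(y) / n
    beta = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
            / sum((a - mean_x) ** 2 for a in x))
    alpha = mean_y - beta * mean_x
    u = [b - alpha - beta * a for a, b in zip(x, y)]
    # Step 2: short-run dynamics dy_t = gamma * dx_t + lam * u_{t-1} + e_t.
    dx = [x[t] - x[t - 1] for t in range(1, n)]
    dy = [y[t] - y[t - 1] for t in range(1, n)]
    u_lag = u[:-1]
    # Solve the 2x2 normal equations for (gamma, lam).
    s11 = sum(v * v for v in dx)
    s22 = sum(v * v for v in u_lag)
    s12 = sum(p * q for p, q in zip(dx, u_lag))
    r1 = sum(p * q for p, q in zip(dx, dy))
    r2 = sum(p * q for p, q in zip(u_lag, dy))
    det = s11 * s22 - s12 * s12
    gamma = (r1 * s22 - r2 * s12) / det
    lam = (r2 * s11 - r1 * s12) / det
    return {"alpha": alpha, "beta": beta, "gamma": gamma, "lam": lam}


def predict_next(params, x_prev, y_prev, x_next):
    """One-step prediction of y from the previous pair and the new x."""
    error_term = y_prev - params["alpha"] - params["beta"] * x_prev
    return y_prev + params["gamma"] * (x_next - x_prev) + params["lam"] * error_term
```

A negative `lam` means that a deviation from the long-run relation in one step is partly corrected in the next, which is the behaviour the error correction term is meant to capture.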

## APPLICATION

### Data series

The Heihe River basin, with an area of […] × 10^{4} km^{2} and a typical continental climate, is the second biggest inland river basin in northwest China (shown as Figure 2). It is located in the middle part of the Hexi Corridor (96°42′–102°00′E, 37°41′–42°42′N) and covers three provinces: Qinghai, Gansu and Inner Mongolia. Yingluoxia and Zhengyixia are the boundaries of the upper reaches, the middle reaches, and the lower reaches of the river basin. The middle reaches, between Yingluoxia and Zhengyixia, cover an area of 2.56 × 10^{5} km^{2} with a river length of 185 km.

In the middle reaches, irrigation agriculture in Zhangye City is well developed, and its water use accounts for 87% of the total water resources. The area is also an important grain production base of Gansu province, and it holds 88% of the population and produces 88% of the GDP of the Heihe River basin, so water is mainly consumed here. Below Zhengyixia, the rivers and lakes disappear, but the Ejinaqi Oasis in Inner Mongolia lies in this area. Moreover, the Jiuquan satellite launch center, an important national defense scientific research base in China, is located in the Ejinaqi Oasis. In recent years, due to the significantly reduced runoff discharge from Zhengyixia, the ecosystem in this area has deteriorated seriously. The duration and length of river drying tend to increase yearly, and the local groundwater level has fallen rapidly. Since the 1980s, forest land with vegetation coverage of more than 70% has decreased by 14 × 10^{3} m^{2} annually, and land desertification and sandstorm damage have intensified. According to the interpretation of aerial photos from the early 1960s and TM image data from the 1980s, the Gobi Desert area around the Ejinaqi Oasis with vegetation coverage of less than 10% has increased by about 462 km^{2}, an average annual increase of 23.1 km^{2}. Therefore, the water-use conflict between economic and social development in the middle reaches and ecosystem protection in the lower reaches has existed for a long time.

To alleviate the water-use conflict between the middle and lower reaches of the Heihe River basin, China's State Council approved the project of ‘water resources allocation in the main stream of Heihe River’ in 1997 to regulate the Heihe River water. In particular, it states that ‘when the annual average runoff of Yingluoxia is 15.8 × 10^{8} m^{3}, the allocated discharge runoff of Zhengyixia is 9.5 × 10^{8} m^{3}; when the runoff of Yingluoxia is 17.1 × 10^{8} m^{3} with 25% guaranteed frequency, the allocated discharge runoff of Zhengyixia is 10.9 × 10^{8} m^{3}. Moreover, for the year with poor rain, the water allocation should consider water use demand of both provinces and water saving measures of Gansu province as well.’

### Uncertainty analysis of runoff series

For the IMF1 component, a quasi-periodic fluctuation of 3 years is present in both decomposed runoff series. For the IMF2 component, a quasi-periodic fluctuation of 6 to 9 years is present in the runoff series of Yingluoxia and of 5 to 7 years in that of Zhengyixia. The largest fluctuations in both runoff series occur before the 1970s and are similar. After that, the fluctuation in the runoff series of Zhengyixia decreases and remains stable, while that of Yingluoxia remains higher, although it tends to decrease in the mid-1990s.

For the IMF3 component, a quasi-periodic fluctuation of 9 to 11 years is presented in the runoff series of Yingluoxia. The largest fluctuation appears in the early stage, then it begins to decrease, and then increases again. In the runoff series of Zhengyixia, a quasi-periodic fluctuation of 9 to 12 years exists. The fluctuations are larger except from the early 1960s to the middle 1980s. Similarly to the IMF2 component, the quasi-periodic fluctuation of the IMF3 component corresponds to that of solar activity, which leads to the climate changes and also further induces the runoff variation.

For the IMF4 component, a quasi-periodic fluctuation of 17 years is present in the runoff series of Yingluoxia and of 14 years in that of Zhengyixia. These two runoff series have similar fluctuations. The residue component indicates that the overall trends of the two runoff series are opposite: the runoff of Yingluoxia first increases and then decreases, while the runoff of Zhengyixia first decreases and then increases. In general, the runoff series of Yingluoxia and Zhengyixia alternate between rich and poor states.

The residues in the runoff series of Yingluoxia and Zhengyixia have no periodic fluctuation, so four set pairs are formed from the four decomposed IMF components. For simplicity, the mean-standard deviation method is adopted to classify these IMF components into three states: the poor, the normal, and the rich. The corresponding ranges are (−∞, *EX* − 0.5*d*), [*EX* − 0.5*d*, *EX* + 0.5*d*], and (*EX* + 0.5*d*, +∞) respectively, where *EX* is the mean value and *d* is the standard deviation. The value of *i* is specified as 0.5, so according to Equation (5), the identity degree *a*, the discrepancy degree *b*, the contrary degree *c*, and the connection degree *μ* are calculated as shown in Table 1.

| Set pairs | *a* | *b* | *c* | *μ* |
|---|---|---|---|---|
| IMF1 | 0.84 | 0.13 | 0.03 | 0.87 |
| IMF2 | 0.68 | 0.32 | 0.00 | 0.84 |
| IMF3 | 0.47 | 0.42 | 0.11 | 0.56 |
| IMF4 | 0.76 | 0.23 | 0.02 | 0.85 |


It is assumed that a change period of 3 years is a short period, 5 to 9 years is a middle period, 9 years or 12 years is a mid-long period, and 14 years or 17 years is a long period. From Table 1, it can be seen that the identity degree is the largest of the three degrees in all four IMF components, whatever the fluctuation period. In the IMF1 component in particular, the identity degree has the largest value of 0.84, which is consistent with the connection degree, also the largest at 0.87. This means that the runoff series in Yingluoxia and Zhengyixia have similar characteristics in the short period and that the correlation between them is also stronger.

Only the identity degree and discrepancy degree exist in the IMF2 component (the contrary degree is 0.00), and its connection degree differs little from those of the IMF1 and IMF4 components. In the IMF3 component, the identity degree is only 0.47, while the discrepancy degree and the contrary degree are both larger than in the other IMF components, so the runoff series in Yingluoxia and Zhengyixia reveal greater uncertainty in the mid-long period. This is also demonstrated by the minimum connection degree of 0.56. After the IMF1 component, the identity degree and the connection degree of the IMF4 component are the second largest, so a close relationship between the two runoff series is also shown in the long period.
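The classification and counting behind the degrees in Table 1 can be sketched as follows. The thresholds *EX* ± 0.5*d* come from the text; the rest is an assumption made for illustration: identical states are counted as identity, adjacent states (for example poor and normal) as discrepancy, and opposite states (poor and rich) as contrary, and the population standard deviation is used for *d*. The function names are hypothetical.

```python
import statistics


def classify_states(series, k=0.5):
    """Map each value to 0 (poor), 1 (normal) or 2 (rich) using mean +/- k * std."""
    mean = statistics.mean(series)
    std = statistics.pstdev(series)  # assumption: population standard deviation
    low, high = mean - k * std, mean + k * std
    states = []
    for value in series:
        if value < low:
            states.append(0)
        elif value > high:
            states.append(2)
        else:
            states.append(1)
    return states


def set_pair_degrees(series_a, series_b, k=0.5):
    """Return (a, b, c): identity, discrepancy and contrary degrees of two series."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same length")
    states_a = classify_states(series_a, k)
    states_b = classify_states(series_b, k)
    n = len(states_a)
    same = sum(1 for p, q in zip(states_a, states_b) if p == q)
    adjacent = sum(1 for p, q in zip(states_a, states_b) if abs(p - q) == 1)
    opposite = sum(1 for p, q in zip(states_a, states_b) if abs(p - q) == 2)
    return same / n, adjacent / n, opposite / n
```

Applied to the same IMF component of the two stations, the three returned values would then be combined into the connection degree with *i* = 0.5 and *j* = −1.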

### Runoff prediction analysis

With the runoff series of Yingluoxia and Zhengyixia from 1945 to 1994, the prediction model based on the ECM is established. The original runoff series of Yingluoxia and Zhengyixia and their five decomposed components are indexed by *j* (*j* = 0 for the original series and *j* = 1, 2, 3, 4, 5 for the decomposed components). With the EVIEWS software, the stationarity of each of these runoff series, together with the first difference of the original runoff series, is tested by the ADF unit root test method (shown as Table 2). It can be seen that although the original runoff series of Yingluoxia and Zhengyixia are not stationary, they become stationary after the first differencing. For the decomposed components, the ADF test statistics are all less than the test critical values at the 1%, 5%, and 10% significance levels, so they are all stationary.

| Time series | Variables | ADF test statistic | Critical value (1% level) | Critical value (5% level) | Critical value (10% level) | Stationary or not |
|---|---|---|---|---|---|---|
| The original | | −0.402447 | −2.6013 | −1.9459 | −1.6186 | No |
| | | −0.791677 | −2.6013 | −1.9459 | −1.6186 | No |
| The original (first difference) | | −9.761599 | −2.6019 | −1.9460 | −1.6187 | Yes |
| | | −8.984184 | −2.6013 | −1.9459 | −1.6186 | Yes |
| The IMF1 component | | −7.549249 | −2.6013 | −1.9459 | −1.6186 | Yes |
| | | −6.903153 | −2.6013 | −1.9459 | −1.6186 | Yes |
| The IMF2 component | | −19.04086 | −2.6013 | −1.9459 | −1.6186 | Yes |
| | | −17.67327 | −2.6013 | −1.9459 | −1.6186 | Yes |
| The IMF3 component | | −17.96059 | −2.6013 | −1.9459 | −1.6186 | Yes |
| | | −5.908887 | −2.6013 | −1.9459 | −1.6186 | Yes |
| The IMF4 component | | −16.19274 | −2.6013 | −1.9459 | −1.6186 | Yes |
| | | −42.98358 | −2.6013 | −1.9459 | −1.6186 | Yes |
| The IMF5 component | | −25.37065 | −2.6013 | −1.9459 | −1.6186 | Yes |
| | | −22.02825 | −2.6013 | −1.9459 | −1.6186 | Yes |


Using the established runoff prediction model, the runoff series of Zhengyixia from 1996 to 2006 was predicted, as shown in Table 3. Compared with the observed values, the largest relative error of prediction is 10.36% in 2000, and all others are less than 10% in absolute value. This provides a new way to predict river runoff.

| Year | Predicted value | Observed value | Relative error (%) |
|---|---|---|---|
| 1996 | 9.86 | 9.53 | −3.44 |
| 1997 | 5.59 | 5.15 | −8.62 |
| 1998 | 11.46 | 11.23 | −1.97 |
| 1999 | 6.69 | 7.03 | 4.83 |
| 2000 | 5.91 | 6.59 | 10.36 |
| 2001 | 5.12 | 5.02 | −2.08 |
| 2002 | 8.06 | 8.67 | 7.06 |
| 2003 | 10.54 | 10.50 | −0.38 |
| 2004 | 6.21 | 6.45 | 3.78 |
| 2005 | 10.48 | 11.13 | 5.81 |
| 2006 | 10.49 | 11.46 | 8.52 |

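The relative errors in Table 3 can be recomputed from the tabulated predicted and observed values. The sketch below assumes the sign convention implied by the table, (observed − predicted) / observed, so a negative error means over-prediction. Because the tabulated values are rounded to two decimals, the recomputed errors may differ from the published ones in the second decimal place.

```python
# (predicted, observed) runoff of Zhengyixia copied from Table 3.
TABLE_3 = {
    1996: (9.86, 9.53),
    1997: (5.59, 5.15),
    1998: (11.46, 11.23),
    1999: (6.69, 7.03),
    2000: (5.91, 6.59),
    2001: (5.12, 5.02),
    2002: (8.06, 8.67),
    2003: (10.54, 10.50),
    2004: (6.21, 6.45),
    2005: (10.48, 11.13),
    2006: (10.49, 11.46),
}


def relative_error_percent(predicted, observed):
    """Relative error in percent; positive when the model under-predicts."""
    return (observed - predicted) / observed * 100.0


errors = {year: relative_error_percent(p, o) for year, (p, o) in TABLE_3.items()}
# Year with the largest absolute relative error.
worst_year = max(errors, key=lambda year: abs(errors[year]))
```

With these inputs `worst_year` is 2000, the only year whose absolute error exceeds 10%.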

## DISCUSSION

The results show that various periodic fluctuations exist at the multiple time scales of the runoff of Yingluoxia and Zhengyixia, which reflects the very complex random characteristics of river runoff. The quasi-periodic fluctuations have different periods: they change strongly in short-term periods but become stable in long-term periods. In short-term periods, the periodic fluctuations show the detailed changes of river runoff, such as high and low variation. In long-term periods, they indicate the development trend. So it can be said that the periodic fluctuations in short-term periods reveal micro-characteristic changes and those in long-term periods reflect the macro-development trend. The periodic fluctuations in short periods are embedded in those in mid-long periods. Moreover, Zhang & Xue (1994) showed that quasi-periodic fluctuations of 3.5 years and 4 to 8 years exist in the El Niño phenomenon. The periods of the IMF1 and IMF2 components in the runoff series of Yingluoxia and Zhengyixia are therefore in agreement with those of the El Niño phenomenon, which suggests that the runoff changes of Yingluoxia and Zhengyixia in short-term periods are controlled by the El Niño phenomenon.

The uncertainty analysis of the two runoff series shows that the identity degree is their main relation, especially in short periods. The IMF1 component has an important influence on the relationships and characteristics between the original runoff series of Yingluoxia and Zhengyixia. The overall trends of these two runoff series are reversed, which may be a useful indicator for long-term streamflow forecasting. In particular, the runoff changes in short periods can basically represent the average state of the original runoff series. So, the short-term forecast accuracy of the runoff should be enhanced for the rational regulation and planning of river water resources.

The runoff prediction model based on the ECM can reveal the local features of runoff series on different time scales, and also exhibit the cointegration relationship of the original and decomposed runoff time series, so the constructed runoff prediction model can capture more information within the variables to depict their relations in detail. Compared with other runoff prediction methods (Islam & Kothari 2000; Nayak *et al.* 2004; Collischonn *et al.* 2007), it is simple, but effective. The prediction results show that not only the simulated value is in agreement with the observed value, but also the prediction of runoff has a high accuracy. For a river without too much other observational data in practical application, the runoff in the lower reaches can be predicted only by the runoff in the upper reaches with rational prediction accuracy, which can be regarded as a good method.

The study has some limitations. First, for simplicity, the coefficients (such as *i*, *j*, etc.) in the SPA method are fixed at values that denote the general state, and different values of the coefficients might result in different connection degrees. Second, river runoff is influenced by the meteorological system, the underlying surface and human activities, so the prediction results would be more accurate if more of these influencing variables were considered. Finally, the prediction model is based on linear relations of river runoff. This means the study has assumed that linear relations exist between the runoff series in the lower reaches and the upper reaches even though they present nonlinear relations in practice. All of these are directions for future research. Even so, the study establishes the uncertainty relations between river runoffs at multiple time scales, and based on that, the proposed runoff prediction method achieves reasonable results.

## CONCLUSIONS

The annual runoff series of Yingluoxia and Zhengyixia in the Heihe River basin are found to have a complex relationship from 1945 to 2006. Periodic fluctuations exist in their different decomposed time series, but their development trends present completely opposite directions.

The uncertain relations between the runoff series of Yingluoxia and Zhengyixia can be described as the identity, the discrepancy and the contrary. But not all of them exist in their different IMF components. The identity degree is the main relation of these two runoff series, especially in short periods. The IMF1 component has an important influence on the relationships and characteristics between the original runoff series of Yingluoxia and Zhengyixia.

The long-term cointegration equilibrium relationships exist both in their original time series and in their decomposed time series. Also, with the application of these cointegration relationships, the river runoff can be effectively predicted. The case study shows that the predicted results have reasonable prediction accuracy, and it provides a new way to predict the river runoff.

## ACKNOWLEDGEMENTS

This research is supported by the National Natural Sciences Foundation of China (Grant No. 51309202), Outstanding Young Talent Research Fund of Zhengzhou University (Grant No. 1521323002), the National Key Research and Development Plan (Grant No. 2016YFC0401407), Program for Innovative Research Team (in Science and Technology) in University of Henan Province (No. 13IRTSTHN030) and the Open Laboratory of Water Conservancy and Science of Key Disciplines in Henan Province.