## Abstract

To reveal the multi-time-scale characteristics of rainfall, runoff and sediment in the source area of the Yellow River and to improve the accuracy of annual runoff forecasts, the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method is introduced to decompose the measured rainfall, runoff and sediment data series of the Tangnaihai hydrological station in the source area of the Yellow River, China. Based on co-integration theory, two new error correction models (ECMs) for forecasting annual runoff in the source area of the Yellow River are constructed. Applying the two methods together addresses the pseudo-regression problem caused by the nonlinearity and non-stationarity of hydrological time series. The results show that rainfall, runoff and sediment in the source area of the Yellow River vary on multiple time scales and that the component sequences are co-integrated. Of the two new ECMs, the CEEMDAN component ECM forecasts more accurately than the original sequence ECM. The relative error of every forecasted value is less than 15% except for 2009, and the accuracy reaches level A.

## HIGHLIGHTS

Research on the multi-time-scale variation of hydrological variables reveals their multi-periodic behaviour and provides a scientific basis for the rational development of water resources.

Handling the non-stationarity and nonlinearity of hydrological variables avoids spurious regression and makes the results more accurate.

Study on the co-integration relationship of rainfall, runoff and sediment.

Study on multi-time scale dynamic relationship among rainfall, runoff and sediment.

Multi-time scale prediction of river runoff provides a technical reference for the effective protection and scientific operation of water resources.

## INTRODUCTION

Rainfall, runoff and sediment are important hydrological variables with complex interrelationships in the source area of the Yellow River. Accurately characterizing the changes in these variables is vital for understanding changes in water resources throughout the Yellow River Basin (Chen & Guo 2016; Wang *et al.* 2017, 2018a). The analysis of rainfall, runoff and sediment is also an active research topic worldwide (Ramana *et al.* 2013; Ling *et al.* 2017; Li & Liu 2018; Wang *et al.* 2018b). Many studies have examined the relationship between rainfall and runoff (Nastiti *et al.* 2018; Tarasova *et al.* 2018; Chu *et al.* 2019), and many others the relationship between runoff and sediment (Hou *et al.* 2013; Zhang *et al.* 2014; Wang *et al.* 2015a, 2015b). However, studies that consider all three variables together are relatively few.

Runoff forecasting has long been an active topic in hydrology (Zhang *et al.* 2017a, 2017b; Zhao *et al.* 2017). Forecasting runoff from historical data supports the rational development and utilization of runoff resources and is important for the planning, construction and scheduling of water conservancy projects (Xie *et al.* 2019). At present, river runoff forecasts often assume that the time series is stationary. However, under the influence of climate change, underlying surface conditions and human activities, the statistical characteristics of hydrological time series change over time, so most hydrological time series are nonlinear and non-stationary (Zhang *et al.* 2013a, 2013b). With the development of computer technology, many researchers have used soft computing techniques to study hydrological variables with high accuracy (Rezaie-Balf *et al.* 2017; Mosavi *et al.* 2018; Guru & Jha 2019). Common runoff forecast models include the artificial neural network (ANN) model (Meng *et al.* 2015; Sezen & Partal 2019), the support vector regression (SVR) model (Yaseen *et al.* 2018; Wu *et al.* 2019) and the autoregressive moving average (ARMA) model (Wang *et al.* 2015a, 2015b, 2019). Applying these models to non-stationary time series can lead to pseudo-regression (Lee & Yu 2009; Jin *et al.* 2017), which makes the resulting simulations and forecasts of hydrological elements unreliable.

Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) is an effective method for dealing with nonlinear and non-stationary time series (Torres *et al.* 2011). It is an improvement on empirical mode decomposition (EMD) and ensemble empirical mode decomposition (EEMD) (Huang *et al.* 1998; Wu & Huang 2009). The EMD and EEMD methods are widely used in hydrology and water resources (Zhang *et al.* 2013a, 2013b, 2019a, 2019b; Ouyang *et al.* 2016; Adarsh & Reddy 2018). However, the EMD method suffers from mode mixing, and the EEMD method leaves residual noise. The CEEMDAN method addresses both problems (Colominas *et al.* 2014) and has been applied in many fields (Antico *et al.* 2016; El Bouny *et al.* 2019).

Co-integration theory was proposed by Engle & Granger (1987). It can handle the non-stationarity of time series and reveal both the long-term equilibrium and the short-term fluctuations between variables. It is now widely used in econometrics and in hydrology (Yoo 2007; Zhang *et al.* 2015). The theory can also be combined with other data analysis methods to improve the accuracy of calculation (Zhang *et al.* 2017a, 2017b, 2019a, 2019b).

The innovation of this paper is to combine the CEEMDAN method with co-integration theory to construct a three-variable CEEMDAN co-integration error correction model (ECM) for rainfall, runoff and sediment in the source area of the Yellow River, China, and to use it to forecast river runoff. First, CEEMDAN is used to decompose rainfall, runoff and sediment on multiple time scales, to identify their variation patterns and multiple periodicities, and to obtain the corresponding stationary time series. Second, co-integration theory is used to reveal the long-term equilibrium and short-term fluctuation relationships of the original and component time series of rainfall, runoff and sediment in the source area of the Yellow River, and to clarify how the variables influence one another. Finally, two new ECMs of rainfall, runoff and sediment, the CEEMDAN component ECM and the original sequence ECM, are constructed to forecast river runoff.

## STUDY METHODS AND STEPS

### CEEMDAN method

The CEEMDAN method is a time-frequency domain analysis method. By adding adaptive noise, it further suppresses mode mixing, and it has strong adaptability and good convergence. The CEEMDAN method is usually used to deal with nonlinear and non-stationary time series.

The CEEMDAN algorithm steps are as follows:
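
As a hedged sketch, assuming the standard formulation of Torres *et al.* (2011) rather than notation confirmed by this paper, the steps can be written as follows. Here $w_i(t)$ is the $i$th white-noise realization, $I$ the number of realizations, $\varepsilon_k$ the noise amplitude at stage $k$, and $E_k(\cdot)$ the operator that returns the $k$th mode obtained by EMD; all four symbols are notation assumed for illustration.

```latex
\begin{align}
% Step 1: add white noise I times and average the first EMD mode
x_i(t) &= x(t) + \varepsilon_0 w_i(t), \qquad i = 1, 2, \ldots, I \\
IMF_1(t) &= \frac{1}{I} \sum_{i=1}^{I} E_1\big(x_i(t)\big) \\
% Step 2: first residual
r_1(t) &= x(t) - IMF_1(t) \\
% Step 3: for k = 1, 2, ..., add the k-th EMD mode of the noise to the residual
IMF_{k+1}(t) &= \frac{1}{I} \sum_{i=1}^{I} E_1\big(r_k(t) + \varepsilon_k E_k(w_i(t))\big) \\
r_{k+1}(t) &= r_k(t) - IMF_{k+1}(t)
\end{align}
```

Step 3 is repeated until the residual meets a stopping condition.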

In the formula, the first IMF component is obtained by EMD of the signal $x_i(t)$, that is, the signal with white noise added for the $i$th time. The decomposition stops when the residual $r_n(t)$ satisfies one of the following conditions: (1) it cannot be further decomposed by EMD; (2) it meets the IMF conditions; (3) the number of its local extreme points is less than 3. Finally, the original signal $x(t)$ can be decomposed into $n$ IMF components and a trend term $r_n(t)$:

$$x(t) = \sum_{k=1}^{n} IMF_k(t) + r_n(t)$$

### Co-integration theory

#### Co-integration concept

Co-integration describes the long-term equilibrium relationship between time series. If a time series is non-stationary but becomes stationary after differencing $d$ times, it is said to be integrated of order $d$, denoted $I(d)$. If the time series itself is stationary, it is denoted $I(0)$. Consider two time series $X_t$ and $Y_t$. If the following conditions are met:

- (1) $X_{it} \sim I(d)$ and $Y_{it} \sim I(d)$ ($i = 1, 2, \ldots, n$), where $d$ is an integer;
- (2) there is a constant $\beta$ such that $Y_t - \beta X_t \sim I(0)$;

then $X_t$ and $Y_t$ are co-integrated, and $\beta$ is called the co-integration vector.

#### Stationary test

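The ADF unit root test is used to test stationarity. It is based on the regression

$$\Delta y_t = \alpha + \beta t + \delta y_{t-1} + \sum_{i=1}^{p} \zeta_i \Delta y_{t-i} + \varepsilon_t$$

In the formula, $\Delta y_t$ is the first-order difference of variable $y_t$; $\alpha$, $\beta$, $\delta$ and $\zeta_i$ are parameters; $t$ is the time; $p$ is the lag order; and $\varepsilon_t$ is a white noise process.

#### Co-integration test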

The Engle–Granger (E–G) two-step method, proposed by Engle & Granger (1987), is a common method for testing the co-integration relationship between time series.

The first step of this method is to regress the variables by ordinary least squares (OLS) and obtain the residual sequence.

The second step is to test the stationarity of time series with the ADF unit root test on the residual sequence obtained in the first step. If the residual sequence is stationary, it is proved that the variables are co-integrated.
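
The two steps can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the implementation used in this study: the data are synthetic stand-ins for rainfall (`x`), sediment (`y`) and runoff (`z`); the helper names `ols` and `df_statistic` are hypothetical; and the unit root statistic is the simplest Dickey–Fuller form, with no constant, no trend and no lagged differences, rather than a full ADF test with AIC-selected lags.

```python
import random


def ols(y, columns):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k, n = len(columns), len(y)
    a = [[sum(columns[r][t] * columns[c][t] for t in range(n)) for c in range(k)]
         for r in range(k)]
    b = [sum(columns[r][t] * y[t] for t in range(n)) for r in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [a[r][c] - f * a[col][c] for c in range(k)]
                b[r] -= f * b[col]
    return [b[r] / a[r][r] for r in range(k)]


def df_statistic(u):
    """t-ratio of delta in the regression diff(u)_t = delta * u_(t-1) + e_t."""
    lagged = u[:-1]
    diff = [u[t] - u[t - 1] for t in range(1, len(u))]
    sxx = sum(v * v for v in lagged)
    delta = sum(l * d for l, d in zip(lagged, diff)) / sxx
    resid = [d - delta * l for l, d in zip(lagged, diff)]
    s2 = sum(e * e for e in resid) / (len(diff) - 1)
    return delta / (s2 / sxx) ** 0.5


# Synthetic data (assumption): two random walks and a series tied to them
random.seed(1)
n = 200
x, y = [0.0], [0.0]
for _ in range(n - 1):
    x.append(x[-1] + random.gauss(0, 1))  # random walk, I(1)
    y.append(y[-1] + random.gauss(0, 1))  # random walk, I(1)
z = [2.0 + 0.5 * xi + 0.1 * yi + random.gauss(0, 0.5) for xi, yi in zip(x, y)]

# Step 1: OLS regression of z on a constant, x and y; keep the residuals
coef = ols(z, [[1.0] * n, x, y])
resid = [zi - coef[0] - coef[1] * xi - coef[2] * yi for zi, xi, yi in zip(z, x, y)]

# Step 2: unit root statistic of the residual sequence
stat = df_statistic(resid)
print(coef, stat)
```

A strongly negative `stat` indicates a stationary residual sequence; in practice the statistic is compared with the critical value that matches the chosen test type.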

#### Error correction model

If the time series is co-integrated, an ECM can be constructed. This model describes the long-term equilibrium and short-term fluctuations between variables, and the modeling steps are as follows:
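
As a sketch under assumed notation, a three-variable ECM of this kind is typically built in two stages. The coefficient names $a_0$, $a_1$, $a_2$, $\beta_0$, $\beta_1$ and $\beta_2$ are placeholders introduced here for illustration; $z_t$, $x_t$ and $y_t$ stand for runoff, rainfall and sediment as in the later sections, and $ecm_{t-1}$ corresponds to the term written *ecm*(−1).

```latex
\begin{align}
% Stage 1: long-term equilibrium (co-integration) regression estimated by OLS
z_t &= a_0 + a_1 x_t + a_2 y_t + u_t \\
% The lagged residual of Stage 1 is the error correction term
ecm_{t-1} &= z_{t-1} - a_0 - a_1 x_{t-1} - a_2 y_{t-1} \\
% Stage 2: short-term dynamics with error correction
\Delta z_t &= \beta_0 + \beta_1 \Delta x_t + \beta_2 \Delta y_t + \varphi \, ecm_{t-1} + \varepsilon_t
\end{align}
```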

In the formula, the intercept is a constant term; the coefficients of the difference terms of each variable reflect the short-term dynamic changes of the model; *ecm*(−1) is the error correction term, which reflects how far the previous period deviated from the long-term equilibrium during short-term fluctuations; $\varphi$ is the correction coefficient, also called the adjustment speed, and is usually negative; and $\varepsilon_t$ is a white noise sequence.

### Study steps

By using the CEEMDAN method, the time series of rainfall, runoff and sediment in the source area of the Yellow River are decomposed to obtain IMF component sequences at different time scales. Furthermore, the co-integration theory is used to construct the ECM for the original time series (ECM-OTS) and the CEEMDAN component sequences (ECM-CEEMDAN), and then, the runoff is forecasted by ECM-OTS and ECM-CEEMDAN, respectively. Finally, the runoff forecasted value of each IMF component is reconstructed to get the runoff forecasted value of ECM-CEEMDAN, and the fitting value and forecast accuracy of these two ECM models are compared to draw a conclusion. The flow chart of study steps is shown in Figure 1.

## RESULTS AND DISCUSSION

### Data source

The source area of the Yellow River refers to the area above the Tangnaihai hydrological station, which is located in the northeast of the Qinghai-Tibet Plateau of China. It lies between 95°50′–103°30′ E and 32°10′–36°05′ N (as shown in Figure 2), the basin area is 122,000 km², and the average annual runoff is 20.37 billion m³. The water is supplied mainly by rainfall, followed by glacier and snow meltwater and groundwater. Changes in runoff in the source area of the Yellow River have a vital influence on water resources in the whole Yellow River Basin.

The measured rainfall, runoff and sediment time series from 1966 to 2013 at Tangnaihai hydrological station were obtained from the Bureau of Meteorology and the Bureau of Hydrology and Water Resources and are shown in Figure 3.

Table 1 shows the statistical characteristics of the rainfall, runoff and sediment time series. The mean value is 556.516 mm for rainfall, 203.885 × 10⁸ m³ for runoff and 1,277.245 × 10⁴ t for sediment. The standard deviation of runoff is smaller than those of rainfall and sediment. The coefficient of variation and the skewness coefficient are both largest for sediment.

Table 1. Statistical characteristics of the rainfall, runoff and sediment time series

| Time series | Mean | Standard deviation | Coefficient of variation | Skewness coefficient |
|---|---|---|---|---|
| Rainfall (mm) | 556.516 | 58.464 | 0.105 | 0.206 |
| Runoff (10⁸ m³) | 203.885 | 55.006 | 0.270 | 0.659 |
| Sediment (10⁴ t) | 1,277.245 | 812.960 | 0.636 | 1.534 |
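
As a quick consistency check, the coefficient of variation is the standard deviation divided by the mean. The short sketch below recomputes it from the means and standard deviations listed in Table 1; the dictionary and function names are illustrative only.

```python
# (mean, standard deviation) as listed in Table 1
table1 = {
    "rainfall": (556.516, 58.464),
    "runoff": (203.885, 55.006),
    "sediment": (1277.245, 812.960),
}


def coefficient_of_variation(mean, std):
    """Coefficient of variation: standard deviation relative to the mean."""
    return std / mean


cv = {name: round(coefficient_of_variation(m, s), 3) for name, (m, s) in table1.items()}
print(cv)  # {'rainfall': 0.105, 'runoff': 0.27, 'sediment': 0.636}
```

The recomputed values agree with the tabulated coefficients of variation, and sediment is by far the most variable of the three series.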

### CEEMDAN decomposition

The CEEMDAN method is used to decompose the time series of rainfall, runoff and sediment in the source area of the Yellow River for multi-time scales. The decomposition results are shown in Figures 4–6.

With the CEEMDAN method, the annual runoff, rainfall and sediment data series at Tangnaihai hydrological station from 1966 to 2013 are each decomposed into five components: four IMF components and one residual. These reflect the multi-time-scale evolution of rainfall, runoff and sediment in the source area of the Yellow River. The IMF1 component of each variable has the shortest period and the highest frequency; the periods of the other components become progressively longer and their frequencies progressively lower. The periodic changes of the component time series are shown in Table 2.

Table 2. Periodic changes of the component time series (period in years for the IMFs; trend for Res)

| Component time series | Rainfall | Runoff | Sediment |
|---|---|---|---|
| IMF1 | 2–5 | 2–5 | 2–5 |
| IMF2 | 5–8 | 6–9 | 5–10 |
| IMF3 | 9–11 | 29–30 | 11–30 |
| IMF4 | 28 | 32 | 41 |
| Res | First decreases, then increases | Decreases | Decreases |

Table 2 shows that rainfall, runoff and sediment each have four periodic components. All three share the same short period of 2–5 years. In the medium period the three differ only slightly: 5–8 years for rainfall, 6–9 years for runoff and 5–10 years for sediment. The medium-long period differs considerably: 9–11 years for rainfall, 29–30 years for runoff and a wide span of 11–30 years for sediment. On the long-period scale, the period is 28 years for rainfall, 32 years for runoff and 41 years for sediment. The residual component shows the overall nonlinear trend of each variable. The residual component of rainfall decreased from 1966 to 1981 and increased from 1982 to 2013, whereas the residual components of runoff and sediment both showed a decreasing trend. Rainfall, runoff and sediment therefore all have complex multi-time-scale periodic behaviour, but they correlate well in the short and medium periods. Runoff and sediment have different periods at the medium-long and long scales, while their residual components are well synchronized.

### Co-integration analysis

#### Stationary test

The OTS and the CEEMDAN components of rainfall, runoff and sediment in the source area of the Yellow River are tested with the unit root test. Let $x_i$, $z_i$ and $y_i$ ($i$ = 0, 1, 2, 3, 4, 5) represent rainfall, runoff and sediment, respectively, where $x_0$, $z_0$ and $y_0$ are the original sequences and $i$ = 1, …, 5 index the CEEMDAN components (IMF1–IMF4 and the residual). The optimal lag order is determined by the Akaike information criterion (AIC), and the unit root test results are given in Table 3.

Table 3. Unit root test results of the original and component time series

| Time series | Variable | ADF value | Test type (c, t, k) | Critical value (1%) | Critical value (5%) | Critical value (10%) | Stationary or not |
|---|---|---|---|---|---|---|---|
| The original | $x_0$ | −0.5138 | (0, 0, 3) | −2.6186 | −1.9485 | −1.6121 | No |
| | $y_0$ | −0.8657 | (0, 0, 3) | −2.6186 | −1.9485 | −1.6121 | No |
| | $z_0$ | −0.3647 | (0, 0, 3) | −2.6186 | −1.9485 | −1.6121 | No |
| | $\Delta x_0$ | −7.4505 | (0, 0, 2) | −2.6186 | −1.9485 | −1.6121 | Yes |
| | $\Delta y_0$ | −6.2861 | (0, 0, 2) | −2.6186 | −1.9485 | −1.6121 | Yes |
| | $\Delta z_0$ | −6.5489 | (0, 0, 2) | −2.6186 | −1.9485 | −1.6121 | Yes |
| The IMF1 | $x_1$ | −7.8693 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $y_1$ | −7.6006 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $z_1$ | −8.3002 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| The IMF2 | $x_2$ | −10.8127 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $y_2$ | −14.1319 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $z_2$ | −14.2495 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| The IMF3 | $x_3$ | −14.6845 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $y_3$ | −10.0076 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| | $z_3$ | −8.6862 | (c, 0, 1) | −3.5812 | −2.9266 | −2.6014 | Yes |
| The IMF4 | $x_4$ | −26.8800 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |
| | $y_4$ | −23.9409 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |
| | $z_4$ | −26.7954 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |
| The residual | $x_5$ | −20.3586 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |
| | $y_5$ | −13.4521 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |
| | $z_5$ | −25.1841 | (c, t, 1) | −4.1706 | −3.5107 | −3.1855 | Yes |

*Note*: In the test type (c, t, k), c is the intercept item, t is the time trend term (t = 0 means no trend) and k is the optimal lag length.

The ADF values of the OTS of rainfall, runoff and sediment in the source area of the Yellow River are all larger than the critical values at the 1%, 5% and 10% levels, so the unit root hypothesis cannot be rejected and the series are non-stationary. Their first-order differences, however, are stationary, so the original series are $I(1)$. The CEEMDAN components are all stationary.
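
The decision rule behind the last column of Table 3 can be illustrated with the rainfall rows. The sketch below uses values copied from Table 3; the function name `is_stationary` is illustrative only.

```python
def is_stationary(adf_value, critical_value):
    """Reject the unit root hypothesis when the ADF value is below the critical value."""
    return adf_value < critical_value


# Critical values for the test types (0, 0, k), copied from Table 3
crit = {"1%": -2.6186, "5%": -1.9485, "10%": -1.6121}
x0_level = -0.5138  # ADF value of the original rainfall series
x0_diff = -7.4505   # ADF value of its first-order difference

level_result = {lvl: is_stationary(x0_level, c) for lvl, c in crit.items()}
diff_result = {lvl: is_stationary(x0_diff, c) for lvl, c in crit.items()}
print(level_result)  # all False: non-stationary at every level
print(diff_result)   # all True: stationary at every level
```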

#### Co-integration test

Table 4. ADF test results of the residual sequences

| Residual sequence | ADF value | Test type (c, t, k) | Critical value (1%) | Critical value (5%) | Critical value (10%) | Stationary or not |
|---|---|---|---|---|---|---|
| $u_0$ | −4.6011 | (c, 0, 1) | −3.6105 | −2.9390 | −2.6079 | Yes |
| $u_1$ | −6.8268 | (c, 0, 1) | −3.6105 | −2.9390 | −2.6079 | Yes |
| $u_2$ | −7.7600 | (c, 0, 1) | −3.6156 | −2.9411 | −2.6091 | Yes |
| $u_3$ | −6.1780 | (c, 0, 1) | −3.6156 | −2.9411 | −2.6091 | Yes |
| $u_4$ | −8.7148 | (c, 0, 1) | −3.6156 | −2.9411 | −2.6091 | Yes |
| $u_5$ | −5.5610 | (c, 0, 1) | −3.6156 | −2.9411 | −2.6091 | Yes |

In the formula, $u_t$ represents the residual sequence of the equation, and the values in brackets are the standard deviations of the corresponding coefficients of the equation.

#### Establishing ECM

In the formula, $ecm_t(-1)$ represents the error correction term; its coefficient is the short-period adjustment coefficient, and the coefficients of the difference terms of each variable represent the short-period dynamic changes of the model.

Rainfall, runoff and sediment in the source area of the Yellow River show a long-term equilibrium relationship, and so do the component time series at each time scale. The error correction coefficients of all equations are negative, which is consistent with the reverse correction mechanism. Equation (17) shows that runoff is affected not only by rainfall and sediment but also by the deviation of runoff from its equilibrium level in the previous year. The coefficients of $\Delta x_0$ and $\Delta y_0$ are 0.23665 and 0.03767, respectively, which indicates that the short-term influence of rainfall on runoff in the source area of the Yellow River is stronger than that of sediment. The coefficient of $ecm_t(-1)$ is −0.70057, which indicates that 70.06% of the deviation of runoff from equilibrium in a given year is corrected in the following year.
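
The adjustment mechanism can be made concrete with a small numerical sketch that uses the three coefficients quoted above. The constant term of the equation is not restated here, so it is set to zero as an explicit assumption, and the input values are hypothetical.

```python
# Coefficients quoted in the text for the original-sequence ECM
B_RAIN = 0.23665  # coefficient of the rainfall difference term
B_SED = 0.03767   # coefficient of the sediment difference term
PHI = -0.70057    # coefficient of the error correction term ecm(-1)


def delta_runoff(d_rain, d_sed, ecm_prev, const=0.0):
    """One-step change in runoff implied by the ECM; const = 0 is an assumption."""
    return const + B_RAIN * d_rain + B_SED * d_sed + PHI * ecm_prev


# Hypothetical case: rainfall and sediment unchanged, runoff 10 units above equilibrium
change = delta_runoff(0.0, 0.0, 10.0)
remaining = 10.0 + change
print(change, remaining)
```

With rainfall and sediment held fixed, about 70% of the 10-unit deviation is removed in one year and roughly 3 units remain, which is the reverse correction mechanism described above.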

#### Annual runoff forecast

The ECM-OTS and the ECM-CEEMDAN models of the annual runoff are established using the measured data series of rainfall, runoff and sediment from 1966 to 2005, and the forecast test is conducted with the measured data series from 2006 to 2013. Figure 7 compares the measured values with the fitted values of the two models. Figure 8 shows the relative error between the fitted and measured values of the two models. Table 6 shows the forecasted values and relative errors of the two models during the forecast period.

It can be seen from Figure 7 that both models can well describe the dynamic equilibrium relationship between rainfall, runoff and sediment in the source area of the Yellow River. Moreover, the accuracy of runoff fitting value of the ECM-CEEMDAN model is better than that of ECM-OTS.

Figure 8 shows that, from 1967 to 2005, the relative error of the ECM-CEEMDAN model exceeds 20% in only one year (28.11% in 2002), whereas that of the ECM-OTS model exceeds 20% in two years (20.83% in 1997 and 32.17% in 2002). The average relative error of the ECM-CEEMDAN model is 6.21%, which is 1.42 percentage points lower than the 7.63% of the ECM-OTS model. The ECM-CEEMDAN model therefore has the better fitting accuracy.

In the formula, *DC* is the deterministic coefficient; the remaining symbols denote the measured value, the forecasted value and the mean of the measured values; and *n* is the length of the sequence.

The accuracy of runoff forecast is divided into three grades according to the qualification rate or the deterministic coefficient, as shown in Table 5.

Table 5. Accuracy classes of the runoff forecast

| Accuracy class | A | B | C |
|---|---|---|---|
| Qualification rate, QR (%) | QR ≥ 85 | 85 > QR ≥ 70 | 70 > QR ≥ 60 |
| Deterministic coefficient, DC | DC > 0.9 | 0.9 ≥ DC > 0.7 | 0.7 > DC ≥ 0.5 |
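
The grading in Table 5 can be sketched as two small functions. The deterministic coefficient is written here in its usual form, one minus the ratio of the squared forecast errors to the squared deviations of the measured values from their mean; that form is an assumption, and the function names are illustrative.

```python
def deterministic_coefficient(measured, forecast):
    """Assumed form: 1 - sum((f - m)^2) / sum((m - mean(m))^2)."""
    mean_m = sum(measured) / len(measured)
    sse = sum((f - m) ** 2 for m, f in zip(measured, forecast))
    sst = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - sse / sst


def grade_by_qr(qr):
    """Accuracy class from the qualification rate (%), following Table 5."""
    if qr >= 85:
        return "A"
    if qr >= 70:
        return "B"
    if qr >= 60:
        return "C"
    return None  # below every class in Table 5


def grade_by_dc(dc):
    """Accuracy class from the deterministic coefficient, following Table 5."""
    if dc > 0.9:
        return "A"
    if 0.7 < dc <= 0.9:
        return "B"
    if 0.5 <= dc < 0.7:
        return "C"
    return None  # Table 5 as printed leaves DC = 0.7 and DC < 0.5 unassigned


print(grade_by_dc(0.901), grade_by_dc(0.842))  # A B
print(deterministic_coefficient([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
```

A perfect forecast gives a coefficient of 1, and a forecast no better than the mean of the measured values gives 0.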

Table 6 shows that, for the ECM-OTS model, the relative error of the forecasted runoff exceeds 20% only in 2009 among the 8 forecast years 2006–2013, so its qualification rate is 87.5%, reaching level A. Its relative error is less than 10% in 5 years, accounting for 62.5% of the forecast years. For the ECM-CEEMDAN model, the largest relative error, 18.81% in 2009, is still below 20%, so its qualification rate is 100%. Although the ECM-CEEMDAN model also has 5 years with a relative error of less than 10%, its relative errors tend to be smaller on the whole. Moreover, the average relative error of the ECM-CEEMDAN model is 8.59%, which is 2.70 percentage points lower than the 11.29% of the ECM-OTS model. This shows that the ECM-CEEMDAN model has a higher forecast accuracy than the ECM-OTS model.

Table 6. Forecasted values and relative errors of the two models during the forecast period

| Year | Measured value (10⁸ m³) | ECM-OTS predicted value (10⁸ m³) | ECM-OTS relative error (%) | ECM-CEEMDAN predicted value (10⁸ m³) | ECM-CEEMDAN relative error (%) |
|---|---|---|---|---|---|
| 2006 | 141.26 | 164.72 | 16.60 | 157.08 | 11.19 |
| 2007 | 189.04 | 177.12 | 6.31 | 181.34 | 4.07 |
| 2008 | 174.60 | 157.93 | 9.55 | 165.21 | 5.37 |
| 2009 | 263.48 | 197.06 | 25.21 | 213.92 | 18.81 |
| 2010 | 197.08 | 210.72 | 6.92 | 209.32 | 6.21 |
| 2011 | 211.21 | 198.11 | 6.20 | 193.63 | 8.32 |
| 2012 | 284.04 | 232.62 | 18.10 | 249.97 | 11.99 |
| 2013 | 194.64 | 191.92 | 1.40 | 189.24 | 2.77 |

Furthermore, in terms of the deterministic coefficient of the runoff forecast, the DC value of the ECM-OTS model is 0.842, which corresponds to level B, while the DC value of the ECM-CEEMDAN model is 0.901, reaching level A. This shows that the runoff forecast by the ECM-CEEMDAN model agrees more closely with the measured process.
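
The relative errors and qualification rates discussed above can be recomputed from the measured and predicted values in Table 6. The sketch below assumes that a forecast counts as qualified when its relative error does not exceed 20%, which is the threshold implied by the discussion; the function names are illustrative.

```python
# Measured and predicted annual runoff for 2006-2013 (10^8 m^3), copied from Table 6
measured = [141.26, 189.04, 174.60, 263.48, 197.08, 211.21, 284.04, 194.64]
ecm_ots = [164.72, 177.12, 157.93, 197.06, 210.72, 198.11, 232.62, 191.92]
ecm_ceemdan = [157.08, 181.34, 165.21, 213.92, 209.32, 193.63, 249.97, 189.24]


def relative_errors(measured, forecast):
    """Absolute relative error of each forecast, in percent."""
    return [abs(f - m) / m * 100.0 for m, f in zip(measured, forecast)]


def qualification_rate(errors, threshold=20.0):
    """Share of forecasts (%) whose relative error is within the threshold."""
    return sum(e <= threshold for e in errors) / len(errors) * 100.0


err_ots = relative_errors(measured, ecm_ots)
err_ceemdan = relative_errors(measured, ecm_ceemdan)
qr_ots = qualification_rate(err_ots)
qr_ceemdan = qualification_rate(err_ceemdan)
mean_ots = sum(err_ots) / len(err_ots)
mean_ceemdan = sum(err_ceemdan) / len(err_ceemdan)
print(qr_ots, qr_ceemdan)  # 87.5 100.0
print(round(mean_ots, 2), round(mean_ceemdan, 2))
```

The recomputed mean relative errors can differ from the reported 11.29% and 8.59% in the second decimal place, because the tabulated errors are already rounded.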

## CONCLUSION

- (1) The CEEMDAN method can reveal the periodic characteristics of rainfall, runoff and sediment on multiple time scales in the source area of the Yellow River. The three variables correlate well in the short and medium periods. In addition, runoff and sediment are well synchronized in the trend term, which reveals the law of periodic fluctuations of rainfall, runoff and sediment.

- (2) Based on co-integration theory and the ECM, the ECM-OTS model and the ECM-CEEMDAN model are established. They reveal the long-term equilibrium and short-term fluctuations of the original and component sequences of rainfall, runoff and sediment in the source area of the Yellow River, and can also effectively forecast the runoff in this area.

- (3) Both the ECM-OTS model and the ECM-CEEMDAN model describe well the dynamic equilibrium relationship between rainfall, runoff and sediment in the source area of the Yellow River. However, the relative errors of the ECM-CEEMDAN model in the forecast period are all less than 20%, its qualification rate reaches 100%, and its accuracy reaches level A. Compared with the ECM-OTS model, it forecasts more accurately and thus provides a new and more accurate runoff forecasting method.

Although the ECM-CEEMDAN model of rainfall, runoff and sediment has the higher prediction accuracy, in practice rainfall, runoff and sediment often show nonlinear relations that are affected by many other factors, such as the underlying surface. More effort is needed to reveal the nonlinear relations between rainfall, runoff and sediment. Nevertheless, the combination of the CEEMDAN method and co-integration theory in this paper provides a better way to reveal the internal periodic changes of hydrological variables, so it better reflects their actual characteristics and can be used for hydrological forecasting and water resources management.

## ACKNOWLEDGEMENTS

This research is supported by the National Key R&D Program of China (Grant No. 2018YFC0406501), Program for Innovative Talents (in Science and Technology) at University of Henan Province (Grant No. 18HASTIT014) and Foundation for University Youth Key Teacher of Henan Province (Grant No. 2017GGJS006).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.