## Abstract

With increasingly severe climate change and intensified human activities, non-stationary extreme runoff series are becoming more and more difficult to predict accurately. In this research, based on the ‘decomposition-prediction-reconstruction’ model, an instantaneous frequency distribution map was used to measure how well empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), complete ensemble empirical mode decomposition (CEEMD) and extreme-point symmetric mode decomposition (ESMD) deal with mode mixing; an appropriate prediction method was selected for each component to form a combined prediction model; and the advantages of the combined prediction model based on ESMD were compared and analyzed. The results are as follows: (1) ESMD can address the mode mixing problem of EMD; (2) particle swarm optimization-least squares support vector machine (PSO–LSSVM), the first-order autoregressive model AR(1) and random forest (RF) are suitable for the high-frequency components, the medium- and low-frequency components and the residual component R, respectively; (3) the results of the combined prediction model are better than those of the single ones; and (4) the prediction effect of the combined prediction model is the best under ESMD decomposition, and the prediction errors of the runoff extreme value sequence can be reduced by about 58–80% compared with the three other decomposition methods. Moreover, as demonstrated in this study, the combined prediction model based on ESMD can effectively predict the non-stationary extreme runoff series, while providing a reference for forecasting other non-stationary time series.

## INTRODUCTION

Accurate prediction of runoff extreme value series is of great scientific and practical significance for the planning and design of flood control, the prevention and mitigation of disasters, the protection of ecological environments and the sustainable development of economy and society. In recent years, however, due to climate change and human activities, runoff extreme series have become non-linear and non-stationary and thus more difficult to predict (Zhang *et al.* 2016b). Therefore, their accurate prediction has attracted wide attention from researchers.

So far, many new methods and technologies have been introduced to predict non-stationary time series. In particular, the idea of coupled decomposition-prediction-reconstruction has been widely recognized because it responds well to the random, periodic and trend terms of hydrological processes and predicts more accurately than traditional methods (Yu *et al.* 2018). In the decomposition-prediction-reconstruction model, decomposition is the premise and the key. Since Huang (1998) proposed the empirical mode decomposition (EMD) method, the EMD algorithm has been widely used in meteorology, acoustics, biology, seismology and mechanical vibration (Huang *et al.* 2003). However, because the EMD algorithm is empirical, it has certain shortcomings in applications. The main problem is mode mixing, which refers to the case in which a single component contains widely different characteristic time scales, or two adjacent components show similar time scales. Mode mixing means that the time-frequency spectra of the components are mixed and each component loses its uniqueness, so that the decomposition fails to separate the different signals. To solve this problem, many improved EMD-based decomposition methods have emerged, such as the ensemble empirical mode decomposition (EEMD) proposed by Huang *et al.* (1999), the complete ensemble empirical mode decomposition (CEEMD) by Yeh *et al.* (2010) and the extreme-point symmetric mode decomposition (ESMD) method by Wang & Li (2015). 
Among them, both EEMD and CEEMD add normally distributed white noise (i.e., a signal with a continuous and uniform spectrum) to the original signal several times, and use the uniform frequency distribution of the white noise to change the distribution of the signal's extreme values and thereby reduce mode mixing. The added white noise sequence, however, will ‘pollute’ the original signal, and if the parameters are not selected properly, not only will the mode mixing not be suppressed, but pseudo components will also appear in the decomposition results (Wang & Li 2013; Chen *et al.* 2015; Zhao *et al.* 2015). The ESMD method, on the other hand, uses internal extreme-point symmetric interpolation instead of external envelope interpolation, and introduces the concept of an optimal adaptive global curve to optimize the trend line of the decomposition and determine the optimal number of modes, thus overcoming the shortcomings of the former two methods (Wang & Li 2015).

To address the problem of mode mixing in EMD, many scholars have proposed improved methods based on EMD, and analyses have been carried out on how to express the degree of mode mixing. Modal decomposition aims to decompose the original sequence into a finite number of independent and representative modes. To find out whether there is mode mixing among components, it is necessary to analyze the independence of each component. To this end, existing studies usually measure the degree of mode mixing among components from three aspects: orthogonality index (IOO) (Cao *et al.* 2019), correlation coefficient (Wang & Zhang 2017) and error analysis (Wang *et al.* 2015). Nevertheless, these three methods are not fully applicable to characterizing the independence of each component, because orthogonality and linear independence are only applicable to modal functions with constant frequency and stable amplitude. Wang & Li (2015) argued that the independence of modes is mainly manifested in the instantaneous difference of frequencies, i.e., if the frequencies of the modes do not overlap at the same time, it can be concluded that no mode mixing exists; in such a case, a direct interpolation method can be used to obtain the instantaneous frequency distribution curve of each component.

Another key part of the decomposition-prediction-reconstruction model is prediction. In previous studies, various models have been coupled with decomposition methods for prediction, including the back-propagation neural network, the radial basis function (RBF) neural network, the autoregressive moving average model (ARMA), the autoregressive model (AR), the support vector machine (SVM) model and the grey model GM(1,1). For example, Zhang *et al.* (2016a) used an RBF neural network to predict the components of EEMD decomposition; and Zhao *et al.* (2017) tried to predict the runoff components of EMD decomposition in combination with the chaotic least squares SVM. Some scholars also believe that components with different frequency distributions are suited to different prediction methods, so that different prediction methods should be used for different components. For example, Zhao *et al.* (2014) considered the RBF neural network suitable for high-frequency component prediction, ARMA for low-frequency component prediction and the GM(1,1) model for the trend term. Yu *et al.* (2018) applied the AR model to low-frequency components and the RBF neural network to high-frequency components. Wang *et al.* (2010) applied the autoregressive model, the rank set pair prediction model and the polynomial fitting equation to the high-frequency, low-frequency and residual components of EMD decomposition, respectively. In summary, because the characteristics of modal functions differ between time series, there is no uniform choice of prediction method for the different components, and further exploration is needed. In this research, to improve the prediction accuracy, EMD, EEMD, CEEMD and ESMD were used to decompose the runoff extreme sequence; the instantaneous frequency distribution curve of the components was used to measure the effect of the four decomposition methods in dealing with mode mixing; and then the prediction method suitable for each component of the runoff extreme series was explored. 
Finally, a combined prediction model composed of the selected single prediction methods was used to predict and analyze the runoff extreme series. The main flow chart of this research is shown in Figure 1.

## MATERIALS AND METHODS

### Study area and data

The annual runoff extreme series measured at the Bajiahu hydrological station in the Jingou River Basin from 1957 to 2016 (a total of 60 years) were taken as the research object of this study. The Basin is located in Shawan County, Xinjiang (85°22′E–85°44′E, 43°55′N–44°28′N), on the north slope of the Tianshan Mountains, and covers a drainage area of about 2,626 km^{2} (Figure 2). The Jingou River is an inland river with mountain streams and sediments and a typical glacial snowmelt river, with a total annual runoff of about 3.83 × 10^{8} m^{3}. The intra-annual variation of runoff is large, but the inter-annual variation is relatively stable: the runoff from June to August accounts for 69.7% of the annual runoff. Due to climate change, together with the uneven distribution of runoff in time and space, floods and spring droughts caused by extreme runoff often occur in the basin.

### Data source

Based on the monthly runoff data, the runoff extreme series from 1957 to 2016 are selected as the first largest/smallest order statistic of the year, i.e., the maximum runoff series and minimum runoff series composed of the first largest monthly runoff and the first smallest monthly runoff, respectively, are selected for every year. The monthly runoff data comes from Shihezi Hydrological and Water Resources Survey Bureau and Planning Bureau of Water Resources Department of Xinjiang Uygur Autonomous Region.
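The selection rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `annual_extremes`, the dictionary layout and the demo values are hypothetical and are not the Bajiahu record.

```python
def annual_extremes(monthly_by_year):
    """Return (years, max_series, min_series) from monthly runoff values.

    monthly_by_year maps a year to the list of its monthly runoff values
    (a hypothetical layout chosen for this sketch).
    """
    years = sorted(monthly_by_year)
    # First largest order statistic of each year: the largest monthly runoff.
    max_series = [max(monthly_by_year[y]) for y in years]
    # First smallest order statistic of each year: the smallest monthly runoff.
    min_series = [min(monthly_by_year[y]) for y in years]
    return years, max_series, min_series


# Made-up numbers purely for illustration.
demo = {
    1957: [1.0, 1.2, 2.0, 4.5, 9.0, 20.0, 31.0, 27.0, 11.0, 5.0, 2.5, 1.4],
    1958: [0.9, 1.1, 1.8, 4.0, 8.5, 22.0, 29.0, 30.5, 12.0, 4.8, 2.2, 1.3],
}
years, max_series, min_series = annual_extremes(demo)
print(years, max_series, min_series)
```

Applied to 60 years of monthly data, the same rule would yield two series of 60 values each, one of annual maxima and one of annual minima.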

## METHODS

### ESMD method

Compared with traditional EMD decomposition methods, the ESMD method has unique advantages: it uses the pole symmetric midpoint for internal interpolation and obtains the appropriate number of interpolation curves based on different conditions; and the number of decomposition termination extreme points can be customized, which is conducive to obtaining the optimal global mean in the sense of least squares (Zhang 2018). The implementation steps of ESMD are as follows:

Step 1: Find all extreme points (maximum points and minimum points) of the data $Y(t)$ and record them as $E_i$ ($i = 1, 2, \ldots, n$).

Step 2: Connect the adjacent extreme points with line segments and record the midpoints of the line segments as $F_i$ ($i = 1, 2, \ldots, n-1$).

Step 3: Supplement the left and right boundary midpoints $F_0$ and $F_n$ by using the linear interpolation method.

Step 4: Use the obtained $n+1$ midpoints to construct $p$ interpolation lines $L_1, \ldots, L_p$ ($p \ge 1$) and calculate their mean curve $L^* = (L_1 + \cdots + L_p)/p$.

Step 5: Repeat the above steps for $Y(t) - L^*$ until $|L^*| \le \varepsilon$ ($\varepsilon$ is a preset allowable error) or the number of screening times reaches the preset maximum $K$; the first mode $M_1(t)$ is then obtained.

Step 6: Repeat the above steps for $Y(t) - M_1(t)$ to obtain $M_2(t), M_3(t), \ldots$ until the final residual $R(t)$ has only a certain number of extreme points.

Step 7: Let the maximum number of screening times $K$ vary over the integer interval $[K_{\min}, K_{\max}]$ and repeat the steps above to get a series of decomposition results, then calculate the variance ratio $\sigma/\sigma_0$ and plot its variation with $K$, where $\sigma$ and $\sigma_0$ are the relative standard deviation of $Y(t) - R(t)$ and the standard deviation of the original data $Y(t)$, respectively.

Step 8: Select from the interval $[K_{\min}, K_{\max}]$ the maximum number of screening times $K_0$ that corresponds to the minimum variance ratio $\sigma/\sigma_0$ (which means that $R(t)$ is the best fitting curve of the data), and repeat the first six steps to output the decomposition results.
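To make the bookkeeping in these steps more concrete, the following Python sketch covers only Steps 1, 2, 7 and 8: locating extreme points, forming the midpoints of adjacent extremes, and choosing the screening number with the smallest variance ratio. It is not a full ESMD implementation. The helper names and the callable `residual_for_k` (standing in for a complete decomposition that returns the residual for a given maximum screening number) are assumptions made for illustration, and flat segments are ignored.

```python
import math
import statistics


def extreme_points(y):
    """Step 1: indices of interior local maxima and minima of y."""
    return [i for i in range(1, len(y) - 1)
            if (y[i] - y[i - 1]) * (y[i + 1] - y[i]) < 0]


def midpoints(y, idx):
    """Step 2: (time, value) midpoints of segments joining adjacent extremes."""
    return [((a + b) / 2, (y[a] + y[b]) / 2) for a, b in zip(idx, idx[1:])]


def best_screening_number(y, residual_for_k, k_min, k_max):
    """Steps 7-8: pick the K in [k_min, k_max] with the smallest variance ratio.

    residual_for_k(K) is a hypothetical callable returning the residual
    sequence R obtained with maximum screening number K.
    """
    sigma0 = statistics.pstdev(y)  # standard deviation of the original data
    ratios = {}
    for k in range(k_min, k_max + 1):
        r = residual_for_k(k)
        # Root-mean-square deviation of the data from the residual curve.
        sigma = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, r)) / len(y))
        ratios[k] = sigma / sigma0
    k0 = min(ratios, key=ratios.get)
    return k0, ratios


y = [0.0, 2.0, 1.0, 3.0, 0.0]
idx = extreme_points(y)
mids = midpoints(y, idx)
# Toy stand-in for a decomposition: K = 2 returns the mean line, others zero.
k0, ratios = best_screening_number(
    y, lambda k: [1.2] * 5 if k == 2 else [0.0] * 5, 1, 3)
print(idx, mids, k0)
```

In a real decomposition, `residual_for_k` would run Steps 1 to 6 with the given screening limit; the toy callable here only exercises the selection logic.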

### Direct interpolation method

The direct interpolation method draws instantaneous frequency distribution maps by using the average frequencies of local periods as interpolation points to generate a smooth curve. The basic idea is as follows (Wang & Li 2015):

Step 1: Find the extreme point and calculate the time difference between the two adjacent maximum points and the adjacent minimum points.

Step 2: Regard the time period obtained in Step 1 as a local period and assign it to a point, and then draw the time-period correspondence graph.

Step 3: Reciprocate the local periodic values to obtain local frequencies, and then use the cubic spline interpolation to obtain smooth time-frequency curves (if there is an equivalent segment in the modal, its frequency is directly defined as zero).
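A minimal Python sketch of Steps 1 to 3 is given below, stopping before the spline. It assumes that each local period is assigned to the midpoint in time of the two extremes that define it (the steps above only say "a point") and it skips the special case of flat segments. The function name `local_frequencies` is hypothetical. The resulting points could then be passed to a cubic spline routine such as `scipy.interpolate.CubicSpline` to obtain the smooth time-frequency curve.

```python
def local_frequencies(t, x):
    """Return sorted (time, local frequency) points for a mode x sampled at t."""
    inner = range(1, len(x) - 1)
    maxima = [i for i in inner if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in inner if x[i - 1] > x[i] < x[i + 1]]
    points = []
    for extremes in (maxima, minima):
        for a, b in zip(extremes, extremes[1:]):
            period = t[b] - t[a]          # local period between like extremes
            centre = (t[a] + t[b]) / 2    # assumed location of that period
            points.append((centre, 1.0 / period))
    return sorted(points)


# A mode that repeats every 4 time units has local frequency 0.25 throughout.
t = list(range(13))
x = [0, 1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0]
freq_points = local_frequencies(t, x)
print(freq_points)
```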

### Combined prediction model

Due to inconsistent frequency distribution and complexity of each component after modal decomposition, there will be a large error in using the same prediction method to predict each component. First, three prediction methods, i.e., Particle swarm optimization–least squares support vector machine (PSO–LSSVM), random forest (RF) and AR(1), were used to predict and analyze the components of different frequencies. Then, the root mean square error (RMSE), the mean absolute percentage error (MAPE) and the mean absolute error (MAE) were used to determine the suitable prediction methods for different components. Finally, each component was predicted according to its most suitable prediction method, forming a combined prediction model.
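For reference, the three error measures and the reconstruction step can be written as follows. This is a sketch under assumptions: MAPE is expressed as a fraction rather than a percentage, observations are assumed to be non-zero, and the final forecast is taken to be the sum of the component forecasts, as is usual for EMD-type decompositions. The function names are illustrative.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error as a fraction (obs must be non-zero)."""
    return sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def reconstruct(component_forecasts):
    """Sum the forecasts of all components at each time step."""
    return [sum(values) for values in zip(*component_forecasts)]


obs, pred = [2.0, 4.0], [1.0, 5.0]
print(rmse(obs, pred), mae(obs, pred), mape(obs, pred))
print(reconstruct([[1.0, 2.0], [0.5, 0.5]]))
```

In the combined model, each inner list passed to `reconstruct` would come from whichever single method gave the smallest errors for that component.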

## RESULTS

### Decomposition of maximum runoff series

EMD, EEMD, CEEMD and ESMD decomposition methods were used to decompose the maximum series of runoff in the Jingou River Basin from 1957 to 2016. The decomposition results are shown in Figure 3, which demonstrates that each of the four methods decomposed the maximum series of runoff into a series of intrinsic mode functions (IMFs), ordered from high to low frequency, and a trend R. The EMD, EEMD and CEEMD decomposition results had five modes (IMF1–IMF5), whereas the ESMD decomposition results had three modes (IMF1–IMF3), i.e., ESMD produced fewer modes than EMD, EEMD and CEEMD. The trend R of all four decomposition methods reflects the weak increasing trend of the runoff maximum sequence, but the R of the EMD, EEMD and CEEMD decomposition methods has at most one extreme point. Such a trend function can only reflect the global change of maximum runoff to a certain extent. By contrast, the R obtained by the ESMD decomposition method has multiple extreme points, so it can better reflect the overall trend of the maximum runoff sequence.

To understand the degree of mode mixing of decomposition results more intuitively, the direct interpolation method was used to draw the frequency distribution of decomposition results, as shown in Figure 4, which demonstrates that the frequency distribution curves of IMFs overlap (i.e. the frequency crossover between adjacent modes at the same time) in the EMD decomposition results, indicating that there is a mixing problem among the modes. The EEMD decomposition method alleviates the mode mixing problem of EMD to some extent. There is no frequency crossover between IMF1 and IMF2 or between IMF4 and IMF5, but the frequency distribution curve of IMF3 has intersection with the frequency distribution of the other four components. The frequency cross degree of each mode in CEEMD decomposition method is smaller than that in EEMD decomposition method. Except for three cross points between IMF4 and IMF5, the frequency distribution curves of the other modes are independent of each other. In the results of the ESMD decomposition method, the frequency distribution curves of the three modes do not cross with each other, showing that the maximum sequence of runoff is fully decomposed by the ESMD decomposition method.
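One simple way to quantify the frequency crossover read off such a map is to count how often the difference between two frequency curves changes sign. The sketch below assumes both curves have already been sampled on the same time grid; the function name `count_crossings` and the sample values are hypothetical and are not taken from Figure 4.

```python
def count_crossings(freq_a, freq_b):
    """Count sign changes of freq_a - freq_b on a common time grid."""
    diffs = [a - b for a, b in zip(freq_a, freq_b)]
    # Ignore exact ties so that touching without crossing is not counted.
    signs = [d > 0 for d in diffs if d != 0]
    return sum(1 for s, nxt in zip(signs, signs[1:]) if s != nxt)


separated = count_crossings([0.5, 0.4, 0.3], [0.2, 0.2, 0.2])
mixed = count_crossings([0.5, 0.3, 0.1], [0.2, 0.2, 0.2])
print(separated, mixed)
```

A count of zero for every pair of modes corresponds to the situation described for ESMD, where the frequency curves do not cross.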

Combining the modal frequency distribution maps of each decomposition method shows that in EMD decomposition, the degree of mode mixing is large and the number of true components needs to be determined a second time, so the decomposition is not efficient. EEMD can alleviate the mode mixing of EMD to some extent, but the amplitude of the added white noise signal needs to be determined before decomposition. If the amplitude of the white noise is too small, the mode mixing will not be improved; on the other hand, if the amplitude of the white noise is too large, the original signal will be polluted, the signal-to-noise ratio of the decomposition results will be reduced and the integrity of the decomposition will be poor. Although CEEMD overcomes the problem of residual white noise in EEMD by adding n pairs of noise with opposite signs and the same amplitude to the original data, CEEMD is otherwise similar to EEMD: the noise amplitude also needs to be determined before decomposition. In fact, the CEEMD algorithm executes EEMD twice, and thus the operation amount is doubled. ESMD is an improvement on EMD that uses internal symmetric interpolation instead of external envelope interpolation, and uses the idea of least squares to optimize the final residual mode. From the frequency distribution map, it can be seen that ESMD effectively solves the problem of mode mixing (or frequency crossover) in EMD decomposition, and it can be used as a feasible method for analyzing various time series signals.

### Prediction analysis

#### Selection of appropriate forecasting methods for each component

On the basis of the variation characteristics, frequency and amplitude of each component in the decomposition results, the single prediction methods suitable for each component were selected to construct the combined prediction model. The errors of IMF1–IMF3 and trend R decomposed by ESMD under different prediction methods are listed in Table 1.

As seen in Table 1, for IMF1 (the high-frequency modal component), which fluctuates strongly and has complex characteristics, the highly adaptable PSO–LSSVM method gives the smallest RMSE and MAPE. The periodicity of IMF2–IMF3 (the medium- and low-frequency modal components) is obvious, and the errors of the AR(1) prediction model were smaller than those of the other two methods, indicating that AR(1) is more suitable for the prediction of medium- and low-frequency modal components. The error of trend R is the smallest under the RF prediction method, demonstrating that for the relatively flat trend R, the RF prediction method has the best prediction effect. Based on the above analyses, the PSO–LSSVM prediction method was selected in this research to predict high-frequency components, the AR(1) prediction model to predict medium- and low-frequency components, and the RF prediction method to predict trend R.

Table 1. Errors of IMF1–IMF3 and trend R decomposed by ESMD under different prediction methods

| Prediction method | IMF1 RMSE | IMF1 MAPE | IMF1 MAE | IMF2 RMSE | IMF2 MAPE | IMF2 MAE | IMF3 RMSE | IMF3 MAPE | IMF3 MAE | R RMSE | R MAPE | R MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 0.199 | 0.816 | 0.153 | 0.022 | 0.605 | 0.017 | 0.072 | 0.596 | 0.063 | 0.105 | 0.096 | 0.104 |
| PSO–LSSVM | 0.107 | 0.793 | 0.182 | 0.091 | 3.191 | 0.080 | 0.123 | 1.295 | 0.121 | 0.132 | 0.120 | 0.130 |
| AR(1) | 0.177 | 0.930 | 0.156 | 0.011 | 0.318 | 0.009 | 0.017 | 0.174 | 0.016 | 1.000 | 0.927 | 1.000 |
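The selection can be reproduced from the Table 1 values with a short script. The rule used here, choosing for each component the method that is best on the largest number of the three error measures, is an assumption made for this sketch; the text does not spell out how the three measures are weighed against each other.

```python
from collections import Counter

# Table 1 errors as (RMSE, MAPE, MAE) for each component and method.
TABLE_1 = {
    "IMF1": {"RF": (0.199, 0.816, 0.153),
             "PSO-LSSVM": (0.107, 0.793, 0.182),
             "AR(1)": (0.177, 0.930, 0.156)},
    "IMF2": {"RF": (0.022, 0.605, 0.017),
             "PSO-LSSVM": (0.091, 3.191, 0.080),
             "AR(1)": (0.011, 0.318, 0.009)},
    "IMF3": {"RF": (0.072, 0.596, 0.063),
             "PSO-LSSVM": (0.123, 1.295, 0.121),
             "AR(1)": (0.017, 0.174, 0.016)},
    "R": {"RF": (0.105, 0.096, 0.104),
          "PSO-LSSVM": (0.132, 0.120, 0.130),
          "AR(1)": (1.000, 0.927, 1.000)},
}


def best_method(errors):
    """Return the method that wins the most of the three error measures."""
    votes = Counter()
    for k in range(3):
        votes[min(errors, key=lambda m: errors[m][k])] += 1
    return votes.most_common(1)[0][0]


selection = {comp: best_method(errs) for comp, errs in TABLE_1.items()}
print(selection)
```

Under this rule the script selects PSO–LSSVM for IMF1, AR(1) for IMF2 and IMF3, and RF for R, which matches the choice made above. For IMF1 the vote is two measures to one, since RF has the smaller MAE.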


#### Comparison between combined prediction method and single prediction method

To verify the predictive performance of the combined prediction model, the components of the maximum runoff series in the Jingou River Basin under the different decomposition methods were predicted by the single prediction methods and by the combined prediction model, with the results shown in Figure 5 and Table 2. A joint analysis of the forecasting results of the four forecasting methods in Figure 5 shows that, in individual years, a single prediction model performed better than the combined prediction model. This may be because the proportion of each component in the maximum runoff differs from year to year, and different prediction methods perform differently on components with different characteristics, so the prediction accuracy of a single prediction method varies between years. Although the combined model may weaken the prediction effect of a single prediction model for individual years, it greatly improves the prediction for the years in which the single prediction model performs poorly. Therefore, from the perspective of the overall fit to the original sequence, the prediction effect of the combined model is better than that of the single prediction methods.

Table 2. Prediction errors of the single prediction methods and the combined prediction model for the maximum runoff series under different decomposition methods

| Prediction method | EMD RMSE | EMD MAPE | EMD MAE | EEMD RMSE | EEMD MAPE | EEMD MAE | CEEMD RMSE | CEEMD MAPE | CEEMD MAE | ESMD RMSE | ESMD MAPE | ESMD MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 0.277 | 0.258 | 0.242 | 0.271 | 0.230 | 0.227 | 0.237 | 0.209 | 0.199 | 0.238 | 0.217 | 0.210 |
| PSO–LSSVM | 0.213 | 0.188 | 0.194 | 0.239 | 0.201 | 0.200 | 0.241 | 0.144 | 0.160 | 0.213 | 0.184 | 0.176 |
| AR(1) | 0.385 | 0.394 | 0.347 | 0.346 | 0.304 | 0.242 | 0.250 | 0.247 | 0.220 | 0.161 | 0.162 | 0.139 |
| Combination forecasting | 0.216 | 0.210 | 0.174 | 0.191 | 0.170 | 0.136 | 0.162 | 0.144 | 0.114 | 0.030 | 0.026 | 0.021 |


Table 2 shows that, among the single prediction methods, the error of the PSO–LSSVM method is generally the smallest for the three decomposition methods EMD, EEMD and CEEMD, and that of the AR(1) model is the smallest in the component prediction of ESMD. This shows that the high-frequency components account for a large proportion in the decomposition results of EMD, EEMD and CEEMD, while the trend R accounts for a large proportion in the ESMD decomposition results. It also confirms that each single prediction method has its own advantages in predicting components with different frequency distributions. In addition, except for the EMD decomposition method, the error analysis results of the other three decomposition methods show that the prediction error of the combined model is less than that of the single prediction methods, demonstrating that the combined model not only retains the advantages of the single prediction methods but also compensates for their shortcomings and improves on their prediction accuracy. As for the prediction errors of the EMD components, the RMSE and MAPE of the PSO–LSSVM prediction method are the smallest, indicating that the high-frequency components account for a large proportion of the EMD decomposition results. This is consistent with the observation that IMF2–IMF5 and R in the EMD decomposition overlap with high-frequency components, as shown in Figure 4.

#### Prediction results of combined methods with different decomposition methods

To verify the decomposition effect of the four decomposition methods, the prediction results of the combined prediction model under the four decomposition methods were compiled, as shown in Figure 6 and Table 3. As can be seen from Figure 6, the combined prediction model under the ESMD decomposition method fits the original sequence best, followed by the CEEMD and EEMD decomposition methods, with the EMD decomposition method rendering the worst prediction effect. The error data in Table 3 also show that the prediction error of the ESMD decomposition method is much smaller than that of the other three decomposition methods. The prediction errors of the combined prediction model under the four decomposition methods can be ordered from large to small as EMD > EEMD > CEEMD > ESMD. This result is consistent with the order of the decomposition effect of the decomposition methods mentioned above. It shows that sufficient decomposition of the original sequence is an important prerequisite for accurate prediction, and it also verifies the advantages of the ESMD decomposition method in dealing with mode mixing. In addition, the three kinds of prediction errors (RMSE, MAPE and MAE) under ESMD decomposition were 0.030, 0.026 and 0.021, which were 81–88% less than those of the other three decomposition methods, meeting the accuracy requirements of runoff prediction. It also shows that the combined prediction approach under ESMD decomposition can effectively improve the prediction accuracy.

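Table 3. Prediction errors of the combined prediction model for the maximum runoff series under the four decomposition methods

| Methods | RMSE | MAPE | MAE |
|---|---|---|---|
| EMD | 0.216 | 0.210 | 0.174 |
| EEMD | 0.191 | 0.170 | 0.136 |
| CEEMD | 0.162 | 0.145 | 0.115 |
| ESMD | 0.030 | 0.026 | 0.021 |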
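The reported percentage reductions can be checked directly against the Table 3 values. The short script below is only an arithmetic check; it computes, for each of the other three methods and each error measure, the relative reduction achieved by ESMD as 100 × (e − e_ESMD) / e.

```python
# Table 3 errors as (RMSE, MAPE, MAE) for the combined prediction model.
TABLE_3 = {
    "EMD": (0.216, 0.210, 0.174),
    "EEMD": (0.191, 0.170, 0.136),
    "CEEMD": (0.162, 0.145, 0.115),
    "ESMD": (0.030, 0.026, 0.021),
}


def reductions(table, best="ESMD"):
    """Percentage by which `best` lowers each error relative to each method."""
    out = {}
    for method, errors in table.items():
        if method != best:
            out[method] = tuple(100 * (e - b) / e
                                for e, b in zip(errors, table[best]))
    return out


result = reductions(TABLE_3)
all_values = [v for values in result.values() for v in values]
for method, values in result.items():
    print(method, [round(v, 1) for v in values])
print(round(min(all_values), 1), round(max(all_values), 1))
```

The smallest reduction is about 81.5% (RMSE relative to CEEMD) and the largest about 87.9% (MAE relative to EMD), in line with the 81–88% range quoted above.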


### Verification analysis

To verify the stability of the ESMD decomposition method and combined prediction model, the minimum sequence of the Jingou River Basin runoff was decomposed using EMD, EEMD, CEEMD and ESMD methods. The decomposition results are shown in Figure 7. Similar to the decomposition results of maximum runoff sequence, EMD, EEMD and CEEMD methods can decompose minimum runoff sequence into five components and trend R, while the ESMD method can decompose minimum runoff sequence into three components and trend term R.

The direct interpolation method was used to plot the instantaneous frequency distribution of the minimum runoff sequence components under the four decomposition methods, and the results are shown in Figure 8. The frequency curves of the components of the minimum runoff sequence cross more often under EMD decomposition than under EEMD and CEEMD decomposition, and there is no cross point on the frequency distribution curves of the components under ESMD decomposition. As with the maximum runoff sequence, the ESMD decomposition method avoids mode mixing in the decomposition of the minimum runoff sequence.

The combined prediction model was applied to forecast each component of the Jingou River's minimum runoff series. A comparison with the forecasting results of the single prediction models (Table 4) shows that the combined prediction model has the smallest error and the best prediction effect under all four decomposition methods. The components under the different decomposition methods were then predicted using the combined model, with the results shown in Figure 9 and Table 5, which are similar to those of the maximum runoff series. When the combined prediction model was used, the prediction results under the ESMD decomposition method were closer to the original series than those under the other decomposition methods. In terms of MAPE and MAE, the prediction errors under the four decomposition methods rank as EMD > EEMD > CEEMD > ESMD, while the RMSE values of EMD, EEMD and CEEMD are nearly equal. The three prediction errors of ESMD were 0.003, 0.045 and 0.002, which were 54–63% less than those of the other three methods.

Table 4. Prediction errors of the single prediction models and the combined prediction model for the minimum runoff series

| Methods | Errors | RF | PSO–LSSVM | AR(1) | Combination forecasting |
|---|---|---|---|---|---|
| EMD | RMSE | 0.012 | 0.013 | 0.014 | 0.007 |
| EMD | MAPE | 0.217 | 0.154 | 0.223 | 0.123 |
| EMD | MAE | 0.012 | 0.009 | 0.013 | 0.006 |
| EEMD | RMSE | 0.016 | 0.013 | 0.021 | 0.007 |
| EEMD | MAPE | 0.249 | 0.139 | 0.349 | 0.115 |
| EEMD | MAE | 0.014 | 0.008 | 0.019 | 0.006 |
| CEEMD | RMSE | 0.013 | 0.015 | 0.011 | 0.007 |
| CEEMD | MAPE | 0.145 | 0.160 | 0.127 | 0.100 |
| CEEMD | MAE | 0.009 | 0.010 | 0.008 | 0.006 |
| ESMD | RMSE | 0.013 | 0.013 | 0.016 | 0.003 |
| ESMD | MAPE | 0.245 | 0.166 | 0.269 | 0.045 |
| ESMD | MAE | 0.013 | 0.010 | 0.014 | 0.002 |


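Table 5. Prediction errors of the combined prediction model for the minimum runoff series under the four decomposition methods

| Methods | RMSE | MAPE | MAE |
|---|---|---|---|
| EMD | 0.0069 | 0.1233 | 0.0065 |
| EEMD | 0.0068 | 0.1145 | 0.0062 |
| CEEMD | 0.0070 | 0.0996 | 0.0057 |
| ESMD | 0.0029 | 0.0453 | 0.0024 |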


In summary, compared with other decomposition methods, the ESMD decomposition method avoids the mode mixing problem, and the combined prediction model based on ESMD can deliver a better performance and meet the prediction requirements.

## CONCLUSIONS

Given that most of the existing decomposition methods are prone to mode mixing, the time-frequency distribution of components was drawn in this research using the direct interpolation method to directly judge the degree of mode mixing in the decomposition methods. The decomposition effects of the four decomposition methods, EMD, EEMD, CEEMD and ESMD, were analyzed. Based on the variation characteristics of each component, a combined prediction model was proposed and compared to the single prediction models. The main conclusions include the following:

- (1) Based on the time-frequency distribution of components, mode mixing is most serious for EMD among the four decomposition methods. The EEMD and CEEMD decomposition methods can alleviate the problem of mode mixing to some extent, but the effect is unstable. Compared with the EMD, EEMD and CEEMD decomposition methods, the ESMD decomposition method avoids mode mixing and also decomposes the time series more completely.

- (2) Three prediction methods, PSO–LSSVM, RF and AR(1), were used to predict each component. The results show that PSO–LSSVM is suitable for predicting high-frequency components, AR(1) for predicting medium- and low-frequency components, and RF for predicting trend R.

- (3) Compared with the single prediction models, the combined prediction model has the smallest error in each component-predicting process (except for the EMD decomposition method for the maximum runoff sequence), and its prediction performance is superior to that of the single prediction models.

- (4) The ESMD decomposition method generates the best prediction results and the smallest errors. Under ESMD decomposition, the prediction error of the maximum runoff sequence can be reduced by more than 80% compared with the other three decomposition methods, and the prediction error of the minimum runoff sequence can be reduced by more than 54%.

In conclusion, the combined prediction model based on the ESMD decomposition method can effectively predict the extreme value series of non-stationary runoff under changing environment, improve the accuracy of forecasting results and provide an effective reference for flood control and disaster reduction measures to be formulated for river basin projects.

## FUNDING SUPPORT

1. Funded by the China National Natural Science Foundation (project no. 51569032).

2. The Key Discipline Research Project of Water Conservancy Engineering of Xinjiang Agricultural University (grant no. SLXK-YJS-2018-07).