Abstract
In runoff prediction, accuracy is often limited by the non-linear and non-stationary characteristics of the runoff series. This study proposes a coupled forecasting model that decomposes the original runoff series with the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) combined with wavelet decomposition (WD), and then forecasts the monthly runoff using a support vector machine (SVM) optimized by the seagull optimization algorithm (SOA). First, ICEEMDAN decomposes the original runoff series into a set of Intrinsic Mode Functions (IMFs) and a Residual (Res). WD is then applied as a quadratic (secondary) decomposition of the high-frequency components produced by ICEEMDAN so that the runoff series becomes as smooth as possible. The decomposed components are input into the SOA-SVM model for prediction, and the prediction results of all components are superimposed and reconstructed to obtain the final monthly runoff prediction. Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), the Nash-Sutcliffe Efficiency Coefficient (NSEC) and the correlation coefficient (R) are selected to evaluate the prediction results, and the model is compared with the SOA-SVM, EMD-SOA-SVM and CEEMDAN-SOA-SVM models, where EMD denotes Empirical Mode Decomposition. The proposed model is applied to the monthly runoff forecast of the Hongjiadu and Manwan Reservoirs. Compared with the benchmark models, the ICEEMDAN-WD-SOA-SVM model attains the smallest RMSE and MAPE and the largest NSEC and R, that is, the highest prediction accuracy and the lowest prediction error.
HIGHLIGHTS
The ICEEMDAN–WD model is used to decompose the original runoff series.
The proposed ICEEMDAN–WD model can effectively reduce the complexity of the runoff series.
The proposed SOA–SVM model can effectively improve the prediction accuracy of runoff series.
The proposed model can provide high prediction accuracy and consistency.
ACRONYMS LIST
- ANN:
Artificial Neural Network
- ARMA:
Auto-Regressive and Moving Average Model
- EMD:
Empirical Mode Decomposition
- EEMD:
Ensemble Empirical Mode Decomposition
- ELM:
Extreme Learning Machine
- GBRT:
Gradient Boosting Regression Tree
- IMF:
Intrinsic Mode Function
- LSSVM:
Least Squares Support Vector Machine
- LSTM:
Long Short-Term Memory
- MAPE:
Mean Absolute Percentage Error
- NSEC:
Nash-Sutcliffe Efficiency Coefficient
- R:
correlation coefficient
- Res:
Residual
- RMSE:
Root Mean Square Error
- VMD:
Variational Mode Decomposition
INTRODUCTION
Runoff results from the combined influence of the environment, including the climatic factors and human activities in a basin, and it changes as climate change and human activities alter that environment (Yong et al. 2017; Luo et al. 2019; Shao et al. 2021). Because fluctuations in runoff are mainly driven by environmental change, runoff prediction must meet higher standards and requirements. Runoff prediction and the prevention of natural disasters caused by climate change should be treated as key engineering measures. Predictions need to account for the actual regional environmental factors and produce accurate, stable and effective results (Wang et al. 2021). Accurate runoff forecasts are essential for water supply, flood control, drought relief, hydropower and shipping (Fang et al. 2019; Xu et al. 2022). Owing to the influence of climate change, landforms, geographical location, human activities and other environmental factors, runoff series are often non-linear, non-stationary, complex, uncertain and multi-scale. At present, no model describes their evolution accurately. Careful analysis of the runoff sequence is therefore needed to further improve the accuracy of runoff prediction models (Zhao & Chen 2015). Traditional runoff prediction methods include empirical correlation, mathematical statistics, probability theory and genetic analysis. With the continuous progress of computer technology and algorithms, many new prediction methods have gradually been adopted.
Examples include the artificial neural network, support vector machine (SVM), relevance vector machine, deep recursive neural network, long short-term memory, least squares SVM, Elman neural network, data-augmented neural network model, extreme learning machine, graph neural network and auto-regressive integrated moving average model (Okkan & Serbes 2012; Jajarmizadeh et al. 2014; Wang et al. 2015b; Niu et al. 2018; Yuan et al. 2018; Büyükşahin & Ertekin 2019; Li et al. 2019; Ruiming 2019; Bi et al. 2020; Zhang et al. 2021; Liu et al. 2022). There are also new hybrid prediction methods, such as the machine learning (ML) model developed by combining SVM, Artificial Neural Network (ANN) and Long Short-Term Memory (LSTM) (Essam et al. 2022b). Among these methods, SVM is widely used. It has been shown to perform well in regression and time series prediction and to handle practical problems such as small samples, non-linearity and local minima. It is also considered able to replace auto-regressive and moving average models (Thissen et al. 2003). When the model is applied, parameter setting is particularly important, because different parameter values lead to different prediction results. Selecting a suitable algorithm and parameter tuning process is therefore crucial for a machine learning problem to achieve the expected results. In previous studies, some scholars have used Bayesian and forest-based algorithms to optimize the parameters of three neural networks (Chong et al. 2022b). SVM parameters also need to be optimized. For example, Xing et al. (2016) used the bat algorithm to search for the optimal learning parameters of SVM, which improved runoff prediction and met its accuracy requirements.
The characteristics of the runoff process, such as non-linearity, non-stationarity, complexity, uncertainty and multi-scale behavior, make runoff difficult to predict. Decomposition technology can split the runoff series into a set of relatively stable components, and it has been shown to improve the accuracy of runoff prediction effectively. Many scholars have therefore combined decomposition technology with a runoff prediction model to form the ‘decomposition-prediction-reconstruction’ method (Wang et al. 2013, 2015a). The basic idea is as follows: first, the runoff series is divided into several stable components by decomposition technology. The model is then used to predict each decomposed component. Finally, the prediction results of the components are combined to obtain the prediction of the whole runoff series (Ji et al. 2021). For example, He et al. (2020) coupled variational mode decomposition with a gradient boosting regression tree model. First, Variational Mode Decomposition (VMD) was used to decompose the original monthly runoff sequence into several sub-sequences. Then, the optimal number of input variables was selected and the Gradient Boosting Regression Tree (GBRT) model was used for prediction. Finally, the prediction results of the sub-sequences were aggregated to obtain the ensemble prediction. The model was trained and tested with 50 years of runoff data from the Weihe River Basin, and the results showed that it produced more accurate runoff predictions. Zhao et al. (2017) established a prediction model coupling empirical mode decomposition with chaotic least squares SVM to predict the annual runoff of four hydrological stations in the upper reaches of the Fenhe River Basin. First, the original annual runoff sequence was decomposed into a finite number of intrinsic mode functions (IMFs) and a trend term to make the sequence stationary.
Then, if an IMF component had chaotic characteristics, the Least Squares Support Vector Machine (LSSVM) was used for prediction; if it did not, a polynomial method was used for simulation. In addition, the gray model was used to predict the remaining trend term. Finally, by combining the prediction results of the IMFs and the trend term, runoff predictions with smaller error and higher precision were obtained. Niu et al. (2019) analyzed the changing trend of runoff at the Three Gorges Hydrological Station during the past 50 years. Their method combined Ensemble Empirical Mode Decomposition (EEMD) with the Extreme Learning Machine (ELM) model and used an improved gravitational search algorithm (IGSA) to optimize the ELM. EEMD was first used to decompose the original runoff data into a finite number of sub-sequences and residuals. The ELM model was then used to predict the sub-sequences and residuals, with IGSA (based on an elite guided evolution strategy, a selection operator and a mutation operator) optimizing the parameters of the ELM model. Finally, all forecasting results were summed to obtain the final forecast, which further improved the precision of runoff prediction. These studies indicate that, in runoff prediction, a hybrid model generally outperforms a single prediction model (Zhao & Chen 2015; Niu et al. 2019; Liu et al. 2022; Zhang et al. 2022).
This paper adopts the ‘decomposition-prediction-reconstruction’ approach and differs from most studies in two respects. The first is the decomposition part, which combines the improved complete ensemble EMD with adaptive noise (ICEEMDAN) and wavelet decomposition (WD) to form a quadratic (secondary) decomposition. The second is the prediction part, in which the seagull optimization algorithm (SOA) is used to optimize SVM, forming an SOA–SVM prediction model that, to the authors' knowledge, has not been used before. The model is verified on the monthly runoff prediction of the Hongjiadu Reservoir on the Wujiang River and the Manwan Reservoir on the Lancang River. Results show that the hybrid model can effectively improve the accuracy of monthly runoff prediction and offers a novel method of runoff prediction. All methods and technologies adopted in this paper are implemented in MATLAB.
The rest of the paper is arranged as follows: Section 2 introduces the basic theories and algorithms of ICEEMDAN, WD, SOA and SVM and describes the construction of the runoff prediction model. Section 3 outlines four performance evaluation indexes. Section 4 introduces the study area and data and then describes, compares, analyzes and discusses the runoff prediction results. Finally, Section 5 states the conclusions.
METHODOLOGY
ICEEMDAN
ICEEMDAN, proposed by Colominas et al. (2014), is an improved version of complete ensemble empirical mode decomposition with adaptive noise. It mainly addresses the problems of residual noise and spurious modes. The specific computation steps are given in Equations (1)–(7):
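- (1)
Following Colominas et al. (2014), white noise is added to the original series $x$ to construct $I$ reconstructed signals: $x^{(i)} = x + \beta_0 E_1(w^{(i)})$, $i = 1, \ldots, I$ (1), where $w^{(i)}$ is the $i$-th realization of zero-mean, unit-variance Gaussian white noise, $E_k(\cdot)$ is the operator that extracts the $k$-th mode obtained by EMD and $\beta_0$ is the noise amplitude coefficient.
- (2)
The EMD algorithm is used to obtain the local mean of each reconstructed signal; the first residual is obtained by averaging these local means and is then used to compute the first IMF: $r_1 = \langle M(x^{(i)}) \rangle$ (2), $d_1 = x - r_1$ (3), where $r_1$ is the residual of the first decomposition, $M(\cdot)$ is the operator that computes the local mean, $\langle \cdot \rangle$ denotes averaging over the $I$ realizations and $d_1$ is the first IMF.
- (3)
The second residual and the second IMF are computed in the same way: $r_2 = \langle M(r_1 + \beta_1 E_2(w^{(i)})) \rangle$ (4), $d_2 = r_1 - r_2$ (5).
- (4)
For $k = 3, \ldots, K$, the $k$-th residual and the $k$-th IMF are computed as $r_k = \langle M(r_{k-1} + \beta_{k-1} E_k(w^{(i)})) \rangle$ (6), $d_k = r_{k-1} - r_k$ (7), and the procedure is repeated until the residual can no longer be decomposed.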
WD quadratic decomposition
A wavelet function is defined by a specific set of wavelet filter coefficients, so once the wavelet function is selected, the corresponding filter coefficients are known. In this paper, the db4 wavelet basis function is used to further decompose the high-frequency components obtained by ICEEMDAN and to extract the trend sequence.
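As a minimal illustration of this secondary decomposition, the sketch below splits a series into three detail components and one approximation component (named d1, d2, d3 and a3 here) using db4 filter coefficients as commonly tabulated in wavelet libraries. The three-level depth, the periodic boundary handling, the requirement that the length be divisible by 2^levels and all function names are assumptions made for this example; the paper's MATLAB implementation may differ in each of these choices.

```python
import math

# db4 low-pass decomposition filter (8 taps), as commonly tabulated.
DB4_LO = [
    -0.010597401785069032, 0.0328830116668852, 0.030841381835560764,
    -0.18703481171909309, -0.027983769416859854, 0.6308807679298589,
    0.7148465705529157, 0.2303778133088965,
]
# Quadrature-mirror high-pass filter derived from the low-pass filter.
DB4_HI = [(-1) ** n * DB4_LO[len(DB4_LO) - 1 - n] for n in range(len(DB4_LO))]


def analyze(x):
    """One level of the periodized transform: returns (approximation, detail)."""
    n, half = len(x), len(x) // 2
    taps = range(len(DB4_LO))
    approx = [sum(DB4_LO[t] * x[(2 * k + t) % n] for t in taps) for k in range(half)]
    detail = [sum(DB4_HI[t] * x[(2 * k + t) % n] for t in taps) for k in range(half)]
    return approx, detail


def synthesize(approx, detail):
    """Inverse of analyze (the periodized transform is orthogonal)."""
    n = 2 * len(approx)
    x = [0.0] * n
    for k in range(len(approx)):
        for t in range(len(DB4_LO)):
            x[(2 * k + t) % n] += DB4_LO[t] * approx[k] + DB4_HI[t] * detail[k]
    return x


def wavelet_components(x, levels=3):
    """Split x into full-length components d1..d<levels> and a<levels> that sum to x."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("length must be divisible by 2**levels in this sketch")
    details, approx = [], list(x)
    for _ in range(levels):
        approx, d = analyze(approx)
        details.append(d)

    def rebuild(coeffs, level, is_detail):
        zeros = [0.0] * len(coeffs)
        sig = synthesize(zeros, coeffs) if is_detail else synthesize(coeffs, zeros)
        for _ in range(level - 1):  # climb back to the original resolution
            sig = synthesize(sig, [0.0] * len(sig))
        return sig

    comps = {f"d{i + 1}": rebuild(d, i + 1, True) for i, d in enumerate(details)}
    comps[f"a{levels}"] = rebuild(approx, levels, False)
    return comps


# Hypothetical high-frequency series standing in for an IMF (64 monthly values).
imf1 = [math.sin(2 * math.pi * t / 12) + 0.3 * math.sin(2 * math.pi * t / 3)
        for t in range(64)]
components = wavelet_components(imf1, levels=3)
```

Because every component is rebuilt to the original length, the components add back to the input series, which is what allows per-component forecasts to be superimposed later.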
SOA
SOA is a swarm intelligence optimization algorithm proposed by Dhiman & Kumar (2019). It is inspired by the migration and attacking (foraging) behaviors of seagulls in nature and has been shown to be able to solve challenging large-scale constrained problems. Compared with other optimization algorithms, it is highly competitive owing to its optimization capability and simple computation. SOA uses real-number coding, so each search agent directly represents a candidate solution, and the number of dimensions corresponds to the number of variables in the solution (Lavanya et al. 2022).
The mathematical model of SOA is as follows:
- (1)
Migration behavior
In the migration behavior, the algorithm simulates the seagull individual exploring from one location to another. At this stage, there are three things to watch out for: avoiding collisions, movement toward the best neighbor's direction and remaining close to the best search agent.
The application process of the seagull algorithm is as follows:
Step 1: Initialize the parameters.
Step 2: Compute the fitness value of each seagull.
Step 3: Compute the new position of the seagulls according to the equation for the migration behavior.
Step 4: Compute the attack position of the seagulls according to the equation for the attack behavior.
Step 5: Update the position and fitness value of the best seagull.
Step 6: If the termination condition (for example, the maximum number of iterations) is met, go to Step 7; otherwise, go to Step 3.
Step 7: Output the best position and fitness value of the seagulls.
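The seven steps can be sketched as below. The update rules follow the commonly cited form of the algorithm of Dhiman & Kumar (2019), with a linearly decreasing control factor and a spiral attack; the constants (`fc = 2`, `u = v = 1`), the clipping of positions to the search bounds, the population size and the toy objective are all assumptions made for illustration, not settings reported in this paper.

```python
import math
import random


def soa_minimize(objective, bounds, n_agents=20, max_iter=100,
                 fc=2.0, u=1.0, v=1.0, seed=0):
    """Minimal seagull optimization sketch; returns (best_pos, best_fit, history)."""
    rng = random.Random(seed)
    # Step 1: initialize the seagull positions inside the search bounds.
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    # Step 2: compute the fitness of each seagull.
    fitness = [objective(p) for p in positions]
    best_idx = min(range(n_agents), key=fitness.__getitem__)
    best_pos, best_fit = positions[best_idx][:], fitness[best_idx]
    history = [best_fit]

    for t in range(max_iter):  # Step 6: stop after max_iter iterations
        a = fc - t * (fc / max_iter)  # control factor, decreases from fc toward 0
        for i in range(n_agents):
            new_pos = []
            for j, (lo, hi) in enumerate(bounds):
                # Step 3 (migration): avoid collisions, move toward the best agent.
                c = a * positions[i][j]
                b = 2.0 * a * a * rng.random()
                m = b * (best_pos[j] - positions[i][j])
                d = abs(c + m)
                # Step 4 (attack): spiral motion around the best agent.
                k = rng.uniform(0.0, 2.0 * math.pi)
                r = u * math.exp(k * v)
                spiral = (r * math.cos(k)) * (r * math.sin(k)) * (r * k)
                candidate = d * spiral + best_pos[j]
                new_pos.append(max(lo, min(hi, candidate)))  # assumed: clip to bounds
            positions[i] = new_pos
            fitness[i] = objective(new_pos)
        # Step 5: update the best seagull found so far.
        idx = min(range(n_agents), key=fitness.__getitem__)
        if fitness[idx] < best_fit:
            best_pos, best_fit = positions[idx][:], fitness[idx]
        history.append(best_fit)

    # Step 7: output the best position and its fitness value.
    return best_pos, best_fit, history


def toy_error(params):
    """Hypothetical stand-in for an SVM validation error over (C, g, p)."""
    c, g, p = params
    return (math.log10(c) - 2.0) ** 2 + (g - 4.0) ** 2 + (p - 0.1) ** 2


search_bounds = [(0.1, 1000.0), (0.01, 50.0), (0.0001, 1.0)]  # assumed ranges
best_pos, best_fit, history = soa_minimize(toy_error, search_bounds)
```

In an SOA–SVM setting, `toy_error` would be replaced by a function that trains an SVM with the candidate parameters and returns its validation error.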
SVM
Construction of a monthly runoff prediction model
Step 1: Decomposition: The original monthly runoff time series is decomposed into several IMFs and one residual (Res) by the ICEEMDAN method.
Step 2: Quadratic decomposition: The WD method is applied to the high-frequency IMF components obtained in Step 1 to obtain a more stable trend sequence.
Step 4: Model optimization: This paper uses SOA to optimize SVM and the unprecedented SOA–SVM prediction model is finally established.
Step 5: Model application: Each component decomposed by the ICEEMDAN and WD is input into the SOA–SVM model for training and the result of each component is predicted, respectively. Finally, the prediction results of each sub-sequence are superimposed to obtain the final monthly runoff prediction value.
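The overall ‘decomposition-prediction-reconstruction’ flow of these steps can be sketched as follows. To keep the example short and self-contained, the decomposition is a moving-average split and the per-component predictor is a one-lag least-squares model; both are simple stand-ins for ICEEMDAN–WD and SOA–SVM, not the methods used in this paper, and the runoff values are made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Series = List[float]


def toy_decompose(x: Sequence[float], window: int = 3) -> List[Series]:
    """Stand-in decomposition: trailing moving-average trend plus remainder."""
    trend = []
    for i in range(len(x)):
        lo = max(0, i - window + 1)
        trend.append(sum(x[lo:i + 1]) / (i + 1 - lo))
    detail = [xi - ti for xi, ti in zip(x, trend)]
    return [detail, trend]  # the components add back to x


def ar1_forecast(series: Sequence[float]) -> float:
    """Stand-in predictor: one-step forecast from y[t] = a * y[t-1] + b."""
    prev, nxt = series[:-1], series[1:]
    n = len(prev)
    mean_prev, mean_next = sum(prev) / n, sum(nxt) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    slope = cov / var if var > 0 else 0.0
    intercept = mean_next - slope * mean_prev
    return slope * series[-1] + intercept


def decompose_predict_reconstruct(
    x: Sequence[float],
    decompose: Callable[[Sequence[float]], List[Series]],
    predict: Callable[[Sequence[float]], float],
) -> Tuple[float, List[float]]:
    """Decompose x, forecast each component, and sum the component forecasts."""
    components = decompose(x)
    component_forecasts = [predict(c) for c in components]
    return sum(component_forecasts), component_forecasts


# Hypothetical monthly runoff values (m3/s), for illustration only.
runoff = [120.0, 95.0, 80.0, 150.0, 310.0, 420.0,
          380.0, 260.0, 170.0, 130.0, 110.0, 100.0]
forecast, parts = decompose_predict_reconstruct(runoff, toy_decompose, ar1_forecast)
```

The reconstruction step is a plain sum because the decomposition is additive; any decomposition whose components add back to the original series fits the same interface.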
EVALUATION INDICATORS
In addition, a boxplot, a violin plot and a Taylor diagram are used to analyze the final prediction results of the different prediction models. The Taylor diagram summarizes the relevant statistical information of multiple models and displays the results of three evaluation indicators on one chart (Taylor 2001). The three evaluation indexes used in the Taylor diagrams of this paper are the normalized RMSE, NSEC and R. A boxplot is a simple tool for describing statistics, mainly used to show the central location and spread of one or more sets of continuous quantitative data and to identify outliers (Tareen et al. 2019). A violin plot combines a boxplot with a density plot and shows the distribution and probability density of multiple groups of data. The boxplot sits inside the violin plot, flanked by the density curve of the data, so that several details of the data are visible at once (Tanious & Manolov 2022).
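The four indicators named in this paper (RMSE, MAPE, NSEC and R) can be computed as in the sketch below, which uses their standard textbook definitions. Expressing MAPE in percent and the function names are assumptions of this example, and the observed and simulated values are made up.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def nsec(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency coefficient (1 is a perfect fit)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def corr(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Pearson correlation coefficient R."""
    mean_obs, mean_sim = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


# Hypothetical observed and simulated runoff (m3/s).
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 190.0, 310.0, 390.0]
scores = {
    "RMSE": rmse(observed, simulated),
    "MAPE": mape(observed, simulated),
    "NSEC": nsec(observed, simulated),
    "R": corr(observed, simulated),
}
```

Smaller RMSE and MAPE and larger NSEC and R indicate a better fit, which is the convention the results section relies on.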
CASE STUDIES
Study area and dataset
The Hongjiadu Reservoir in the upper reaches of the Wujiang River in Guizhou Province and the Manwan Reservoir in the middle reaches of the Lancang River in Yunnan Province are selected as case studies in this paper. The Hongjiadu Reservoir is a multi-year regulating reservoir of mixed mountain-canyon and lake type. Its total reservoir capacity is 4.947 billion m³, the regulated storage capacity is 3.361 billion m³, the total installed capacity is 0.6 million kW, the basin area is about 9,900 km², the annual average flow is 155 m³/s and the annual average runoff is 4.89 billion m³. The Hongjiadu Hydropower Station is the only power station on the Wujiang River with multi-year regulation capacity. It is mainly used for power generation and also serves flood control, water supply, breeding, navigation, tourism and ecological protection. The Manwan Reservoir, whose dam site is located in a narrow valley, is a seasonal regulating reservoir. Its total reservoir capacity is 920 million m³, the regulated storage capacity is 258 million m³, the total installed capacity is 1.5 million kW, the basin area is about 114,500 km², the annual average flow is 1,230 m³/s and the annual average runoff is 38.8 billion m³. The Manwan Hydropower Station is the first phase of the development of the mainstream of the Lancang River, and its completion plays a vital role in the economic development of Yunnan Province. In recent years, under the influence of climate change and human activities, natural disasters have occurred more and more frequently, so establishing an accurate monthly runoff prediction model is of great practical significance.
ICEEMDAN decomposition results
Results of the quadratic decomposition of WD
Application of the SOA–SVM runoff prediction model
Model (Hongjiadu) | Component | WD sub-component | C | g | p |
---|---|---|---|---|---|
SOA–SVM | | | 135.674 | 4.53286 | 0.107157 |
ICEEMDAN–WD–SOA–SVM | IMF1 | d1 | 1,000 | 21.182 | 1 |
 | | d2 | 73.1989 | 4.70714 | 0.000188029 |
 | | d3 | 95.0232 | 2.89931 | 0.0462714 |
 | | a3 | 13.769 | 3.24761 | 0.000507261 |
 | IMF2 | | 901.85 | 2.0286 | 0.901826 |
 | IMF3 | | 136.066 | 8.86452 | 0.0561698 |
 | IMF4 | | 184.905 | 4.65206 | 0.273467 |
 | IMF5 | | 41.1947 | 10.1849 | 0.0330585 |
 | IMF6 | | 969.004 | 14.9353 | 0.00242618 |
 | IMF7 | | 78.1709 | 31.9604 | 0.0041999 |
 | IMF8 | | 44.3877 | 5.03156 | 0.00629108 |
 | Res | | 1,000 | 32.8223 | 0.000112655 |
Model (Manwan) | Component | WD sub-component | C | g | p |
---|---|---|---|---|---|
SOA–SVM | | | 1,000 | 1.60807 | 0.503161 |
ICEEMDAN–WD–SOA–SVM | IMF1 | d1 | 1,000 | 2.09022 | 0.0174761 |
 | | d2 | 302.244 | 5.18232 | 0.0224248 |
 | | d3 | 17.7589 | 3.92211 | 0.0507203 |
 | | a3 | 109.443 | 6.87224 | 0.0733407 |
 | IMF2 | | 1,000 | 1.40956 | 0.00877693 |
 | IMF3 | | 200.185 | 4.07874 | 0.0436708 |
 | IMF4 | | 989.414 | 2.59465 | 0.0922121 |
 | IMF5 | | 157.769 | 11.2888 | 0.138869 |
 | IMF6 | | 920.794 | 11.3256 | 0.0880676 |
 | IMF7 | | 1,000 | 2.76988 | 0.0205433 |
 | IMF8 | | 208.315 | 47.2232 | 0.0134492 |
 | Res | | 939.963 | 24.3069 | 0.000144657 |
RESULTS AND DISCUSSION
Model (Hongjiadu) | Training RMSE (m³/s) | Training MAPE | Training NSEC | Training R | Testing RMSE (m³/s) | Testing MAPE | Testing NSEC | Testing R |
---|---|---|---|---|---|---|---|---|
ARMA (Wang et al. 2009) | 91.56 | 46.62 | 0.521 | 0.727 | 94.34 | 48.03 | 0.584 | 0.786 |
ANN (Wang et al. 2009) | 91.16 | 46.25 | 0.526 | 0.725 | 91.07 | 46.15 | 0.612 | 0.786 |
SVM (Wang et al. 2009) | 89.89 | 28.25 | 0.539 | 0.753 | 87.57 | 33.77 | 0.641 | 0.823 |
SOA–SVM | 84.30 | 24.07 | 0.5943 | 0.7920 | 97.39 | 39.83 | 0.5564 | 0.7822 |
EMD–SOA–SVM | 46.99 | 28.23 | 0.8739 | 0.9366 | 101.24 | 93.06 | 0.5206 | 0.7869 |
CEEMDAN–SOA–SVM | 41.96 | 28.66 | 0.8995 | 0.9484 | 65.17 | 45.67 | 0.8013 | 0.8980 |
ICEEMDAN–WD–SOA–SVM | 26.09 | 21.38 | 0.9612 | 0.9804 | 39.32 | 33.28 | 0.9277 | 0.9659 |
Model (Manwan) | Training RMSE (m³/s) | Training MAPE | Training NSEC | Training R | Testing RMSE (m³/s) | Testing MAPE | Testing NSEC | Testing R |
---|---|---|---|---|---|---|---|---|
ARMA (Wang et al. 2009) | 354.27 | 16.77 | 0.849 | 0.922 | 354.53 | 15.63 | 0.869 | 0.928 |
ANN (Wang et al. 2009) | 346.31 | 16.16 | 0.856 | 0.925 | 345.37 | 14.01 | 0.867 | 0.9320 |
SVM (Wang et al. 2009) | 334.07 | 12.49 | 0.866 | 0.9315 | 332.86 | 12.49 | 0.8836 | 0.9410 |
SOA–SVM | 342.76 | 13.35 | 0.8585 | 0.9280 | 350.07 | 12.68 | 0.8783 | 0.9422 |
EMD–SOA–SVM | 378.18 | 21.54 | 0.8278 | 0.9103 | 606.47 | 52.66 | 0.6348 | 0.8856 |
CEEMDAN–SOA–SVM | 228.15 | 13.55 | 0.9373 | 0.9691 | 234.21 | 17.97 | 0.9455 | 0.9727 |
ICEEMDAN–WD–SOA–SVM | 110.17 | 7.52 | 0.9854 | 0.9927 | 140.26 | 11.11 | 0.9805 | 0.9903 |
It can be seen from Figures 11–14 that the values predicted by the single SOA–SVM model without any decomposition deviate greatly from the real values, and its prediction effect is the worst. The prediction effect of the EMD–SOA–SVM model is not ideal, and the predicted runoff series only shows the general trend of change. Although the CEEMDAN–SOA–SVM model performs clearly better than the first two methods, it fits the peaks of the measured runoff series poorly, with very obvious errors. The values predicted by the ICEEMDAN–WD–SOA–SVM model fit the real values best, and the predicted runoff series is closest to the measured runoff series, which reflects the accuracy and superiority of the runoff prediction model.
The prediction results during the test period can better reflect the performance of runoff prediction models. According to the nature of statistical indicators, the smaller the value of RMSE or MAPE, the better the result, while the larger the value of NSEC or R, the better the result. Tables 3 and 4 list the statistical results of the evaluation indicators for the final result data of the Hongjiadu and Manwan Reservoirs predicted by seven models. In addition, the general percentage comparison method is used to compute the statistical results of its evaluation indicators. It can be observed that:
- (1)
Compared with the single ARMA model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 58.32% and MAPE by 30.70% and increases NSEC by 58.85% and R by 22.89%. At the Manwan Reservoir, it reduces RMSE by 60.44% and MAPE by 28.93% and increases NSEC by 12.83% and R by 6.72%.
- (2)
Compared with the single ANN model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 56.82% and MAPE by 27.88% and increases NSEC by 51.58% and R by 22.89%. At the Manwan Reservoir, it reduces RMSE by 59.39% and MAPE by 20.71% and increases NSEC by 13.09% and R by 6.26%.
- (3)
Compared with the single SVM model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 55.10% and MAPE by 1.44% and increases NSEC by 44.73% and R by 17.36%. At the Manwan Reservoir, it reduces RMSE by 57.86% and MAPE by 11.06% and increases NSEC by 10.96% and R by 5.24%.
- (4)
Compared with the single SOA–SVM model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 59.63% and MAPE by 16.44% and increases NSEC by 66.74% and R by 23.49%. At the Manwan Reservoir, it reduces RMSE by 59.93% and MAPE by 12.40% and increases NSEC by 11.63% and R by 5.10%.
- (5)
Compared with the EMD–SOA–SVM model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 61.16% and MAPE by 64.23% and increases NSEC by 78.19% and R by 22.75%. At the Manwan Reservoir, it reduces RMSE by 76.87% and MAPE by 78.91% and increases NSEC by 54.45% and R by 11.82%.
- (6)
Compared with the CEEMDAN–SOA–SVM model during the test period, the ICEEMDAN–WD–SOA–SVM model at the Hongjiadu Reservoir reduces RMSE by 39.67% and MAPE by 27.13% and increases NSEC by 15.77% and R by 7.56%. At the Manwan Reservoir, it reduces RMSE by 40.11% and MAPE by 38.17% and increases NSEC by 3.69% and R by 1.82%.
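The percentages in the list above appear to be relative changes with respect to the benchmark model. The sketch below shows that calculation for two of the reported test-period values at the Hongjiadu Reservoir (ARMA versus ICEEMDAN–WD–SOA–SVM); the function name and the `lower_is_better` flag are choices made for this example.

```python
def relative_change(baseline: float, proposed: float, lower_is_better: bool) -> float:
    """Percentage improvement of the proposed model relative to the baseline."""
    if lower_is_better:  # error metrics such as RMSE and MAPE
        return 100.0 * (baseline - proposed) / baseline
    return 100.0 * (proposed - baseline) / baseline  # skill metrics such as NSEC and R


# Test-period values at the Hongjiadu Reservoir taken from the results table.
rmse_reduction = relative_change(94.34, 39.32, lower_is_better=True)
nsec_increase = relative_change(0.584, 0.9277, lower_is_better=False)
```

Rounded to two decimals, these reproduce the 58.32% RMSE reduction and the 58.85% NSEC increase stated in item (1).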
It can be observed from Figures 15–18 that, in all four Taylor diagrams, the RMSE, NSEC and R of the ICEEMDAN–WD–SOA–SVM model are significantly better than those of the other models. The values predicted by the ICEEMDAN–WD–SOA–SVM model are therefore closest to the real values and have the highest accuracy. The ARMA, ANN, SVM and SOA–SVM models, which use no decomposition, and the EMD–SOA–SVM model all perform unsatisfactorily, with poor values for the three evaluation indicators. The CEEMDAN–SOA–SVM model performs clearly better than these five models but clearly worse than the ICEEMDAN–WD–SOA–SVM model.
It can be seen from the four boxplots in Figures 19–22 that the box of each model differs little from that of the actual values (the upper quartile, median and lower quartile are approximately the same). Judging by the upper and lower limits, the ICEEMDAN–WD–SOA–SVM model agrees most closely with the actual values. More importantly, the comparison of outliers (highlighted in the figures) shows that the outliers predicted by the ICEEMDAN–WD–SOA–SVM model are closest to the real outliers, which indicates that the ICEEMDAN–WD–SOA–SVM model gives the best prediction results.
Figures 23–26 show violin plots, which resemble boxplots but focus less on outliers and more on the distribution of the data, including its profile and the region where the data are most concentrated. As can be seen from the four violin plots, the regions of highest data concentration of the models are almost parallel and lie near the same value. The distribution shape of the ICEEMDAN–WD–SOA–SVM model in the four figures is the closest to that of the actual data. Taken together, the boxplots and violin plots show that the ICEEMDAN–WD–SOA–SVM model is superior in terms of both data distribution and outliers.
In summary, the ICEEMDAN–WD–SOA–SVM model proposed in this paper has the highest prediction accuracy and the best fitting effect. This indicates that its runoff prediction performance is superior to that of the ARMA, ANN, SVM, SOA–SVM, EMD–SOA–SVM and CEEMDAN–SOA–SVM models. The model is feasible and reliable and is suitable for predicting monthly runoff time series.
CONCLUSION
Medium- and long-term runoff prediction is of great significance to the rational development and utilization of water resources. In order to improve the accuracy of monthly runoff prediction, this paper proposes an ICEEMDAN–WD–SOA–SVM runoff prediction model. The main conclusions are as follows:
- (1)
The ICEEMDAN–WD method has advantages in dealing with non-stationary, non-linear runoff time series. The ICEEMDAN method can effectively reduce the complexity of the original runoff series and make the runoff series as smooth as possible. The WD method can further process the high-frequency components of the runoff series decomposed by the ICEEMDAN method. After the decomposed data are input into the prediction model, the prediction result is more accurate. Thus, a novel method for runoff series decomposition is proposed.
- (2)
Compared with the single SOA–SVM model without decomposition, the ICEEMDAN–WD–SOA–SVM model adopted in this paper achieves much higher prediction accuracy, which demonstrates the necessity of decomposing the runoff series. This shows that runoff prediction based on the ‘decomposition-prediction-reconstruction’ model can further improve prediction accuracy.
- (3)
Based on the comparison of images and evaluation indicators, the prediction effect of the ICEEMDAN–WD–SOA–SVM model proposed in this paper is better than those of the ARMA, ANN, SVM, SOA–SVM, EMD–SOA–SVM and the CEEMDAN–SOA–SVM models. The peak fitting degree is higher and the error is smaller. The reliability and validity of the prediction ability of the proposed model are verified, which provides a reference for monthly runoff prediction and related research in other basins.
- (4)
The ICEEMDAN–WD–SOA–SVM model proposed in this paper effectively improves the accuracy of runoff prediction and is a feasible, reliable and novel method. However, this paper only considers the prediction effect of monthly runoff time series and runoff series of different time scales can be predicted in the future to verify the universal applicability of this runoff prediction model.
ACKNOWLEDGEMENTS
The authors are grateful for the support of the Special project for collaborative innovation of science and technology in 2021 (No.: 202121206) and the Henan Province University Scientific and Technological Innovation Team (No.: 18IRTSTHN009).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.