Accurate water level prediction is of great importance for water infrastructures such as dams and embankments and for agriculture. However, the water level has nonlinear characteristics, which make it very challenging to predict accurately. This study proposes a combined model, VMD–GA–ELMAN–VMD–ARIMA, which couples variational mode decomposition (VMD), a genetic algorithm (GA)-optimized ELMAN neural network, and the autoregressive integrated moving average (ARIMA) model. Firstly, VMD preprocesses the original water level series, and each subsequence is predicted with the GA–ELMAN model. Then the error sequence is decomposed by VMD and predicted by the ARIMA model. Finally, the predicted water level is corrected with the predicted errors. Using three groups of data from different sites, 10 models are established to compare predictive performance. The results show that combining the VMD algorithm with the GA–ELMAN model improves prediction performance on all datasets, and that the VMD double processing greatly improves the prediction accuracy.

  • The variational mode decomposition (VMD) double processing is used for water level prediction.

  • A genetic algorithm is used to optimize the hyperparameters of the ELMAN neural network.

  • The error correction method is used to improve the prediction accuracy.

Graphical Abstract


Background

Hydrological system modeling has become common in the last few decades due to its overwhelming importance for understanding the earth system. Habeeb & Talib (2021) combined the Geographic Information System (GIS) with remote sensing, the Internet of Things, and the Web to manage and monitor water quality. Ekwueme & Agunwamba (2021) used the Mann–Kendall technique to effectively analyze air temperature and rainfall trends in regional basins. Nazarnia et al. (2020) showed that sea level rise (SLR) has an impact on the world's coastal areas and on coastal infrastructures such as water, transportation, and energy supply systems. Besides, the water level is an important hydrological variable that reflects the reserve capacity of a hydrological system and determines its load capacity and self-regulation limit. The water level has a great impact on the ecological environment and on human life through navigation, flood control, and agricultural irrigation.

The prediction models of the water level are mainly divided into two categories: the process-driven model and the data-driven model (Abrahart et al. 2012; Blöschl et al. 2019). The process-driven prediction model simulates the change process of a river's water level on the basis of hydrology and establishes a mathematical model of water level prediction, but this method requires a large amount of hydrological data. In contrast, the data-driven prediction model needs less data and can simulate the nonlinear and non-stationary characteristics of hydrological processes with minimal observation data. There is no need to consider the hydrological background of water level change; prediction depends only on the historical water level data and can be achieved by learning the relationship between historical data and future water levels (Li et al. 2020). The predictive performance of the data-driven model is better than that of the process-driven model (Kalteh 2016).

Data-driven models in hydrological systems mainly include statistical methods, such as the autoregressive moving average (ARMA) and multiple linear regression (MLR), as well as artificial intelligence methods, such as the ELMAN neural network (ENN) (Lei & Wang 2019), the artificial neural network (ANN) (Rao 2000), the classification and regression tree (CART) (Yang et al. 2016), gene expression programming (GEP) (Kiafar et al. 2016), genetic programming (GP) (Khu et al. 2010), the extreme learning machine (ELM) (Shiri et al. 2016), the gated recurrent unit (GRU), the convolutional neural network (CNN) (Pan et al. 2020), the support vector machine (SVM) (Behzad et al. 2009), the adaptive neuro-fuzzy inference system (ANFIS) (Sun & Trevor 2018), the stacking algorithm (Tyralis et al. 2019), boosting algorithms (Li et al. 2016), the random forest (RF) (Fathian et al. 2019), etc. Fu et al. (2020) selected the Kelantan River in the northeast of the Malaysian Peninsula to test the impact of the size of the training set, the time interval between the training set and the test set, and the time span of the predicted data on the performance of the developed long short-term memory (LSTM) model. The experimental results show that the model can handle both the stable streamflow data of the dry season and the rapidly fluctuating streamflow data of the rainy season. In the last few years, the ENN has been applied to many fields. Brunelli et al. (2007) used the ENN to predict the daily maximum concentrations of pollutants such as sulfur dioxide (SO2), ozone (O3), inhalable particles (PM10), nitrogen dioxide (NO2), and carbon monoxide (CO) in Palermo, and the predicted values were in agreement with the actual values. Wu et al. (2011) used an improved ENN to predict high PM10 air pollution index (API) events caused by sandstorm activities with better accuracy than the standard ELMAN model. Recently, Wang et al. (2021) used an ENN to predict stock prices and showed the reliability of ENN predictions.

However, the traditional single method is not enough to meet people's demand for accuracy. For example, Mosavi et al. (2018) proposed that mixing, data decomposition, algorithm integration, and model optimization are the most effective strategies for improving machine learning (ML) methods, which can select the appropriate ML methods based on predicted needs in hydrology and climate. Ebtehaj et al. (2021) established two hybrid models of Generalized Structure Group Method of Data Handling (GS-GMDH) and ANFIS with Fuzzy C-Means (ANFIS-FCM) based on the data of two water level stations in the Perak River, Malaysia, which have good accuracy in the river water level prediction. A hybrid model is established, which generally includes a prediction model based on the pretreatment method or by combining the optimization algorithm to optimize the parameters of the neural network. Preprocessing decomposes the original data and eliminates anomalies and denoising to make the sequence more stable and reduces the complexity. Fotovatikhah et al. (2018) believe that the hybrid method is the best choice to deal with flood management through computational intelligence (CI), which has the potential to improve the accuracy and lead time of flood and debris prediction, and the wavelet square method has strong integration ability. In recent years, the combination of a wavelet transform and artificial intelligence technology has been successfully applied to hydrology. In the last few years, a wavelet transform (Khan et al. 2020), a singular spectrum analysis (SSA) (Wu & Chau 2013; Wang et al. 2020), and other methods have been widely used in the hydrological preprocessing. Altunkaynak & Kartal (2019) used joint discrete wavelet transform-fuzzy (DWT-fuzzy) and joint continuous wavelet transform-fuzzy (CWT-fuzzy) models. The prediction performance of the CWT-fuzzy was better than that of the DWT-fuzzy and single fuzzy models. Seo et al. 
(2015) established two hybrid models, wavelet-based ANNs (WANNs) and wavelet-based adaptive neuro-fuzzy inference systems (WANFIS), to predict daily water levels. Although these techniques can improve the prediction accuracy to a large extent, Du et al. (2017) showed that hybrid models constructed with the SSA and DWT are problematic in use: because the SSA and DWT are computed from ‘future’ values, the subseries generated by SSA reconstruction or DWT decomposition contain information about those ‘future’ values. The mixed models therefore show spuriously ‘high’ prediction performance, which leads to large errors in practice.

To avoid such potential problems, empirical mode decomposition (EMD) (Zhao et al. 2017), ensemble EMD (EEMD) (Wang et al. 2015), complete ensemble EMD with adaptive noise (CEEMDAN) (Wen et al. 2019), improved complete ensemble EMD with adaptive noise (ICEEMDAN) (Wang et al. 2019), and variational mode decomposition (VMD) (Niu et al. 2020) are also used for hydrological preprocessing. Xi et al. (2017) used the EMD-Elman model to predict monthly runoff. The experimental results showed that the combined model had higher prediction accuracy than the Elman model alone and is suitable for complex hydrological sequences. Wen et al. (2019) used a data-driven method to design a two-phase hybrid model (CVEE-ELM), which combines CEEMDAN with VMD and uses the ELM algorithm to predict multiscale runoff. The CVEE-ELM model had notable advantages, especially in the extensive analysis of predicted and observed datasets. Niu et al. (2020) used a hybrid of VMD and an ELM, with the gravitational search algorithm (GSA) finding the optimal hyperparameters of the ELM, to predict the annual runoff of a reservoir. The experimental results showed that the proposed model outperformed the autoregressive integrated moving average (ARIMA) and ELM models on all indicators, indicating that it is an effective tool for runoff prediction.

Although data preprocessing can reduce the error to a large extent, researchers found that adding an optimization algorithm can further improve the prediction accuracy. Wang et al. (2020) denoised the monthly runoff data of the Zhengyi Gorge of the Heihe River with the SSA, and the grey wolf optimization (GWO) algorithm was used to jointly optimize the penalty factor c and the kernel function parameter of the support vector regression (SVR) model, which enhanced its generalizability. The results showed that the proposed model had higher prediction accuracy than the persistent model (PM), ARIMA, cross-validation SVR (CV-SVR), and GWO-SVR models, especially for tracking and forecasting the peak runoff during the flood season. Cong & Meesad (2013) used a firefly algorithm (FA), particle swarm optimization (PSO), and a genetic algorithm (GA) to optimize type 1 and type 2 TSK fuzzy logic systems to predict the hourly sea level of the Nha Trang Sea in Vietnam. Taormina & Chau (2015) used the LUBE method to test the applicability of production prediction intervals (PIs) under different confidence levels (CLs) for river flow in the first 6 h of the Susquehanna and Inner Harlem rivers in the United States. The results show that the neural network trained by MOFIPS is superior to neural networks developed by single-objective swarm optimization. Kisi et al. (2015) used a multi-step-ahead prediction model based on the SVM and an FA (SVM–FA) to predict the daily water level of Lake Urmia in northwestern Iran, and this model proved superior to GP and ANN models. Yao et al. (2018) used the GA to optimize the weights and thresholds of the ENN to predict the historical water level of a Yongding River monitoring site. The experimental results showed that the GA-optimized ENN was more effective and accurate than the single-model Elman and back propagation (BP) networks.
This study aims to design the VMD double processing and optimize Elman parameters with a GA. Therefore, the proposed method may be helpful for applications involving multiple domains such as runoffs, precipitation, and stocks.

Contribution of the paper

Water level series are complex, nonlinear time series influenced by environmental factors. In the past, most water level studies were based on the original water level series, and the prediction of a single model cannot accurately describe the fluctuation of the data, which leads to low prediction accuracy. To overcome these problems and improve water level prediction performance, this paper proposes a new hybrid model strategy based on the VMD double processing, GA–ELMAN, and ARIMA prediction. The specific contributions are as follows:

  • (1)

    Based on the VMD double processing strategy, the original water level data and error sequence are decomposed, and the non-stationary water level is decomposed into multiple band-limited intrinsic mode functions, which reduces the complexity and non-stationarity of the data.

  • (2)

    The learning rate and the momentum coefficient of ELMAN are optimized by the GA, and the optimal learning rate and the momentum coefficient are found by verification set search. This strategy can improve the prediction accuracy of the model.

  • (3)

The error correction model is used to improve the overall prediction accuracy of the model, preventing the error sequence from degrading the prediction accuracy.

  • (4)

    The accuracy and effectiveness of the proposed VMD–GA–ELMAN–VMD–ARIMA hybrid model in water level prediction are verified by using the water level datasets of three different stations and 10 comparison models.

The structure of the paper

The rest of this paper is organized as follows: Section 2 introduces the basic theory, method, and framework of the hybrid model in detail. Section 3 describes the simulation experiments used to validate the model. Using three sets of hydrological data, the VMD–GA–ELMAN–VMD–ARIMA model is compared with the other 10 models to verify the prediction results and the performance of the model. Section 4 discusses the advantages of the prediction model. Section 5 summarizes the model.

The framework of the hybrid model

The frame diagram of the hybrid model is shown in Figure 1. The key elements of the model are the GA optimization of the ELMAN parameters, the VMD double processing, and the error correction method, which together improve the prediction accuracy.

Figure 1

Flow chart of the VMD–GA–ELMAN–VMD–ARIMA model.


The framework is divided into the following steps: (1) VMD is used to decompose the original water level series into several subseries. (2) When ELMAN is used to predict each subsequence, the GA is first used to optimize the hyperparameter learning rate and the momentum coefficient of ELMAN to avoid the shortage of manual adjustment parameters, and finally the prediction sequence and error sequence are obtained. (3) The ARIMA model is used to predict the error sequence decomposed by VMD for the second time, and finally the prediction error sequence is obtained. (4) The prediction sequence in the second step is modified by the prediction error sequence to obtain the final predicted water level data.
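As an illustration, the four steps can be sketched end to end in Python. The decomposition and forecasting functions below are deliberately simple stand-ins (a moving-average trend split instead of VMD, persistence forecasts instead of GA–ELMAN and ARIMA); they are assumptions made for the sketch, not the paper's implementation.

```python
import numpy as np

def decompose(x, window=5):
    """Stand-in for VMD: split a series into a smooth trend and a residual."""
    kernel = np.ones(window) / window
    trend = np.convolve(x, kernel, mode="same")
    return [trend, x - trend]

def persistence_forecast(series):
    """Stand-in for GA-ELMAN / ARIMA: predict each value by the previous one."""
    pred = np.empty_like(series)
    pred[0] = series[0]
    pred[1:] = series[:-1]
    return pred

# Step 1: decompose the water level series into subseries.
rng = np.random.default_rng(0)
level = np.sin(np.linspace(0, 8 * np.pi, 200)) + 0.1 * rng.standard_normal(200)
subseries = decompose(level)

# Step 2: predict each subsequence and sum to get the first-stage prediction,
# then form the error sequence.
first_stage = sum(persistence_forecast(s) for s in subseries)
errors = level - first_stage

# Step 3: decompose the error sequence a second time and predict each error subseries.
error_pred = sum(persistence_forecast(s) for s in decompose(errors))

# Step 4: correct the first-stage prediction with the predicted errors.
final_pred = first_stage + error_pred
```

The same wiring applies when the stand-ins are replaced by VMD, the GA–ELMAN predictor, and ARIMA.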

This paper mainly introduces the VMD, ICEEMDAN, ELMAN, LSTM, and GRU methods applied in the hybrid model. Because the MLP has a simple structure and does not predict well in this experiment, and the ARIMA model is already familiar, they are not introduced here.

Variational mode decomposition

VMD is an adaptive, quasi-orthogonal signal decomposition method developed by Dragomiretskiy and Zosso (Dragomiretskiy & Zosso 2014). VMD decomposes the signal x(t) into K band-limited variational modes (Liu et al. 2018a).

VMD can be written as a constrained variational problem (Dragomiretskiy & Zosso 2014): (1) the Hilbert transform is used to compute the analytic signal of each mode to obtain a unilateral spectrum; (2) each mode's spectrum is shifted to baseband by mixing with an exponential tuned to its estimated center frequency; and (3) the bandwidth of each mode $u_k$ is estimated through the Gaussian smoothness of the demodulated signal, i.e., the squared $L^2$-norm of the gradient (Hu et al. 2021). Therefore, the constrained variational problem can be defined as follows:
(1) $\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\}$
(2) $\text{s.t. } \sum_{k=1}^{K} u_k(t) = x(t)$
where $u_k$ and $\omega_k$ are the kth decomposed mode and its center frequency. K is the total number of modes to be decomposed; $\partial_t$ is the partial derivative of the function with respect to t; $\delta(t)$ is the Dirac distribution function. To enforce the constraint in Equation (2), a quadratic penalty term and a Lagrange multiplier $\lambda$ can be introduced to form the augmented Lagrange equation, which is shown in Equation (3) as follows:
(3) $L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| x(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, x(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$

In the equation, $\alpha$ represents the balance parameter of the data-fidelity constraint. Equation (3) is solved by the alternating direction method of multipliers (ADMM) (Hestenes 1969). The saddle point of the Lagrange function is obtained by iteratively updating $u_k^{n+1}$, $\omega_k^{n+1}$, and $\lambda^{n+1}$ with the ADMM (Bai et al. 2021).

  • (1)

    Initialize each mode component and center frequency: set the initial values of $\{\hat{u}_k^1\}$, $\{\omega_k^1\}$, and $\hat{\lambda}^1$ to 0, set the iteration counter n to 0, and set the number of modes K to a positive integer.

  • (2)
    Update $\hat{u}_k$ and $\omega_k$ through the following equations:
    (4) $\hat{u}_k^{n+1}(\omega) = \dfrac{\hat{x}(\omega) - \sum_{i \ne k} \hat{u}_i(\omega) + \hat{\lambda}^n(\omega)/2}{1 + 2\alpha(\omega - \omega_k^n)^2}$
    (5) $\omega_k^{n+1} = \dfrac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}$
    where $\hat{x}(\omega)$, $\hat{u}_i(\omega)$, $\hat{u}_k^{n+1}(\omega)$, and $\hat{\lambda}(\omega)$ are the Fourier transforms of $x(t)$, $u_i(t)$, $u_k^{n+1}(t)$, and $\lambda(t)$, respectively, and n is the number of iterations.
  • (3)
    Update $\hat{\lambda}$ as follows:
    (6) $\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau \left( \hat{x}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)$

In the equation, $\tau$ is the update step (iterative factor). Given a precision $\varepsilon > 0$, if the convergence condition $\sum_{k} \|\hat{u}_k^{n+1} - \hat{u}_k^{n}\|_2^2 / \|\hat{u}_k^{n}\|_2^2 < \varepsilon$ is met, end the iteration; otherwise, return to step (2) and repeat the calculation until the condition is met. Finally, K modal components are obtained, and the variational mode decomposition is complete:
(7) $x(t) = \sum_{k=1}^{K} u_k(t)$
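For intuition, the update loop of Equations (4)–(6) can be sketched compactly in NumPy. This is a simplified variant (a frequency filter symmetric in |f|, no mirror extension of the signal, illustrative parameter values), not the reference implementation of Dragomiretskiy & Zosso:

```python
import numpy as np

def vmd(x, K=2, alpha=2000.0, tau=0.1, eps=1e-7, max_iter=500):
    """Simplified VMD: alternate the Wiener-filter mode update (Eq. (4)),
    the center-frequency update (Eq. (5)), and the dual ascent (Eq. (6))."""
    N = len(x)
    f = np.fft.fftfreq(N)                      # normalized frequencies
    X = np.fft.fft(x)
    u_hat = np.zeros((K, N), dtype=complex)    # mode spectra
    omega = (np.arange(K) + 0.5) * 0.5 / K     # initial center frequencies
    lam = np.zeros(N, dtype=complex)           # Lagrange multiplier spectrum
    pos = f >= 0
    for _ in range(max_iter):
        u_prev = u_hat.copy()
        for k in range(K):
            others = u_hat.sum(axis=0) - u_hat[k]
            # Eq. (4): Wiener filter centered at omega_k (symmetric in |f|,
            # so real signals stay real after the inverse transform)
            u_hat[k] = (X - others + lam / 2) / (1 + 2 * alpha * (np.abs(f) - omega[k]) ** 2)
            # Eq. (5): power-weighted mean frequency over the positive half-spectrum
            p = np.abs(u_hat[k][pos]) ** 2
            omega[k] = np.sum(f[pos] * p) / (np.sum(p) + 1e-12)
        # Eq. (6): dual ascent toward exact reconstruction
        lam = lam + tau * (X - u_hat.sum(axis=0))
        change = np.sum(np.abs(u_hat - u_prev) ** 2) / (np.sum(np.abs(u_prev) ** 2) + 1e-12)
        if change < eps:                       # convergence criterion
            break
    modes = np.real(np.fft.ifft(u_hat, axis=1))
    return modes, omega

# Two tones at 10/1024 and 100/1024 cycles per sample: VMD should separate them.
t = np.arange(1024)
x = np.sin(2 * np.pi * 10 * t / 1024) + 0.5 * np.sin(2 * np.pi * 100 * t / 1024)
modes, omega = vmd(x, K=2)
```

On this two-tone signal the modes sum back to the input and the center frequencies land near the two tones, which is the behavior the water level preprocessing relies on.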

Improved complete ensemble EMD with adaptive noise

In recent years, Colominas et al. proposed a technique called the improved complete ensemble EMD with adaptive noise (ICEEMDAN). ICEEMDAN is a decomposition technique based on EMD. Compared with EMD, it alleviates the frequency-aliasing and pseudo-mode problems encountered in earlier studies: adding white noise gives continuity to the frequencies between adjacent scales and weakens the influence of frequency mixing. The main steps of ICEEMDAN are as follows (Colominas et al. 2014):

  • (1)
    Construct the noise-added realizations on which EMD is run:
    (8) $x^{(i)} = x + \beta_0 E_1(w^{(i)}), \qquad \beta_0 = \varepsilon_0\, \mathrm{std}(x) / \mathrm{std}(E_1(w^{(i)}))$
    where i is the index of the added white noise realization, x is the original signal, $E_k(\cdot)$ is the operator extracting the kth mode of an EMD decomposition, $w^{(i)}$ is the white noise, and $\mathrm{std}(\cdot)$ is the standard deviation calculator. $E_1(w^{(i)})$ represents the first EMD component of the white noise, and $\varepsilon_0$ is the reciprocal of the desired signal-to-noise ratio between the first added noise and the analyzed signal.
  • (2)
    The first residue is the average of the local means $M(\cdot)$ over the N noise realizations:
    (9) $r_1 = \langle M(x^{(i)}) \rangle$
  • (3)
    The first mode of the N-realization signal decomposition is as follows:
    (10) $\tilde{d}_1 = x - r_1$
  • (4)
    The second residue and mode are determined as follows:
    (11) $r_2 = \langle M(r_1 + \beta_1 E_2(w^{(i)})) \rangle, \qquad \tilde{d}_2 = r_1 - r_2$
  • (5)
    The kth residue and mode are determined as follows:
    (12) $r_k = \langle M(r_{k-1} + \beta_{k-1} E_k(w^{(i)})) \rangle$
    (13) $\tilde{d}_k = r_{k-1} - r_k$
    (14) $\beta_k = \varepsilon_0\, \mathrm{std}(r_k)$
  • (6)

    Repeat step (5) for the next values of k until the residue can no longer be decomposed.

ELMAN neural network

Elman proposed the ENN in 1990 for speech processing (Elman 1990). It is a simple recurrent neural network composed of four layers: an input layer, a receiving (context) layer, a hidden layer, and an output layer. As a delay operator, the receiving layer has a memory function: it stores the output of the hidden layer and feeds the previous state back at the next iteration (Liu et al. 2018b), which makes the network well suited to time-series modeling. This paper uses an optimization algorithm to find the best learning rate and momentum coefficient of ELMAN. The calculation is as follows:
(15) $h_t = \sigma_h(W x_t + W_c h_{t-1} + a_h)$
(16) $y_t = \sigma_y(V h_t + a_y)$
where $x_t$ represents the input vector, $h_t$ represents the hidden layer vector (fed back through the receiving layer), $y_t$ represents the output vector, $W$, $W_c$, and $V$ represent weight matrices, $a_h$ and $a_y$ represent the bias vectors, and $\sigma_h$ and $\sigma_y$ represent the activation functions. The structure of ELMAN is shown in Figure 2.
Figure 2

ELMAN structure diagram.

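A single forward step of the network in Equations (15)–(16) can be written directly in NumPy. The dimensions, weight initialization, and activation choices below (tanh for the hidden layer, a linear output) are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def elman_step(x_t, h_prev, W, W_c, V, a_h, a_y):
    """One Elman step: the receiving (context) layer feeds h_{t-1} back."""
    h_t = np.tanh(W @ x_t + W_c @ h_prev + a_h)  # Eq. (15), tanh hidden layer
    y_t = V @ h_t + a_y                          # Eq. (16), linear output
    return h_t, y_t

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 15, 1                 # 15 hidden nodes, as in Table 2
W = rng.standard_normal((n_hidden, n_in)) * 0.1
W_c = rng.standard_normal((n_hidden, n_hidden)) * 0.1
V = rng.standard_normal((n_out, n_hidden)) * 0.1
a_h, a_y = np.zeros(n_hidden), np.zeros(n_out)

h = np.zeros(n_hidden)                           # context starts empty
for x_t in rng.standard_normal((5, n_in)):       # run five time steps
    h, y = elman_step(x_t, h, W, W_c, V, a_h, a_y)
```

The context vector `h` is the only state carried between steps, which is what gives the network its memory of previous water levels.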

Long short-term memory network

LSTM is a type of recurrent neural network (RNN) that solves the vanishing- or exploding-gradient problem often encountered when training traditional RNNs (Hochreiter & Schmidhuber 1997). LSTM contains three gates: the input gate, the output gate, and the forgetting gate. The ‘gate’ of a long short-term memory network is a special network structure: its input is a vector and its output ranges over 0–1. When the output value is 0, no information is allowed to pass; when the output value is 1, all information is allowed to pass through (Duan et al. 2021).

Assuming that input is the existing original data and output is the prediction data, the calculation process of the network is as follows (Zhang et al. 2017):
(17) $i_t = \sigma(W_i [k_{t-1}, x_t] + b_i)$
(18) $f_t = \sigma(W_f [k_{t-1}, x_t] + b_f)$
(19) $o_t = \sigma(W_o [k_{t-1}, x_t] + b_o)$
(20) $\tilde{c}_t = \tanh(W_c [k_{t-1}, x_t] + b_c)$
(21) $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$
(22) $k_t = o_t \odot \tanh(c_t)$
where $i_t$, $f_t$, and $o_t$ are the input gate, the forgetting gate, and the output gate, respectively, and $\sigma$ is the sigmoid function. It can be seen from the equations that $i_t$, $f_t$, and $o_t$ are determined by the sigmoid function. $\odot$ represents the element-wise multiplication between vectors, $\tanh$ is the hyperbolic tangent function, $c_t$ is the memory cell at the current moment, and $b$ represents the bias vector. All weight matrices are updated through the error back propagation (BP) algorithm from the difference between the output value and the actual value (Yan et al. 2018), and $k_t$ is the hidden state. The structure of the LSTM is shown in Figure 3.
Figure 3

LSTM structure diagram.


Gated recurrent unit

The GRU was proposed by Cho et al. in 2014 to address the long-term dependence problem of recurrent neural networks (Cho et al. 2014). The GRU evolved from the LSTM; it contains two gates, a reset gate and an update gate, which replace the forgetting, input, and output gates of the LSTM. The GRU is easier to compute and implement than the LSTM. The update gate controls how much of the previous information is passed to the current step, and the reset gate decides how much of the previous information to forget. The principle of GRU prediction is to use the gate units to combine historical and current information for prediction at the current step (Xu et al. 2020). The LSTM and GRU networks treat hidden-layer data similarly, but the GRU has no separate memory cell, so its training efficiency is higher. The calculation equations are as follows:
(23) $r_t = \sigma(W_r [k_{t-1}, x_t] + b_r)$
(24) $u_t = \sigma(W_u [k_{t-1}, x_t] + b_u)$
(25) $\tilde{y}_t = \tanh(W_y [r_t \odot k_{t-1}, x_t] + b_y)$
(26) $k_t = (1 - u_t) \odot k_{t-1} + u_t \odot \tilde{y}_t$
where $k_{t-1}$ is the hidden layer information of the previous moment and $k_t$ is the hidden layer information of the current moment. $r_t$ and $u_t$ are the reset gate and update gate of the GRU, respectively. The candidate hidden state $\tilde{y}_t$ measures how much of the previous hidden layer information is retained through $r_t$. The amount of $\tilde{y}_t$ blended into the new state is controlled by $u_t$, and finally the current output $k_t$ is obtained. The structure of the GRU is shown in Figure 4.
Figure 4

GRU structure diagram.


In the figure, ‘1−’ means subtracting each element of the vector from 1.
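One GRU step, following Equations (23)–(26), can be sketched as below; the layer sizes and weight scales are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, k_prev, W_r, W_u, W_y, b_r, b_u, b_y):
    """One GRU step: reset gate, update gate, candidate state, then blend."""
    xh = np.concatenate([k_prev, x_t])
    r_t = sigmoid(W_r @ xh + b_r)                    # reset gate, Eq. (23)
    u_t = sigmoid(W_u @ xh + b_u)                    # update gate, Eq. (24)
    cand = np.tanh(W_y @ np.concatenate([r_t * k_prev, x_t]) + b_y)  # Eq. (25)
    k_t = (1.0 - u_t) * k_prev + u_t * cand          # the '1 -' blend, Eq. (26)
    return k_t

rng = np.random.default_rng(1)
n_in, n_hid = 3, 8                                   # illustrative sizes
W_r, W_u, W_y = (rng.standard_normal((n_hid, n_hid + n_in)) * 0.1 for _ in range(3))
b_r, b_u, b_y = np.zeros(n_hid), np.zeros(n_hid), np.zeros(n_hid)

k = np.zeros(n_hid)
for x_t in rng.standard_normal((4, n_in)):           # run four time steps
    k = gru_step(x_t, k, W_r, W_u, W_y, b_r, b_u, b_y)
```

Because $k_t$ is a convex combination of the previous state and a tanh candidate, the hidden state stays bounded, with no separate memory cell to maintain.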

Genetic algorithms

The GA was proposed by John Holland et al. in the late 1960s. The GA is a metaheuristic optimization algorithm supporting global search, which is used to solve complex optimization and high-dimensional problems with or without constraints (Prado et al. 2020). In this paper, the GA is used to optimize the learning rate and momentum coefficient of the ELMAN model, with the optimization interval set to [0, 1]. The code string formed by these two parameters is evolved as in a biological process, generating the next generation through selection, crossover, mutation, and other operations; the fitness of the individuals in the population improves continuously until the termination conditions are met (Liu et al. 2014, 2015). The optimization steps are shown in Figure 1. The ratio of the training set, the validation set, and the test set is 8:1:1, and the validation set is used to find the optimal parameters. The steps of the optimization algorithm are as follows:

  • (1)

Initialize the population: The network learning rate and momentum coefficient are initialized and real-number coded. The population size is 40, and the number of generations is set to 10.

  • (2)
    Based on the fitness function, the fitness value of each chromosome in the population is evaluated; the reciprocal of the root mean square error (RMSE) is taken as the fitness function:
    (27) $f(j) = \dfrac{1}{\sqrt{\frac{1}{L}\sum_{i=1}^{L}(n_i - \hat{n}_i)^2}}$

In the equation, f(j) is the fitness value of chromosome j, the denominator is the RMSE between the actual output $n_i$ and the predicted output $\hat{n}_i$ obtained with the learning rate and momentum coefficient determined by chromosome j, and L is the number of input samples of the training set in the network.

  • (3)

Perform the genetic operations: Calculate the fitness value of each chromosome; if the termination condition is met, output the individual with the best fitness and stop. Otherwise, start the next round of operations from step (2) until the most satisfactory individual is found.

  • (4)

Obtain the learning rate and momentum coefficient of the ENN: After GA optimization, the learning rate and momentum coefficient giving the smallest error of the ELMAN model are obtained. In the output layer, the actual output $n_i$ is compared with the predicted output $\hat{n}_i$, and the RMSE of the predicted and actual values is calculated as $\mathrm{RMSE} = \sqrt{\frac{1}{L}\sum_{i=1}^{L}(n_i - \hat{n}_i)^2}$. The evaluation criterion of ELMAN is that the smaller the RMSE, the better; accordingly, when the fitness function f(j) is at its maximum, the learning rate and momentum coefficient of ELMAN are the optimal values.
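The optimization loop above can be sketched as follows. The toy fitness (the reciprocal of an RMSE surface with a known minimum at learning rate 0.3, momentum 0.7) stands in for the actual ELMAN training run; the population size of 40 and 10 generations follow the text, while the tournament selection, arithmetic crossover, and Gaussian mutation operators are our assumptions, since the paper does not specify them:

```python
import numpy as np

rng = np.random.default_rng(0)

def rmse_surface(params):
    """Toy stand-in for training ELMAN and measuring its validation RMSE."""
    lr, mom = params
    return np.sqrt((lr - 0.3) ** 2 + (mom - 0.7) ** 2) + 0.01

def fitness(params):
    return 1.0 / rmse_surface(params)          # Eq. (27): reciprocal of RMSE

pop = rng.uniform(0.0, 1.0, size=(40, 2))      # 40 chromosomes: (learning rate, momentum)
for _ in range(10):                            # 10 generations, as in the text
    fits = np.array([fitness(p) for p in pop])
    # tournament selection: keep the fitter of two randomly drawn individuals
    idx = rng.integers(0, 40, size=(40, 2))
    parents = pop[np.where(fits[idx[:, 0]] > fits[idx[:, 1]], idx[:, 0], idx[:, 1])]
    # arithmetic crossover between consecutive parents
    w = rng.uniform(size=(40, 1))
    children = w * parents + (1 - w) * np.roll(parents, 1, axis=0)
    # Gaussian mutation, clipped back to the [0, 1] search interval
    children += rng.normal(0.0, 0.05, size=children.shape)
    pop = np.clip(children, 0.0, 1.0)

best = pop[np.argmax([fitness(p) for p in pop])]
```

In the real pipeline, `rmse_surface` would train ELMAN with the candidate learning rate and momentum coefficient and return the validation-set RMSE.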

Data description

The data are the river water levels at points 4, 6, and 7 of the intensive runoff observation in the middle reaches of the Heihe River. The observation points are located in Zhangye City, Gansu Province (National Qinghai-Tibet Plateau Scientific Data Center, tpdc.ac.cn). The riverbed is gravel, and the river widths are 58, 50, and 130 m, respectively. The sampling interval of the data is 30 min. The three groups of water level data, divided into training, validation, and test sets, are shown in Figure 5. The data statistics are shown in Table 1, where T, Ta, Tb, and Tc are the total number of data points and the numbers of training, validation, and test samples, respectively. The first 80% of the data is used as the training set, the next 10% is used as the validation set with which the GA optimizes the learning rate and the momentum coefficient, and the remaining 10% is used as the test set. The time, place, length, and complexity of the three datasets differ; if the proposed model achieves the best prediction on all three, the design of the model is successful.
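A plain chronological 8:1:1 split can be computed as below; note that the counts in Table 1 deviate slightly from exact 80/10/10 proportions, so this sketch shows only the generic split, not the authors' exact bookkeeping:

```python
import numpy as np

def split_series(x, train=0.8, val=0.1):
    """Chronological 8:1:1 split: train first, then validation, then test."""
    n = len(x)
    n_train = int(n * train)
    n_val = int(n * val)
    return x[:n_train], x[n_train:n_train + n_val], x[n_train + n_val:]

levels = np.arange(1000.0)          # placeholder water level series
tr, va, te = split_series(levels)
```

Keeping the split chronological matters for water levels: shuffling would let the model see ‘future’ values during training, the very leakage problem discussed in the introduction.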

Table 1

Information statistics of water level data

Data | T/period | Ta | Tb | Tc | Maximum (cm/30 min) | Minimum (cm/30 min) | Mean (cm/30 min) | Std
Data 1 | 1,024 | 813 | 102 | 101 | 99 | 51 | 73.97 | 14.01
Data 2 | 1,072 | 851 | 107 | 106 | 240.531 | 74.89 | 136.27 | 35.41
Data 3 | 1,648 | 1,312 | 164 | 164 | 76.992 | 20.34 | 41.13 | 16.86
Figure 5

Three groups of data.


Evaluation metrics

Four evaluation indexes are used to evaluate the prediction performance of the model, namely, the mean absolute error (MAE), the mean absolute percentage error (MAPE), the RMSE, and the linear regression coefficient of determination (R2). The expressions of these four evaluation indexes are as follows:
(28) $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$
(29) $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$
(30) $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
(31) $R^2 = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$
where n is the total number of sampling points, $y_i$ and $\hat{y}_i$ are the actual value and the predicted value, respectively, and $\bar{y}$ is the mean of the actual values.

MAE, MAPE, RMSE, and R2 are commonly used to analyze errors in a time series. MAE is the average absolute error between the predicted value and the actual value, which reflects the actual error magnitude; MAPE is an unbiased index that reflects the relative predictability of the model by dividing each absolute error by its corresponding actual value; R2 reflects the degree of fit between the actual and predicted values; and RMSE reflects the average deviation of the predicted value from the actual value. The smaller the values of MAE, MAPE, and RMSE, the better the prediction accuracy of the model; additionally, the closer the value of R2 is to 1, the more accurate the prediction.
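The four indexes of Equations (28)–(31) can be computed directly; the implementation below is ours (not the authors' code), with MAPE returned as a fraction rather than a percentage:

```python
import numpy as np

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))                                       # Eq. (28)

def mape(y, yhat):
    return np.mean(np.abs((y - yhat) / y))                                 # Eq. (29), fraction

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))                               # Eq. (30)

def r2(y, yhat):
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - np.mean(y)) ** 2)   # Eq. (31)

# Tiny worked example: errors of +1 and -1 around actual values 2 and 4.
y = np.array([2.0, 4.0])
yhat = np.array([1.0, 5.0])
```

For this example, MAE = 1, MAPE = (0.5 + 0.25)/2 = 0.375, RMSE = 1, and R2 = 0, since the squared error equals the variance of the actual values.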

Except for the error correction model and the first decomposition prediction model, every single-model and optimized-parameter model was run 10 times, and the final prediction was taken as the average of the 10 runs.

Table 2 shows some basic parameter information of the ELMAN, LSTM, and GRU models.

Table 2

Model parameter information

Model | Meaning | Value
ELMAN | Number of input layer nodes |
 | Number of hidden layer nodes | 15
 | Number of output layer nodes |
 | Epochs of training | 2,000
 | Learning rate | 0–1
 | Momentum coefficient | 0–1
 | Layer delays | 1:2
LSTM | Number of input layer nodes |
 | Number of hidden 1 layer nodes | 64
 | Number of hidden 2 layer nodes | 16
 | Number of output layer nodes |
 | Epochs of training | 200
 | Batch_size | 100
GRU | Number of input layer nodes |
 | Number of hidden 1 layer nodes | 128
 | Number of hidden 2 layer nodes | 16
 | Number of output layer nodes |
 | Epochs of training | 200
 | Batch_size | 100

In the programming of this experiment, ELMAN is implemented in MATLAB 2019, and the LSTM and GRU single models are implemented in Python.

Comparison of the experimental results

  • (1)

    Forecast results of data 1

Data 1 contains the fewest samples; its prediction results are shown in Table 3. Figure 6 shows a histogram of the four prediction performance indicators and a comparison of the prediction results of each model.

Table 3

Predicted performance index results of data 1

Data | Model | MAE | MAPE | R2 | RMSE
Data 1 | LSTM | 0.88215 | 0.01263 | 0.99663 | 1.14522
 | GRU | 0.71005 | 0.01013 | 0.99741 | 0.91914
 | MLP | 0.90700 | 0.01345 | 0.99647 | 1.13766
 | ELMAN | 0.74666 | 0.01106 | 0.99712 | 0.93369
 | VMD–ELMAN | 0.47322 | 0.00715 | 0.99955 | 0.54591
 | ICEEMDAN–ELMAN | 0.58661 | 0.00982 | 0.99953 | 0.72816
 | VMD–GRU | 0.33959 | 0.00491 | 0.99971 | 0.40567
 | GA–ELMAN | 0.73590 | 0.01350 | 0.91050 | 0.87150
 | VMD–GA–ELMAN | 0.27725 | 0.00501 | 0.98450 | 0.36807
 | VMD–GRU–VMD–ARIMA | 0.14793 | 0.00212 | 0.99987 | 0.20134
 | VMD–GA–ELMAN–VMD–ARIMA | 0.01578 | 0.00029 | 0.99993 | 0.01930
Table 4

Predicted performance index results of data 2

Data | Model | MAE | MAPE | R2 | RMSE
Data 2 | LSTM | 1.57745 | 0.01151 | 0.95183 | 3.25710
 | GRU | 0.84443 | 0.00617 | 0.98327 | 1.90292
 | MLP | 1.37157 | 0.00989 | 0.96413 | 2.68813
 | ELMAN | 0.80781 | 0.00594 | 0.98539 | 1.77024
 | VMD–ELMAN | 0.45906 | 0.00331 | 0.99968 | 0.50775
 | ICEEMDAN–ELMAN | 0.47917 | 0.00340 | 0.99929 | 0.60761
 | VMD–GRU | 0.42670 | 0.00330 | 0.99958 | 0.52523
 | GA–ELMAN | 0.66633 | 0.00490 | 0.99541 | 1.17100
 | VMD–GA–ELMAN | 0.24775 | 0.00175 | 0.99974 | 0.32971
 | VMD–GRU–VMD–ARIMA | 0.18195 | 0.00136 | 0.99971 | 0.26206
 | VMD–GA–ELMAN–VMD–ARIMA | 0.10710 | 0.00076 | 0.99904 | 0.14040
Figure 6

Prediction results of different models for data 1.

Figure 7

Prediction results of different models for data 2.


The experimental results show that, among the single models, GRU gives the best prediction, ELMAN is second only to GRU, and MLP performs worst. After decomposing the data with the VMD and ICEEMDAN algorithms, the ELMAN model predicts better than its single-model counterpart, and VMD preprocessing outperforms ICEEMDAN. After optimizing the ELMAN parameters with GA, the MAE and RMSE values improve on those of ELMAN by 0.01076 and 0.06219, while the MAPE and R² values are worse by 0.00244 and 0.08662, respectively. The prediction accuracy of the VMD–GA–ELMAN model is higher than that of both the VMD–ELMAN and GA–ELMAN models. Although the VMD–GRU model is superior to the VMD–ELMAN model, the VMD–GA–ELMAN–VMD–ARIMA model achieves the best prediction overall: compared with the single ELMAN model, its MAE, MAPE, R², and RMSE improve by 0.73088, 0.01077, 0.00281, and 0.91439, respectively.
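As a reference for how the four indices reported in Tables 3–5 are conventionally defined, the following is a minimal NumPy sketch; the function name `evaluate` and the variable names are ours, not taken from the paper's code:

```python
import numpy as np

def evaluate(observed, predicted):
    """Return MAE, MAPE, R^2 and RMSE for a pair of series."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = observed - predicted
    mae = np.mean(np.abs(err))                      # mean absolute error
    mape = np.mean(np.abs(err / observed))          # mean absolute percentage error
    ss_res = np.sum(err ** 2)                       # residual sum of squares
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    rmse = np.sqrt(np.mean(err ** 2))               # root mean square error
    return mae, mape, r2, rmse
```

Lower MAE, MAPE, and RMSE and higher R² indicate better predictions, which is how the tables below are read.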

  • (2)

    Forecast results of data 2

The prediction results of data 2 are shown in Table 4, and Figure 7 shows the histogram of the four prediction performance indicators and a comparison of each model's predictions. Among the four single models, ELMAN predicts best, followed by GRU, and the VMD–ELMAN model is again better than the ICEEMDAN–ELMAN model. Comparing the prediction results of the ELMAN, GA–ELMAN, and VMD–GA–ELMAN models shows that the prediction accuracy of ELMAN can be improved both by data preprocessing and by using GA to optimize its parameters. On this dataset the VMD–GA–ELMAN–VMD–ARIMA model outperforms the VMD–GRU–VMD–ARIMA model in MAE, MAPE, and RMSE (Table 4), confirming that the error correction reduces the error.

  • (3)

    Forecast results of data 3

Data 3 contains the most data points, and the prediction results are shown in Table 5. Figure 8 shows the histogram of the four prediction performance indicators and a comparison of the prediction results of each model. Among the single models, ELMAN predicts best, followed by GRU. The VMD–ELMAN model is still better than the ICEEMDAN–ELMAN model in MAE, MAPE, and RMSE, but R² is the opposite, with a difference of 0.00004. The effect of optimization is similar to that for data 2: the GA–ELMAN and VMD–GA–ELMAN models still achieve higher prediction accuracy than ELMAN. The VMD–GA–ELMAN–VMD–ARIMA model is worse than the VMD–GRU–VMD–ARIMA model here, but it still maintains a good prediction effect, far better than that of ELMAN.

  • (4)

    Scatter plot of experimental data

Figure 8

Prediction results of different models for data 3.

Table 5

Predicted performance index results of data 3

Data  Model  MAE  MAPE  R²  RMSE
Data 3 LSTM 0.60941 0.01688 0.99132 0.88727 
GRU 0.51465 0.01439 0.99438 0.70720 
MLP 0.61394 0.01692 0.99088 0.87695 
ELMAN 0.49631 0.01386 0.99449 0.69381 
VMD–ELMAN 0.20425 0.00592 0.99946 0.25184 
ICEEMDAN–ELMAN 0.22961 0.00642 0.9995 0.27769 
VMD–GRU 0.17334 0.00503 0.99949 0.20777 
GA–ELMAN 0.47408 0.01324 0.99502 0.65681 
VMD–GA–ELMAN 0.17776 0.00390 0.99938 0.23451 
VMD–GRU–VMD–ARIMA 0.07790 0.00213 0.99988 0.10026 
VMD–GA–ELMAN–VMD–ARIMA 0.14167 0.00284 0.99957 0.19911 
Table 6

Percentage of model performance improvement

Data  Index  ELMAN vs. model: VMD–ELMAN (%)  ICEEMDAN–ELMAN (%)  GA–ELMAN (%)  VMD–GA–ELMAN (%)  VMD–GA–ELMAN–VMD–ARIMA (%)
Data 1  MAE  36.6218  21.4355  1.4411  62.8680  97.8866
Data 1  MAPE  35.3526  11.2116  −22.0615  54.7016  97.3779
Data 1  R²  0.2437  0.2417  −8.6870  −1.2656  0.28418
Data 1  RMSE  41.5320  22.0127  6.6607  60.5790  97.9329
Data 2  MAE  43.1723  40.6828  17.5140  69.3307  86.7419
Data 2  MAPE  44.2761  42.7609  17.5084  70.5387  87.2054
Data 2  R²  1.4502  1.4106  1.0169  1.4563  1.3852
Data 2  RMSE  71.3174  65.6764  33.8508  81.3748  92.0689
Data 3  MAE  58.8463  53.7366  4.4791  64.1837  71.4553
Data 3  MAPE  57.2872  53.6797  4.4733  71.8615  79.5094
Data 3  R²  0.5000  0.5038  0.5040  0.4917  0.5108
Data 3  RMSE  63.7019  59.9761  5.3329  66.1997  71.3019

Figure 9 shows the scatter plots of the prediction results for the three data groups. The scatter of the VMD–GA–ELMAN–VMD–ARIMA model lies closest to the regression line and is relatively uniform, indicating that the prediction model is reasonable.

Figure 9

Scatter plot diagram of three groups of data.


Feasibility of parameter optimization

The learning rate controls how fast the parameters of the neural network are updated during training: a small learning rate keeps training stable but slows convergence (Yi 2015), whereas a learning rate that is too large makes network training unstable. An appropriate momentum coefficient allows the network weights to be updated quickly while helping the network avoid falling into a local minimum (Masood et al. 2016; Narayanan et al. 2016; Wang et al. 2016). In this experiment, GA is used to optimize the values of the learning rate and the momentum coefficient, and the two optimal values avoid both of the above situations.

To compare the effect of parameter optimization and to illustrate the rationality of the proposed optimization method, the ELMAN parameters are optimized both when the original data are not preprocessed and for each subsequence predicted after VMD decomposition. The experimental results show that the learning rate and the momentum coefficient of ELMAN optimized by GA outperform the unoptimized values.
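The GA-based tuning described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the `fitness` stub stands in for training an ELMAN network with a candidate (learning rate, momentum) pair and returning its validation RMSE, and the parameter ranges and GA settings are assumed values:

```python
import random

def fitness(lr, mom):
    # Hypothetical smooth surrogate with an optimum near lr=0.01, mom=0.9;
    # in practice: train ELMAN with (lr, mom) and return validation RMSE.
    return (lr - 0.01) ** 2 + (mom - 0.9) ** 2

def ga_tune(pop_size=20, generations=40, mutation=0.1, seed=0):
    rng = random.Random(seed)
    # Initial population: random (learning rate, momentum) pairs.
    pop = [(rng.uniform(1e-4, 0.1), rng.uniform(0.0, 1.0)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(*p))          # lower RMSE = fitter
        parents = pop[: pop_size // 2]               # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            lr = rng.choice([a[0], b[0]])            # uniform crossover
            mom = rng.choice([a[1], b[1]])
            if rng.random() < mutation:              # Gaussian mutation, clipped
                lr = min(0.1, max(1e-4, lr + rng.gauss(0, 0.01)))
                mom = min(1.0, max(0.0, mom + rng.gauss(0, 0.05)))
            children.append((lr, mom))
        pop = parents + children
    return min(pop, key=lambda p: fitness(*p))
```

Keeping the fitter half of each generation (truncation selection) means the best pair found so far is never lost, which matches the role GA plays here: a global search over the two hyperparameters before ELMAN is trained for prediction.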

Model prediction performance improvement rate

Through the above experimental comparison, it is concluded that the prediction performance of the VMD–GA–ELMAN–VMD–ARIMA model is the best among all the compared models. Therefore, the prediction results of the baseline ELMAN model are used directly as the standard, and the MAE, MAPE, R², and RMSE of each comparison model are measured against the corresponding ELMAN values.

Table 6 shows the percentage improvement of each comparison model over the ELMAN model. The evaluation indices are defined as follows:

\( P_{\mathrm{MAE}} = \dfrac{\mathrm{MAE}_{\mathrm{ELMAN}} - \mathrm{MAE}_{\mathrm{model}}}{\mathrm{MAE}_{\mathrm{ELMAN}}} \times 100\% \)  (32)

\( P_{\mathrm{MAPE}} = \dfrac{\mathrm{MAPE}_{\mathrm{ELMAN}} - \mathrm{MAPE}_{\mathrm{model}}}{\mathrm{MAPE}_{\mathrm{ELMAN}}} \times 100\% \)  (33)

\( P_{R^{2}} = \dfrac{R^{2}_{\mathrm{model}} - R^{2}_{\mathrm{ELMAN}}}{R^{2}_{\mathrm{ELMAN}}} \times 100\% \)  (34)

\( P_{\mathrm{RMSE}} = \dfrac{\mathrm{RMSE}_{\mathrm{ELMAN}} - \mathrm{RMSE}_{\mathrm{model}}}{\mathrm{RMSE}_{\mathrm{ELMAN}}} \times 100\% \)  (35)
From the data analysis in Table 6, the VMD–GA–ELMAN–VMD–ARIMA model shows the highest overall improvement rate relative to the ELMAN baseline. For data 1, its MAE, MAPE, R², and RMSE improvement rates are 97.8866, 97.3779, 0.28418, and 97.9329%; for data 2, they are 86.7419, 87.2054, 1.3852, and 92.0689%; and for data 3, they are 71.4553, 79.5094, 0.5108, and 71.3019%, respectively. From the perspective of decomposition, the improvement rate of the VMD–ELMAN model is significantly higher than that of the ICEEMDAN–ELMAN model, except that the R² value for data 3 is slightly lower. From the perspective of optimization, the GA–ELMAN model improves on the ELMAN model overall, although its MAPE and R² values for data 1 are lower than those of the ELMAN model. This shows that adding the optimization algorithm does not guarantee that every index improves, but across the three experiments the GA improves the prediction accuracy on the whole. These results indicate that the proposed VMD–GA–ELMAN–VMD–ARIMA combination model has great potential for improving prediction accuracy.
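Equations (32)–(35) can be checked directly against the tables. The short script below (our own illustration, not the paper's code) reproduces the first row block of Table 6 from the ELMAN and VMD–ELMAN indices of Table 3:

```python
# Improvement rates relative to the ELMAN baseline, per Equations (32)-(35).
def improvement(base, model):
    """Percentage improvement of `model` over `base` for (MAE, MAPE, R2, RMSE)."""
    p_mae = (base["MAE"] - model["MAE"]) / base["MAE"] * 100
    p_mape = (base["MAPE"] - model["MAPE"]) / base["MAPE"] * 100
    p_r2 = (model["R2"] - base["R2"]) / base["R2"] * 100   # higher R2 is better
    p_rmse = (base["RMSE"] - model["RMSE"]) / base["RMSE"] * 100
    return p_mae, p_mape, p_r2, p_rmse

# Values from Table 3 (data 1).
elman = {"MAE": 0.74666, "MAPE": 0.01106, "R2": 0.99712, "RMSE": 0.93369}
vmd_elman = {"MAE": 0.47322, "MAPE": 0.00715, "R2": 0.99955, "RMSE": 0.54591}
# improvement(elman, vmd_elman) -> approximately (36.62, 35.35, 0.24, 41.53)%,
# matching the data 1 / VMD-ELMAN column of Table 6.
```

A negative value (e.g. the −22.0615% MAPE of GA–ELMAN on data 1) means the comparison model is worse than ELMAN on that index.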

The accurate prediction of the water level has a very important impact on dams, embankments, agriculture, and navigation; it can reduce adverse impacts to some extent and improve the utilization of water. This paper proposes a combination model of VMD double processing, a genetic optimization algorithm, and ELMAN prediction, with the VMD–ARIMA model used to correct the error. In the proposed VMD–GA–ELMAN–VMD–ARIMA model, VMD decomposes the original water level series, and the GA–ELMAN model predicts each subseries, yielding a prediction series and an error series. The VMD–ARIMA model then predicts the error sequence to obtain the predicted error sequence. Finally, the predicted error sequence is added to the prediction sequence to produce the final prediction result. Three groups of water level data are used to verify the model, and its performance improvement rate is compared with that of five other models. The following conclusions are drawn:

  • (1)

    The prediction performance of the combined model is better than that of the single model. Data preprocessing with VMD can reduce the prediction error caused by noise fluctuation in the water level sequence and thus effectively improve the prediction accuracy. Across the three groups of experimental data, the prediction accuracy with VMD decomposition is higher than that with the ICEEMDAN model, which suggests that the subsequences obtained by VMD decomposition are more stable.

  • (2)

    The proposed GA optimizes the learning rate and the momentum coefficient of ELMAN, yielding more accurate predictions than the single ELMAN model. On the first set of data, the MAE and RMSE values of the GA-optimized model are lower (better) than those of ELMAN, although its MAPE and R² are worse. In the second and third groups of experiments, the GA–ELMAN model is more stable than ELMAN, and the VMD–GA–ELMAN model is more stable than VMD–ELMAN, which shows that GA parameter optimization can also improve the prediction accuracy.

  • (3)

    The VMD–ARIMA model is used to correct the error sequences predicted by the VMD–GA–ELMAN and VMD–GRU models. Three groups of experimental data show that the error correction model can effectively improve the prediction performance: the VMD–GA–ELMAN–VMD–ARIMA model achieves the highest prediction accuracy, and the VMD double processing considerably improves the prediction accuracy, which verifies that the proposed model is reasonable and feasible.
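For clarity, the double-processing strategy summarized in these conclusions reduces to the following control flow. This is a structural sketch only: the decomposition and the two predictors are passed in as placeholder functions, where in the paper they would be VMD, GA–ELMAN, and ARIMA respectively:

```python
def double_process(series, decompose, predict_sub, predict_err):
    # Step 1: decompose the water level series and predict each subseries,
    # then sum the subseries predictions to get the overall prediction.
    subs = decompose(series)
    pred = [sum(vals) for vals in zip(*(predict_sub(s) for s in subs))]
    # Step 2: the residual between observation and prediction forms the
    # error series, which is decomposed and predicted in the same way.
    error = [obs - p for obs, p in zip(series, pred)]
    err_subs = decompose(error)
    err_pred = [sum(vals) for vals in zip(*(predict_err(s) for s in err_subs))]
    # Step 3: correct the prediction by adding the predicted error.
    return [p + e for p, e in zip(pred, err_pred)]
```

With a trivial single-mode "decomposition" and a deliberately biased subseries predictor, the error-correction step recovers the bias, which is the intuition behind the VMD–ARIMA correction stage.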

Although the prediction performance of the model is satisfactory, other factors affecting the water level, such as precipitation and flow, are not considered. In future studies, we will consider other factors that affect water levels.

W.-Y.X. conceptualized the whole article, developed the methodology, wrote the original draft, and developed software and conducted investigation. Y.-L.B. conceptualized the whole article, supervised the work, conducted funding acquisition, and wrote the review and edited the article. L.-D. investigated the work and developed software. Q.-H.Y. validated the article and conducted data curation. W.-S. visualized the article. All authors read and agreed to the published version of the manuscript.

This research was funded by the NSFC (National Natural Science Foundation of China) project (grant nos. 41861047, 62066041, and 41461078). We are thankful to the reviewers whose constructive comments helped significantly to improve this work.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abrahart, R. J., Anctil, F., Coulibaly, P., Dawson, C. W., Mount, N. J., See, L. M., Shamseldin, A. Y., Solomatine, D. P., Toth, E. & Wilby, R. L. 2012 Two decades of anarchy? Emerging themes and outstanding challenges for neural network river forecasting. Progress in Physical Geography 36 (4), 480–513.

Behzad, M., Asghari, K., Eazi, M. & Palhang, M. 2009 Generalization performance of support vector machines and neural networks in runoff modeling. Expert Systems with Applications 36 (4), 7624–7629.

Blöschl, G., Bierkens, M., Chambel, A., Cudennec, C. & Zhang, Y. 2019 Twenty-three unsolved problems in hydrology (UPH) – a community perspective. Hydrological Sciences Journal 64 (10), 1141–1158.

Brunelli, U., Piazza, V., Pignato, L., Sorbello, F. & Vitabile, S. 2007 Two-days ahead prediction of daily maximum concentrations of SO2, O3, PM10, NO2, CO in the urban area of Palermo, Italy. Atmospheric Environment 41 (14), 2967–2995.

Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H. & Bengio, Y. 2014 Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv:1406.1078v3.

Colominas, M. A., Schlotthauer, G. & Torres, M. E. 2014 Improved complete ensemble EMD: a suitable tool for biomedical signal processing. Biomedical Signal Processing and Control 14, 19–29.

Cong, L. N. & Meesad, P. 2013 Meta-heuristic algorithms applied to the optimization of type-1 and type-2 TSK fuzzy logic systems for sea water level prediction. In: 2013 IEEE 6th International Workshop on Computational Intelligence and Applications (IWCIA 2013). IEEE, New York, pp. 69–74.

Dragomiretskiy, K. & Zosso, D. 2014 Variational mode decomposition. IEEE Transactions on Signal Processing 62 (3), 531–544.

Ebtehaj, I., Shauket, S., Sidek, L. M., Malik, A., Chau, K. W. & Bonakdari, H. 2021 Prediction of daily water level using new hybridized GS-GMDH and ANFIS-FCM models. Engineering Applications of Computational Fluid Mechanics 15 (1), 1343–1361.

Ekwueme, B. N. & Agunwamba, J. C. 2021 Trend analysis and variability of air temperature and rainfall in regional river basins. Civil Engineering Journal 7, 816–826.

Elman, J. L. 1990 Finding structure in time. Cognitive Science 14, 179–211.

Fotovatikhah, F., Herrera, M., Shamshirband, S., Chau, K. W., Faizollahzadeh Ardabili, S. & Piran, M. J. 2018 Survey of computational intelligence as basis to big flood management: challenges, research directions and future work. Engineering Applications of Computational Fluid Mechanics 12 (1), 411–437.

Fu, M., Fan, T., Ding, Z., Salih, S. Q. & Yaseen, Z. M. 2020 Deep learning data-intelligence model based on adjusted forecasting window scale: application in daily streamflow simulation. IEEE Access 8, 32632–32651.

Habeeb, N. J. & Talib, S. 2021 Combination of GIS with different technologies for water quality: an overview. HighTech and Innovation Journal 2 (3), 262–272.

Hestenes, M. R. 1969 Multiplier and gradient methods. Journal of Optimization Theory and Applications 4 (5), 303–320.

Hochreiter, S. & Schmidhuber, J. 1997 Long short-term memory. Neural Computation 9 (8), 1735–1780.

Kalteh, A. M. 2016 Improving forecasting accuracy of streamflow time series using least squares support vector machine coupled with data-preprocessing techniques. Water Resources Management 30 (2), 747–766.

Khan, M., Muhammad, N. S. & El-Shafie, A. 2020 Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. Journal of Hydrology 590, 125380.

Khu, S. T., Liong, S. Y., Babovic, V., Madsen, H. & Muttil, N. 2010 Genetic programming and its application in real-time runoff forecasting. JAWRA Journal of the American Water Resources Association 37 (2), 13.

Kiafar, H., Babazadeh, H., Marti, P., Kisi, O. & Shiri, J. 2016 Evaluating the generalizability of GEP models for estimating reference evapotranspiration in distant humid and arid locations. Theoretical and Applied Climatology 130 (1), 377–389.

Kisi, O., Shiri, J., Karimi, S., Shamshirband, S., Motamedi, S., Petković, D. & Hashim, R. 2015 A survey of water level fluctuation predicting in Urmia Lake using support vector machine with firefly algorithm. Applied Mathematics and Computation 270, 731–743.

Lei, L. & Wang, C. 2019 Comparison and application of three prediction models based on BP, ELMAN and PSO–SVR in Shiyang River Basin. China Rural Water and Hydropower 9, 28–32.

Liu, H., Tian, H., Liang, X. & Li, Y. 2015 New wind speed forecasting approaches using fast ensemble empirical model decomposition, genetic algorithm, mind evolutionary algorithm and artificial neural networks. Renewable Energy 83, 1066–1075.

Masood, S., Doja, M. N. & Chandra, P. 2016 Analysis of weight initialization methods for gradient descent with momentum. In: 2015 International Conference on Soft Computing Techniques and Implementations (ICSCTI). IEEE.

Nazarnia, H., Nazarnia, M., Sarmasti, H. & Wills, W. O. 2020 A systematic review of civil and environmental infrastructures for coastal adaptation to sea level rise. Civil Engineering Journal 6 (7), 1375–1399.

Niu, W. J., Feng, Z. K., Chen, Y. B., Zhang, H. R. & Cheng, C. T. 2020 Annual streamflow time series prediction using extreme learning machine based on gravitational search algorithm and variational mode decomposition. Journal of Hydrologic Engineering 25 (5), 04020008.

Pan, M., Zhou, H., Cao, J., Liu, Y. & Chen, C. 2020 Water level prediction model based on GRU and CNN. IEEE Access 8, 60090–60100.

Rao, S. G. 2000 Artificial neural networks in hydrology. I: preliminary concepts. Journal of Hydrologic Engineering 5 (2), 115–123.

Shiri, J., Shamshirband, S., Kisi, O., Karimi, S., Bateni, S. M., Nezhad, S. & Hashemi, A. 2016 Prediction of water-level in the Urmia lake using the extreme learning machine approach. Water Resources Management 30 (14), 1–13.

Taormina, R. & Chau, K. 2015 ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS. Engineering Applications of Artificial Intelligence 45, 429–440.

Tyralis, H., Papacharalampous, G. A., Burnetas, A. & Langousis, A. 2019 Hydrological post-processing using stacked generalization of quantile regression algorithms: large-scale application over CONUS. Journal of Hydrology 577, 123957.

Wang, W. C., Chau, K. W., Xu, D. M. & Chen, X. Y. 2015 Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resources Management 29 (8), 2655–2675.

Wang, L., Liu, H., Feng, C., Chen, D. & Qishuai, F. 2016 Identification of flow regimes based on adaptive learning and additional momentum BP neural network. In: Sixth International Conference on Instrumentation & Measurement. IEEE, New York, pp. 574–578.

Wang, L. L., Li, X., Ma, C. F. & Bai, Y. L. 2019 Improving the prediction accuracy of monthly streamflow using a data-driven model based on a double-processing strategy. Journal of Hydrology 573, 733–745.

Wang, L. L., Li, X., Ran, Y. H. & Guo, Y. L. 2020 Monthly runoff prediction of Zhengyixia in the Heihe river based on singular spectrum analysis–grey wolf optimizer–support vector regression hybrid model. Remote Sensing Technology and Application 35 (2), 355–364.

Wen, X. H., Feng, Q., Deo, R. C., Wu, M., Yin, Z. L., Yang, L. S. & Singh, V. P. 2019 Two-phase extreme learning machines integrated with the complete ensemble empirical mode decomposition with adaptive noise algorithm for multi-scale runoff prediction problems. Journal of Hydrology 570, 167–184.

Wu, C. L. & Chau, K. W. 2013 Prediction of rainfall time series using modular soft computing methods. Engineering Applications of Artificial Intelligence 26 (3), 997–1007.

Xi, D. J., Zhao, X. H., Zhang, Y. B., Zheng, X. Q., Zhu, X. P. & Wang, Y. 2017 Monthly runoff prediction based on empirical mode decomposition and Elman neural networks. China Rural Water and Hydropower, 112–115.

Xu, G. Y., Zhou, X. Y., Si, C. Y., Hu, W. B. & Liu, F. 2020 Prediction model of water level time series based on GRU and LightGBM feature selection. Computer Applications and Software 37 (2), 8.

Yan, K., Wang, X., Du, Y., Jin, N., Huang, H. & Zhou, H. 2018 Multi-step short-term power consumption forecasting with a hybrid deep learning strategy. Energies 11 (11), 3089.

Yao, Z., Xu, J. P., Kong, J. L. & Liu, S. B. 2018 Prediction of river water level by GA-ELMAN model. Journal of Yangtze River Scientific Research Institute 35 (9), 34–37.

Yi, X. 2015 Selection of initial weights and thresholds based on the Genetic Algorithm with the optimized Back-Propagation neural network. In: International Conference on Fuzzy Systems & Knowledge Discovery. IEEE, New York, pp. 173–177.

Zhang, W., Qu, Z., Zhang, K., Mao, W., Ma, Y. & Fan, X. 2017 A combined model based on CEEMDAN and modified flower pollination algorithm for wind speed forecasting. Energy Conversion and Management 136, 439–451.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).