This study proposes a hybrid model based on the combination of Sand Cat Swarm Optimization (SCSO), Echo State Network (ESN), Gated Recurrent Unit (GRU), Least Squares Method (LSM), and Markov Chain (MC) to improve the accuracy of annual runoff prediction. Firstly, conduct correlation analysis on multi-factor data related to runoff to determine the input of the model. Secondly, the SCSO algorithm is used to optimize the parameters of the ESN and GRU models, and the SCSO-ESN and SCSO-GRU models are established. Next, the prediction results of these two models are coupled using LSM to obtain the preliminary prediction results of the SCSO-ESN-GRU model. Finally, the initial prediction results are corrected for errors using MC to get the final prediction results. Choose Changshui Station and Lanxi Station for experiments, and evaluate the predictive performance of the model through five evaluation indicators. The results show that the combined prediction model corrected by the MC achieved the optimal prediction performance at both experimental stations. This study emphasizes that using a combination prediction model based on Markov chain correction can significantly improve the accuracy of annual runoff prediction, providing a reliable basis for predicting annual runoff in complex watersheds.

  • A new hybrid sand cat swarm optimization-echo state network-gated recurrent unit-Markov chain (SCSO-ESN-GRU-MC) model is proposed for annual runoff prediction.

  • A new SCSO algorithm is used to optimize the parameters of the prediction model.

  • Least squares method is used to couple the ESN and GRU models for annual runoff prediction.

  • MC is used to correct errors in annual runoff prediction results.

  • The proposed model achieves the best prediction results at both experimental stations.

The sustainable development of water resources has become a global focus, and how to plan and manage the uneven distribution and shortage of water resources has emerged as a pressing issue that requires attention (Wei & Xun 2019). In this severe context, runoff, as an essential component of the hydrological cycle on Earth, has attracted much attention. For the best possible management of water resources, flood prevention, disaster relief, and water environment conservation, accurate and trustworthy runoff prediction is essential (Wang et al. 2009; Xu et al. 2023). Medium- to long-term forecasting focuses on the scale of the coming months and years, which is crucial for long-term water resource planning, management, and development (Bian et al. 2023). Through in-depth research on medium- to long-term forecasting, we can have a more comprehensive understanding and prediction of the trend of watershed runoff changes, providing strong technical support for achieving sustainable utilization and protection of water resources. Therefore, precise and dependable mid- to long-term runoff forecast models are essential.

The current models for medium- to long-term runoff prediction are mostly process-driven based on physical models and data-driven based on historical statistics (Wang et al. 2013; Han & Morrison 2022a). The process-driven model is founded on a deep understanding of hydrological processes, using mathematical equations to describe various physical processes of the hydrological system, such as rainfall, snow melting, evaporation, and soil moisture movement (Partington et al. 2012). Process-driven models typically require a huge quantity of data and computational resources and are complex in parameter determination and model calibration. Therefore, long-term runoff prediction has many limitations, such as annual runoff (Yoon et al. 2011; He et al. 2023). Data-driven models are trained and learned using huge amounts of data, and their design and prediction capabilities mainly depend on input data characteristics (Min et al. 2023). The reason why data-driven models are valued in the field of runoff prediction is that they can effectively predict by analyzing and learning the relationship between input and output data, without relying on a deep understanding of the complex physical processes of hydrological systems. The ease of use and accuracy of this method make it particularly popular in predicting runoff (Parisouj et al. 2020).

Time series, machine learning, neural networks, and deep learning models are examples of data-driven models. The Autoregressive Integrated Moving Average (ARIMA) model is one of the representative time series prediction models (Liu et al. 2012). Adnan et al. (2018) used the Least Squares Support Vector Machine (LSSVM) model to predict daily and monthly runoff at three sites, resulting in better results than the M5 model tree. Although time series and machine learning models have shown outcomes in runoff prediction, neural network models have more advantages in nonlinear modeling, parameter adaptation, and flexibility than the above two models. Neural network models such as the Elman neural network (ENN) (Song et al. 2021), backpropagation (BP) neural network (Chen et al. 2023), and echo state network (ESN) (Li & Li 2023) have been well developed in predicting future trends. Qin et al. (2019) used a BP model based on preprocessing technology to predict runoff at different scales and found that the prediction results were improved. The evapotranspiration (Ev) of green peppers was indicated by Liu et al. (2020) using the ENN. When compared with the BP network, which involves iterative optimization of BP, and the ENN, which deals with long-term dependencies facing gradient vanishing or exploding, the ESN model's training procedure just involves linear regression of the output layer without the need to train the internal weights of the network, making the training process more efficient (Ribeiro et al. 2020). Meanwhile, the network structure of ESN is relatively simple, requiring only the initialization of a set of weight matrices and state vectors. This makes ESN easy to implement and understand and can be more easily deployed and applied in resource-limited situations (Wen et al. 2023). Due to its significant advantages, the ESN model has been widely used to handle time series prediction problems. Coulibaly (2010) predicted lake water levels using the ESN model, and results from four locations showed that the ESN model outperformed recurrent neural network (RNN) and Bayesian neural network. The ESN model was used by Li et al. (2023) to increase the prediction accuracy of photovoltaic power. The efficiency of neural network models steadily weakens with increasing dimensions and quantity of data; hence, deep learning models have garnered significant interest and have been continuously developed. Currently, deep learning models such as Long and Short-Term Memory (LSTM) (Tennant et al. 2020), RNN (Bhattacharjee & Tollner 2016), and gated recurrent unit (GRU) (Wei et al. 2023) have been widely used for prediction. Meng et al. (2023) used optimized RNN to predict sea ice's bending and compressive strength, and the results showed higher prediction accuracy than BP. Xu et al. (2022) used LSTM for flood forecasting in two watersheds. Using hourly flow and rainfall as inputs, Gao et al. (2020) predicted watershed runoff with high prediction accuracy using LSTM and GRU. Although both RNN and LSTM have their advantages, GRU effectively solves classic RNN's gradient vanishing and gradient explosion concerns in modeling long sequence dependencies through gating mechanisms (Gao et al. 2022; Liu et al. 2023). It can better capture long-term dependencies in sequences by selectively updating and forgetting information. At the same time, GRU has a lower computational complexity than LSTM, and its structure is simpler with fewer parameters. Therefore, in the case of limited computing resources, using GRU can more efficiently train and predict (Zhou et al. 2023; Fang et al. 2024). In summary, this paper selects ESN model and GRU model with significant advantages as the annual runoff prediction model with multiple factor inputs.

However, due to the significant impact of hyperparameters in ESN and GRU models, on the effectiveness of runoff prediction, traditional parameter determination methods are gradually phased out due to their manpower, material resources, and intense subjectivity (Wang et al. 2023). Swarm intelligence optimization algorithms, which feature benefits including great adaptability to nonlinear problems and global search ability, have been utilized by several researchers for parameter optimization as technology has advanced (Niu et al. 2019). The artificial neural network (ANN) parameters were optimized by Jahandideh-Tehrani et al. (2021) using the genetic algorithm (GA) method and particle swarm optimization (PSO). Through daily rainfall and runoff prediction, they were able to confirm that the ANN-PSO model outperforms the ANN-GA model by a large margin. In comparison to independent Extreme Learning Machine (ELM), a single ELM-PSO, and ELM-GWO, Adnan et al. (2021) achieved better accurate findings when using ELM based on the Grey Wolf Optimizer (GWO) and PSO to estimate monthly runoff in the Mangla Basin. Qiao et al. (2023) used an improved Aquila optimizer to optimize a time convolutional network for rainfall–runoff simulation and multistep runoff prediction and achieved good results. Inspired by the hunting behavior of sand cats in the desert, the sand cat swarm optimization algorithm (SCSO) is a new optimization algorithm (Seyyedabbasi & Kiani 2023). The optimization algorithm has strong global search ability, fast convergence speed, a simple structure, and can adaptively adjust the parameters in the algorithm according to the characteristics of the problem. The computation process of the simultaneous SCSO algorithm can handle multiple solutions simultaneously. This makes it feasible to use parallel computing to accelerate the algorithm's execution in the large-scale predictive model's parameter optimization challenge (Jia et al. 2022; Tavakol Aghaei et al. 2023). To solve the critical parameter optimization problem of ESN and GRU models, this paper adopts the SCSO algorithm for processing, aiming to enhance the predictive performance of ESN and GRU models.

Although a single optimized prediction model generated outstanding prediction results, the prediction accuracy of numerous complex prediction data is not ideal due to the structural limitations of each prediction model. Therefore, some scholars attempt integrating models with complementary performance to achieve better prediction results. Yao et al. (2023) optimized the parameters of the Convolutional Neural Networks (CNN)-LSTM and GRU models using an improved Sparrow Search Algorithm (SSA) technique. Using an adaptive weighting module, they combined the two models' predictions. Results indicate that the combination model with several prediction models outperforms other single prediction models. The observed displacement of the Three Gorges slope was projected by Duan (2023) using the Grey–Markov, GA-BP, and ARIMA models. The anticipated outcomes of the three models were then blended using the reciprocal variance weighting approach. The outcomes demonstrated that the weighted combination prediction model produced more accurate predictions. It is evident from the aforementioned researchers' studies that high prediction accuracy may be attained by integrating prediction models utilizing weighted combination approaches. Thus, to enhance prediction accuracy by balancing the benefits of both models, this work integrates the prediction outcomes of the GRU and ESN models based on SCSO using the least squares method (LSM).

There are errors between the weighted combination model's predicted outcomes and the original data, but these errors frequently hold a wealth of information that is overlooked. This ignores the potential critical features in the error, affecting the prediction results' accuracy (Liu et al. 2021; Yang et al. 2023). Many research studies have demonstrated that developing error correction models can improve the prediction accuracy. Wang et al. (2024) used statistical methods to extract residual linear features from wind power prediction error sequences for adequate correction. Wu et al. (2023) predicted and corrected the error between the forecasted and actual values of the LSTM model linked with preprocessing technologies using an improved coupling model of CEEMDAN (ICEEMDAN) and GRU. The research showed that the crude oil future prices converted by the error correction model were closer to the real values. At three hydrological stations, Xu et al. (2024) employed local error correction to rectify the prediction error of the ENN optimized by the sparrow search algorithm. The outcomes demonstrated the benefits of the error correction model. Markov chains (MCs) only rely on the current state for future states and are independent of past conditions. This ‘memorylessness’ makes them suitable for problems with high-frequency random fluctuations such as error sequences (Gao et al. 2021; Wei et al. 2022). At the same time, the finite space state of MCs and the modeling of the evolution of system states as probability transitions between states simplify the design and adjustment process of error correction models and reduce the computational burden of error correction (Zhang et al. 2021). Given the advantages of error correction, this work makes full use of the MC model to accurately correct the error sequence in runoff prediction, hence enhancing the precision and dependability of the model's forecast.

Therefore, this study aims to develop an SCSO-ESN-GRU-MC hybrid model based on a new optimization algorithm, multiple prediction model combinations, and error correction methods. In addition, this model is applied to the Changshui Station in the Luohe River Basin and Lanxi Station in the Hulan River Basin's annual runoff forecast.

The second section's goal is to give a thorough overview of each model's underlying theories and coupling techniques, as well as the metrics used to assess the models' performance and the precise steps involved in building the suggested model.

Echo state network

A novel kind of RNN called an ESN is extensively employed in domains like natural language processing, robot control, and pattern recognition. Its main benefit lies in its ability to process time series prediction. An input layer, a hidden layer (a reserve pool), and an output layer make up the ESN. The hidden layer, which consists of numerous sparsely connected neurons with short-term memory functions, is the central component of the ESN. Its structure is shown in Figure 1. If an ESN has K input units, N hidden layer neurons, and L output units, then their values at time k are
(1)
Figure 1

Schematic diagram of the ESN structure.

Figure 1

Schematic diagram of the ESN structure.

Close modal
The internal units are linked to the training samples, and the training data are collected, enabling the reserve pool to remember the corresponding information. The goal is to minimize the average error through the process of a linear regression between the target output and the reserve pool state vector. The learning steps of ESN are as follows:
(2)
(3)

In the equation, u, x, and v are the input variables, output variables, and state variables of the ESN, respectively; are the input weight matrix, internal weight matrix, and output weight matrix, respectively, while is the interior to output connection weight matrix; f represents the internal activation function, usually using the Sigmoid function or tanh function; is the output layer activation function, generally using a linear process.

The performance of an ESN is determined by the spectrum radius (SR), input scale (IS), reserve pool size (N), and sparse degree (SD) parameters of the reserve pool. The specific definitions of these parameters can be found in the papers of Ribeiro et al. (2020) and Wen et al. (2023).

Gated recurrent unit

The GRU and LSTM are variant networks improved by RNN, solving the gradient vanishing problem caused by long dependencies and BP in sequence data. Compared to the intricate structural form of LSTM, GRU drastically reduces the computational complexity of the model while simultaneously saving time and computational resources. This is achieved by substituting update gates for the forgetting and input gates in LSTM. Figure 2 depicts the fundamental architecture of GRU.
Figure 2

Schematic diagram of the GRU structure.

Figure 2

Schematic diagram of the GRU structure.

Close modal
GRU solves the long-term dependency problem of traditional RNN through two gating mechanisms: reset gate and update gate. The reset gate output vector and the update gate output vector are represented as follows:
(4)
(5)

Among them, represents the sigmoid function with output values limited to [0,1]; and represent the reset gate weight matrix and update gate weight matrix for linear transformation of the input vector in the tth time step, respectively; and represent the reset gate weight matrix and update gate weight matrix for linear change of the hidden state in the previous time step, respectively. The reset gate controls the degree of neglect of the prior sequence of state information. The smaller the value, the more information is ignored. The update gate holds the degree to which the previous line of state information is transmitted to the current state, and the larger its value, the more information is input.

The current hidden state in the unit is shown in Equation (6).
(6)
where W and U represent the update gate weight matrices for the input vector and hidden state , respectively, and represents the Hadamard product.
The network unit's hidden state output will retain the current unit's information and pass it on to the next team. The final hidden state output is as follows:
(7)

Sand cat swarm optimization

Seyyedabbasi & Kiani (2023) proposed the SCSO algorithm, a population optimization algorithm that uses a unique, straightforward, and effective methodology. The hunting technique of a pack of sand cats, which are incredibly skilled at mining prey and can detect low frequencies below 2,000 Hz, is modeled by this algorithm. The SCSO algorithm splits the feeding behavior of sand cats into two phases based on these two characteristics: global search (exploration) and attack prey (development). It maintains a balance between exploration and development through adaptive mechanisms. The following describes the mathematical description of SCSO:

  • (1) Global search phase. Each sand cat has a workable solution X, and we set the number of sand cat populations to N. Next, we choose the best individual by assessing each sand cat's fitness for the goal function. Based on where the ideal individual is located, other people move. The equation for updating the position of the kth sand cat group in the i + 1 generation is as follows:
    (8)

Among them is the i-generation optimal individual; r is the sensitivity range of each sand cat , and rand is a random number between 0 and 1; linearly decreases from 2 to 0 during the iteration process to ensure that the feasible solution gradually approaches the prey, , represents the auditory characteristics of the sand cat (), and T represents the maximum number of iterations for the sand cat population.

By establishing the balance parameter R for the change from exploration to development, SCSO keeps a balance between the two. The mathematical description is as follows:
(9)
where is the exploration phase and is the development phase.
  • (2) Attack the prey stage. At this stage, to determine a random angle at which each sand cat should update its position to approach its prey, SCSO employs a roulette selection algorithm. The updated equation is as follows:
    (10)
    where d is the random distance and θ is a random angle to approach the hunting position from any angle.

Global search and attack of prey are determined by and R. SCSO can seamlessly switch between two stages. When , the sand cat herd's next position can be anywhere between its current position and the hunting position, allowing for a global search to be done. When , the sand cat group enters the attack stage and conducts local searches, and the algorithm has global and local search capabilities.

Least squares method

A popular optimization method in mathematics and statistics is LSM, which reduces model errors and finds the best-fitting model by minimizing the sum of squares of residuals. The process of using LSM to determine the weights of the composite model in this paper is as follows:

Objective equation:
(11)
Constraint condition:
(12)

In the equation, i represents the type of prediction model, k represents the number of prediction models, and represents the measured value.

Equation (13) can be used to get the predicted values of the combined model.
(13)
where Y represents the predicted value of the combination model; and represent the weights of the SCSO-ESN model and the SCSO-GRU model, respectively; and describe the predicted values of the SCSO-ESN model and the SCSO-GRU model, respectively.

MC process

MC is a stochastic dynamic process with the characteristic that future states do not depend on past conditions. It can compute the future form at a specific time based on the initial known state through the state transition probability matrix. Large random fluctuation issues can be effectively handled by it.

Finding the research object's state partition and transition probability is essential to understanding the MC correction model. The following are the precise steps for correction:

  • (1) Judgment status interval.

Based on the absolute value of the relative error between the measured and predicted values of the training set samples, divide the states of k sample space.

  • (2) The state transition probability matrix needs to be solved.

The absolute relative error values of the model's prediction results are counted to create the Markov state transition table. Equation (14) displays the computation equation for the Markov state transition probability matrix.
(14)
(15)

Among them, is the probability of state transitioning to in one step, is the number of samples in the state transition table, and is the frequency at which transfers to in one step.

  • (3) Solve residual correction.

According to steps (1) and (2), the prediction error of the state is obtained, and the correction error of the state is received according to the following equation:
(16)

is the correction error of state ; and are the upper and lower bounds of the interval of state , respectively.

The predicted value error is located in the state with the maximum sum of possibilities, which is determined by adding the probabilities of each state in the state transition probability matrix obtained. The MC correction equation can be expressed as
(17)

represents the prediction results after MC correction, and represents the initial prediction results of the model.

Cross-validation

Cross-validation is a technique for evaluating the generalization ability of machine learning models, which reduces overfitting and assists in effective model selection and parameter tuning by dividing the training and validation sets (Jiang & Chen 2016; Rohani et al. 2018). There are various types of cross-validation, and in this study, we used the fivefold cross-validation to divide the training and validation sets and select model parameters.

The fivefold cross-validation is a technique of dividing the original sample randomly into five subsamples. Then, a single subsample is regarded as the validation data for testing the model, and the remaining four subsamples are used as training data. This process is repeated five times, and each of the five subsamples is used exactly once as the validation data. The five results from the folds can then be averaged to produce a single estimation. The specific process of fivefold cross-validation is shown in Figure 3. An example of estimating parameter α with fivefold cross-validation is as follows:
  • Step 1: Divide the data into five equal parts.

  • Step 2: For each i = 1, 2, 3, 4, 5, fit the model with parameter α to the other four parts, giving and computing its root mean square error in predicting the fifth part.
    (18)
Figure 3

Five fold cross-validation flowchart.

Figure 3

Five fold cross-validation flowchart.

Close modal
Here, h represents yi numerical values.
This gives the cross-validation error.
(19)
  • Step 3: Do this for many values of α and choose the value of α that makes the smallest.

Model performance evaluation indicators

Five evaluation indicators are chosen to assess the accuracy and performance of the model presented in this paper: the Nash–Sutcliffe efficiency coefficient (NSEC), correlation coefficient (R), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The NSEC value shows the quality of the model simulation results, and higher NSEC values indicate the higher simulation quality of the model (Han & Morrison 2022b). Higher predictive accuracy is indicated by a model with an R-value closer to 1, which measures the linear correlation between the model's actual value and its predicted value. The similarity between the predicted results of the model and actual values is displayed using MAE, with smaller MAE values indicating that the model is closer to the actual data. The relative size of the error is indicated by the MAPE, which is the ratio of error to actual value. A smaller MAPE value indicates the model's prediction is relatively accurate (Lee 2022). RMSE measures the degree of deviation between a model's predicted and actual values. The model's prediction error decreases, and the fitting effect improves as the closer the value is to 0. The closer the NSEC and R-values are to 1, the higher the model's prediction accuracy; conversely, the closer the RMSE, MAPE, and MAE values are to 0, the smaller the model's prediction error and the better the fitting effect. The mathematical equations for the five evaluation indicators are shown below:
(20)
(21)
(22)
(23)
(24)

In the equations, is the measured value; is the predicted value, favg is the average of the measured values, and is the average of the predicted values.

The modeling process of the proposed model

To address the model's poor prediction accuracy and instability, this paper suggests the SCSO-ESN-GRU-MC coupling model, the basic workings of which are depicted in Figure 4. The specific steps are as follows:
  • Step 1: Perform Pearson correlation analysis on the collected multifactor data and select factors with high correlation with runoff factors as input for the prediction model.

  • Step 2: Normalize the screening input factors and divide each into training and prediction sets in a 9:1 ratio.

  • Step 3: Establish parameters like the number of SCSO populations and maximum iteration times, and choose the four critical parameters of ESN and the number of GRU hidden layer nodes as the optimization objects for SCSO for population initialization.

  • Step 4: Using the fivefold cross-validation method, the training set is divided into five parts, with each part taking turns as the validation set and the remaining four parts as the training set. The average target value is obtained by cycling five times. Based on the average target value, adjust the position of the leader in the SCSO algorithm and update the position parameters for continuous optimization. Repeat the above process until the SCSO algorithm reaches the optimal target value, and the ESN and GRU models also determine the optimal parameters based on the results of the validation set.

  • Step 5: Next, use SCSO-ESN and SCSO-GRU to model and predict the multifactor data after assigning the determined optimal parameter values to ESN and GRU.

  • Step 6: Fit the predicted results of SCSO-ESN and SCSO-GRU using the LSM to obtain weights λSCSO-ESN and λSCSO-GRU. The initial prediction results are obtained by weighting and summing the SCSO-ESN and SCSO-GRU prediction results based on their weights.

  • Step 7: The relative error sequence can be obtained by computing the error sequence, which can be obtained by subtracting the original runoff sequence from the initial prediction sequence.

  • Step 8: Based on the relative error sequence, perform MC correction on the initial prediction result to obtain the final prediction result.

Figure 4

SCSO-ESN-GRU-MC model construction process.

Figure 4

SCSO-ESN-GRU-MC model construction process.

Close modal

The pertinent details regarding Changshui and Lanxi stations are introduced in this section, along with the steps involved in applying the SCSO-ESN-GRU-MC model to the two experimental stations.

Overview of the research area

This paper tests the suggested model using annual runoff datasets from China's Changshui and Lanxi stations to confirm its performance. The Luohe River rises in Muchagou in Lantian County, Shaanxi Province, at the southeast base of Mount Hua in the East Qinling Mountains, and flows as a significant tributary of the Yellow River from Sanmenxia to Huayuankou. It has a total length of 447 km and a drainage area of 18,881 km2, and it empties into the Yellow River north of Luokou in Gongyi City. The Luohe River Basin belongs to a warm temperate monsoon climate, with an average annual temperature of 12.1–14.5°C and an annual precipitation of 500–900 m. The Changshui station is at 111°26′E and 34°19′N in Liupo Village, Changshui Town, Luoning County, Henan Province. With an average annual rainfall of 530 mm and an average annual runoff of 8.17 billion m3, its watershed area is 874 km2. Changshui station has long been responsible for providing real-time water information and collecting hydrological data about the Luohe River to flood control command departments. It is a basic hydrological station in China. Located in the center of Heilongjiang Province, the Hulan River is a first-class tributary on the left bank of the Songhua River. It originates from the Taiping Mountains of the Xiaoxing'an Mountains in Tieli City. It is formed by the confluence of the Kayin and Numin rivers, ultimately flowing into the Songhua River. Its drainage area is 35,683 km2, and its total length is 523 km. With an average annual temperature of 4.0 °C, the Hulan River basin is located in the northern temperate zone's continental monsoon climate. The average annual runoff of the entire basin is 4.129 billion m3, with an average annual runoff depth of 133.3 mm and an average annual estuary flow of 93.75 m3/s. The precipitation season is mainly concentrated in July, August, and September, with 3 months accounting for 70% of the annual precipitation. Lanxi station is located 103 km downstream of the Hulan River in Suihua City, Heilongjiang Province, with a geographical location of 126°20′E and 46°15′N. It controls a drainage area of 2,770 km2. The geographical locations of Changshui station and Lanxi station are shown in Figure 5. Lanxi station is located in a semi-arid region, whereas Changshui station is located in a semi-humid region. The two hydrological stations have significant differences in terrain and landforms, and the areas of Changshui station and Lanxi station are affected by the monsoon climate, with significant differences in precipitation and drastic changes in runoff. They are typical representative hydrological stations. To confirm the effectiveness and generalizability of the suggested model, the runoff and associated meteorological data from the two hydrological stations mentioned above are chosen.
Figure 5

Geographical location information of Changshui station and Lanxi station.

Figure 5

Geographical location information of Changshui station and Lanxi station.

Close modal

This study collected annual runoff data and 19 related factors affecting runoff at the Changshui station from 1960 to 2016 and the Lanxi station from 1956 to 2014. The 19 collected factors are extreme wind speed (EWS), minimum atmospheric pressure (APmin), minimum air temperature (ATmin), maximum atmospheric pressure (APmax), maximum air temperature (ATmax), average atmospheric pressure (AAP), average 2-min wind speed evapotranspiration (Ev), (A2WS), average air temperature (AAT), average water vapor pressure (AWVP), average relative humidity (ARH), average minimum air temperature (AATmin), average maximum air temperature (AATmax), daily precision ≥0.1 mm days (DP0.1), sunshine duration (SD), maximum wind speed (MWS), maximum daily precision (MDP), precision (P), and runoff (R). Table 1 displays pertinent data regarding Changshui station's and Lanxi station's annual runoff data. To confirm the effectiveness of the suggested model, the annual runoff data was chosen from Changshui station for 57 years and Lanxi station for 56 years. The maximum annual runoff of Changshui Station and Lanxi Station are 37.6160 × 108 and 274.3226 mm, respectively.

Table 1

Information on the annual runoff of Changshui station and Lanxi station

Hydrologic stationMaxMinMeanStdTraining sampleTesting sample
Changshui station 37.6160 × 108 m3 1.3200 × 108 m3 8.2172 × 108 m3 6.2005 46 11 
Lanxi station 274.3226 mm 17.4719 mm 120.4795 mm 66.9498 45 11 
Hydrologic stationMaxMinMeanStdTraining sampleTesting sample
Changshui station 37.6160 × 108 m3 1.3200 × 108 m3 8.2172 × 108 m3 6.2005 46 11 
Lanxi station 274.3226 mm 17.4719 mm 120.4795 mm 66.9498 45 11 

Correlation analysis of model inputs

Due to significant differences in the correlation between the 19 factors obtained and annual runoff, using factors with low correlation as inputs for prediction can lead to poor prediction performance. Therefore, conducting a correlation analysis between 19 factors and runoff is necessary to select the factors with a high degree of correlation with runoff factors as inputs to the model. To quantify the linear correlation between annual runoff and 19 factors, this paper chooses the Pearson correlation coefficient and further establishes the input variables for the model. Equation (19) displays the equation for computing the Pearson correlation coefficient. The Pearson correlation coefficient has both positive and negative values, and the values are between [−1, 1], indicating a positive correlation (R > 0) and a negative correlation (R < 0). To determine the strength of the correlation, one typically uses the correlation coefficient R's absolute value. The larger the |R|, the closer it is to 1, indicating that the corresponding potential impact factors correlate better with the predicted object. On the contrary, the smaller the |R|, the closer it is to 0, indicating that the correlation between the corresponding potential impact factors and the predicted object is poorer. This paper selects factors with and believes that these factors correlate significantly with runoff factors. The correlation analysis results between the factors of Changshui station and Lanxi station are shown in Figures 6 and 7.
Figure 6

Heat map for correlation analysis of 19 factors at the Changshui station.

Figure 6

Heat map for correlation analysis of 19 factors at the Changshui station.

Close modal
Figure 7

Heat map for correlation analysis of 19 factors at the Lanxi station.

Figure 7

Heat map for correlation analysis of 19 factors at the Lanxi station.

Close modal

From Figure 6, it can be seen that among the 19 factors at the Changshui station, the factors include APmin, AAP, A2WS, AAT, ARH, AATmin, AATmax, DP0.1, MWS, and P, with 10 factors. As a result, the Changshui station's model validation used these 10 factors as inputs. The content shown in Figure 7 shows that among the 19 factors at the Lanxi station, the factors include ARH, AATmax, DP0.1, MSP, SD, P, and Ev. Due to the significant impact of AAP on precipitation and evaporation processes and patterns, which in turn have a significant impact on runoff depth, and the |R| value of AAP is 0.29, which is very close to 0.3, we have listed AAP as one of the factors affecting runoff at the Lanxi station. Therefore, the input factors selected for model validation at the Lanxi station are the above eight factors. Detailed information on the 10 factors inputted at the Changshui station and the eight factors inputted at the Lanxi station is shown in Appendix Table A1.

Error correction process

The annual runoff series is very volatile, so the predictions made by a single prediction model are not accurate enough. As a result, to increase prediction accuracy and reduce uncertainty in the annual runoff prediction model, the MC is employed for error correction.

First, the initial prediction results of the SCSO-ESN-GRU model's training set are compared with the original annual runoff to obtain an error sequence. Second, compute the relative error absolute value of the error sequence. The probability distribution and scatter distribution of the absolute value of the relative error are then used to divide the error state intervals. The probability distribution diagrams and interval scatter distribution diagrams of Changshui and Lanxi stations are shown in Figures 8 and 9. According to Figure 8(a), it can be seen that the MC state intervals of the training period prediction results of the SCSO-ESN-GRU model at the Changshui station are divided into four states: [0, 10%), [10%, 50%), [50%, 100%), and [100%, 135%). From Figure 9(a), it can be seen that the MC state intervals of the training period prediction results of the SCSO-ESN-GRU model at the Lanxi station are divided into [0, 5%), [5%, 35%), [35%, 100%), and [100%, 340%) states. Then, the absolute values of relative errors are counted based on the state intervals divided by two test stations. The Markov state transition probability matrix is obtained through Equations (14) and (15) as follows:
Figure 8

Probability distribution diagram and interval scatter distribution diagram of the Changshui station.

Figure 8

Probability distribution diagram and interval scatter distribution diagram of the Changshui station.

Close modal
Figure 9

Probability distribution diagram and interval scatter distribution diagram of the Lanxi station.

Figure 9

Probability distribution diagram and interval scatter distribution diagram of the Lanxi station.

Close modal

The above matrices and represent the state transition probability matrices of the SCSO-ESN-GRU model at the Changshui and Lanxi stations. The sum of probabilities of the predicted model's states in each sample space can be computed using the and matrices. The state with the maximum sum of state probabilities is where the error of the model's runoff prediction value lies. According to and matrices, the error state intervals of the SCSO-ESN-GRU model at Changshui and Lanxi stations are [10%, 50%) and [5%, 35%), respectively. Ultimately, the final prediction results following MC correction are obtained by applying Equations (16) and (17) to the initial annual runoff prediction results of the SCSO-ESN-GRU model.

Model development

The experimental process of all models in this paper is conducted on Windows 10 (64-bit), Intel (R) Core (TM) i7-8700 [email protected] on GHz devices using MATLAB 2022b.

This paper selects a total of 14 models, including SCSO-ESN-MC (M 2), SCSO-GRU-MC (M 3), SCSO-ESN-GRU (M 4), SCSO-GRU (M 5), GWO-GRU (M 6), whale optimization algorithm (WOA)-GRU (M 7), SCSO-ESN (M 8), GWO-ESN (M 9), WOA-ESN (M 10), GRU (M 11), ESN (M 12), BP (M 13), LSSVM (M 14), and ELM (M 15), to use annual runoff data from Changshui and Lanxi stations with the SCSO-ESN-GRU-MC (M 1) model. Comparative experiments are carried out to illustrate the suggested model's superior performance. The model and 14 comparative models are referred to as M 1–15 to make representation in the text easier because of the abundance of benchmarking models used in this paper.

The comparative model's prediction procedure looks like this:

  • (1) Single prediction model

The original annual runoff serves as the input object for the ESN model, and the model's input variables are chosen by experimental findings from two experimental stations in Section 3.2. To get the final prediction results, the training set and prediction set are split. GRU, BP, LSSVM, and ELM's prediction process is similar to ESN's. To demonstrate the benefits of combining multiple prediction models and the drawbacks of a single prediction model in annual runoff prediction, five single prediction models, ESN, GRU, BP, LSSVM, and ELM, are chosen to compare with this model.

  • (2) Optimized prediction model

For the SCSO-ESN and SCSO-GRU models, the input object is the original annual runoff. The model's input variables refer to Section 3.2, and the training and prediction sets are divided. The optimal parameter values are searched using the SCSO optimization algorithm and assigned to the ESN network and GRU. Subsequently, the ESN network and GRU output the prediction results, respectively. The prediction processes of GWO-GRU, WOA-GRU, GWO-ESN, and WOA-ESN are similar to those of SCSO-optimized ESN and GRU, except for replacing algorithms with different performances. By using the SCSO-ESN, SCSO-GRU, GWO-GRU, WOA-GRU, GWO-ESN, and WOA-ESN models, the optimization performance benefits of the SCSO algorithm are highlighted, along with the speed and accuracy with which optimization algorithms screen prediction model parameters. Except for the absence of an MC model for error correction, the steps of the SCSO-ESN-GRU model are comparable to the model put forth in this paper. This model is selected to highlight the benefits of combining least squares prediction models with other prediction models, as well as the improvement effect of this coupled on prediction results accuracy.

  • (3) Error correction model

The following are the precise steps of the SCSO-ESN-MC and SCSO-GRU-MC models: To get the first prediction results, feed the initial annual runoff into the SCSO-ESN and SCSO-GRU models. Divide the relative error absolute value of the initial prediction result into state intervals, obtain the state interval according to Figures 8(b), 8(c), 9(b), and 9(c), and then obtain the state transition probability matrix. Finally, obtain the final prediction result using Equations (16) and (17). The state intervals and probability transition matrix results of the SCSO-ESN-MC and SCSO-GRU-MC models for MC correction at two stations are shown in Appendix Table A2. The above two models are selected to demonstrate the effectiveness of MC correction methods in different models and datasets. To fairly compare the prediction processes of 15 models at two stations, the parameter settings of all models are presented in Appendix Table A2.

This section includes the evaluation results of five evaluation indicators, the discussion and analysis of all models, and the prediction results of all models at two experimental stations.

Analysis of prediction results

Figures 10 and 11 display the prediction outcomes of the SCSO-ESN-GRU-MC model at two experimental stations and the predictions of 14 benchmarking models. Since this paper compares up to 14 models to more intuitively present the difference between the anticipated outcomes of different models and the actual values, this paper splits 15 models into four groups and displays them in Figures 10 and 11(a)–11(d), respectively. Because the prediction results during the testing period indicate the model's predictive performance, this paper highlights the results during the testing period. As shown in Figures 10 and 11, (1) the initial annual runoff data from the two experimental stations is unevenly distributed and highly volatile, indicating that it is typically nonlinear and nonstationary data. The Changshui station data present a significant challenge to the prediction model's predictive performance due to the significant difference between the peak and valley values. These 15 prediction models can capture annual runoff information, as evidenced by the fact that all of them can approximate the trend of the original annual runoff in their prediction results. (2) The prediction results of a single prediction model are poor. The trend of the projected outcomes from each of the five individual prediction models is in line with the initial annual runoff, as seen in Figure 11(d). Still, the predicted values of many points significantly deviate from the original values. However, as Figure 10(d) demonstrates, the five single prediction models' predicted outcomes for the Changshui station are not perfect, suggesting that there are still a lot of issues with the single prediction model's ability to handle the relationship between extremely volatile data. (3) When various optimization algorithms are combined with different models, the prediction results vary significantly. Figure 10(c) shows that the optimized prediction model's prediction results outperform the single prediction model's results in Figure 10(d), suggesting that the optimization algorithm can uncover and enhance the prediction model's benefits. Compared to the GWO-ESN and WOA-ESN models, the prediction results of the SCSO-ESN model are superior. The SCSO algorithm's strong global search ability in optimizing the prediction model parameters is reflected in the superior prediction results of the SCSO-GRU model compared to the GWO-GRU and WOA-GRU models. (4) In predicting annual runoff, the coupled prediction model has produced good prediction results. The SCSO-ESN-GRU model outperforms the SCSO-ESN and SCSO-GRU models in terms of peak and valley value prediction, as seen in Figures 10(b) and 11(b). This indicates that the SCSO-ESN-GRU model leverages the advantages of both ESN and GRU models, reducing the shortcomings of a single prediction model. (5) Models that employ the MC error correction method yield prediction results that are more in line with the original data; the model that produced the best predictions is the SCSO-ESN-GRU-MC model. The prediction results of the model with MC error correction are substantially better than those without error correction, as shown in Figures 10(a) and 11(a). The model with error correction is also more accurate in peak value prediction. This indicates that the MC correction method can significantly reduce errors and improve prediction accuracy.
Figure 10

Prediction results of model 15 at the Changshui station.

Figure 10

Prediction results of model 15 at the Changshui station.

Close modal
Figure 11

Prediction results of model 15 at the Lanxi station.

Figure 11

Prediction results of model 15 at the Lanxi station.

Close modal

At the same time, this article uses the Wilcoxon signed rank test to test the significance of the model's prediction results with the original runoff to more clearly demonstrate the predictive performance of the model. Table 2 presents the Wilcoxon test results for different models and original runoff, all of which are based on a significance level of 95% (α = 0.05). The H column displays the result of rejecting the null hypothesis, where 1 represents rejection (i.e., believing there is a significant difference between the two sets of data) and 0 represents inability to reject (i.e., there is no significant difference between the two sets of data). From Table 2, it can be seen that the SCSO-ESN-GRU-MC model is 0 in the H column of both Lanxi Station and Changshui Station, indicating that there is no significant difference between the predicted results and the original runoff. This also indicates that the SCSO-ESN-GRU-MC model has a good fitting effect and strong generalization ability.

Table 2

Wilcoxon test results of different models and original runoff from two experimental stations

ModelLanxi station
Changshui station
Train period
Test period
Train period
Test period
p-valueHp-valueHp-valueHp-valueH
ELM 0.8083 0.3203 0.3732 0.3652 
LSSVM 0.7053 0.2061 0.3674 0.0137 
BP 0.0008 0.8984 0.0013 0.1230 
ESN 0.1005 0.5771 0.0049 0.0537 
GRU 0.9865 0.3203 0.4542 0.0137 
WOA-ESN 0.1128 0.6377 0.0040 0.0420 
GWO-ESN 0.0852 0.5771 0.0044 0.1016 
SCSO-ESN 0.0515 0.4648 0.0000 0.3203 
WOA-GRU 0.8611 0.1230 0.8398 0.7646 
GWO-GRU 0.9865 0.1475 0.7722 0.7002 
SCSO- GRU 0.9415 0.2402 0.8313 1.0000 
SCSO-ESN-GRU 0.3233 0.3203 0.0384 0.7646 
SCSO- GRU-MC 0.3695 0.7002 0.3502 0.6377 
SCSO-ESN-MC 0.0081 0.3203 0.0007 0.3203 
SCSO-ESN-GRU-MC 0.0736 0.6377 0.1603 0.2402 
ModelLanxi station
Changshui station
Train period
Test period
Train period
Test period
p-valueHp-valueHp-valueHp-valueH
ELM 0.8083 0.3203 0.3732 0.3652 
LSSVM 0.7053 0.2061 0.3674 0.0137 
BP 0.0008 0.8984 0.0013 0.1230 
ESN 0.1005 0.5771 0.0049 0.0537 
GRU 0.9865 0.3203 0.4542 0.0137 
WOA-ESN 0.1128 0.6377 0.0040 0.0420 
GWO-ESN 0.0852 0.5771 0.0044 0.1016 
SCSO-ESN 0.0515 0.4648 0.0000 0.3203 
WOA-GRU 0.8611 0.1230 0.8398 0.7646 
GWO-GRU 0.9865 0.1475 0.7722 0.7002 
SCSO- GRU 0.9415 0.2402 0.8313 1.0000 
SCSO-ESN-GRU 0.3233 0.3203 0.0384 0.7646 
SCSO- GRU-MC 0.3695 0.7002 0.3502 0.6377 
SCSO-ESN-MC 0.0081 0.3203 0.0007 0.3203 
SCSO-ESN-GRU-MC 0.0736 0.6377 0.1603 0.2402 

Analysis of evaluation index results

This paper evaluates the prediction accuracy of 15 models using five evaluation indicators: MAE, RMSE, MAPE, NSEC, and R. Appendix Table A3 displays the specific values. The predictive performance of the model can be better explained and illustrated by these five evaluation indicators. Appendix Table A3 analysis reveals the following:

  • (1) Compared to other coupled models, the prediction accuracy of a single model is lower. The test set of the Lanxi station in Appendix Table A3 shows that the NSEC and R-values of the five single prediction models are lower than those of other coupled models, while the MAE, RMSE, and MAPE values are relatively high. This suggests that in time series with high volatility, the prediction outcomes of a single prediction model have large errors and are challenging to evaluate. The five single prediction models' NSEC values are all negative, with R-values not going above 0.5, according to the Changshui station test set. This suggests that the five individual prediction models' prediction results have a low prediction effect and a weak correlation with the original data. Simultaneously, it illustrates how flawed the single prediction model is in capturing the volatile annual runoff data information. The ESN and GRU models have the potential to become more integrated models, as evidenced by the fact that their prediction results at two experimental stations outperformed those of the BP, LSSVM, and ELM models.

  • (2) Compared to a single prediction model, the optimized prediction model has a higher prediction accuracy. For instance, the NSEC values of the SCSO-GRU, GWO-GRU, and WOA-GRU models all exceeded 0, and the R-values all exceeded 0.5 during the Changshui station testing phase. Compared with the GRU model, the RMSE value decreases by 32.95, 27.40, and 23.61%, respectively. Compared with the ESN model, the MAPE value reduces by 53.30, 42.49, and 36.02%, respectively. Compared with the BP model, the MAE value decreases by 43.59, 34.58, and 29.93%, respectively. Compared with the LSSVM model, the NSEC value increases by 148.10, 132.61, and 121.29%, respectively. Compared with the ELM model, the R-value increase by 24.32, 9.66, and 10.89%, respectively. At the Lanxi station, for the GRU, ESN, BP, LSSVM, and ELM models, the RMSE values of the SCSO-ESN model decrease by 13.28, 14.09, 16.30, 18.06, and 25.07%, respectively. The R-values of the GWO-ESN model increase by 4.66, 2.81, 6.80, 4.31, and 3.19%, respectively. The NSEC values of the WOA-ESN model increase by 2.58, 3.44, 5.99, 8.26, and 20.40%, respectively. It is evident that applying optimization algorithms to parameter optimization speeds up the process of determining parameters and greatly increases prediction accuracy.

  • (3) There are notable differences between the prediction outcomes that arise from utilizing distinct optimization algorithms to optimize the parameters of the prediction model. For the GWO-ESN model at the Changshui station, the NSEC value of the SCSO-ESN model increases by 124.42%, while the RMSE and MAE values decrease by 15.31 and 15.35%, respectively. The SCSO-ESN model's RMSE and MAPE values at the Lanxi station drop by 4.01 and 1.92%, respectively. For the WOA-ESN model, at the Changshui station, the R-value of the SCSO-ESN model increases by 7.39%, while the MAE and MAPE values decrease by 17.70 and 20.19%, respectively. The SCSO-ESN model's R and NSEC values increase by 2.12 and 8.00%, respectively, at the Lanxi station. In contrast to the WOA-GRU model, the SCSO-GRU model at the Changshui and Lanxi stations have RMSE values that are 12.23 and 7.53% lower, an R-value that is 12.14% higher and 1.11% lower, an MAE value that is 19.50% lower and 9.67% lower, and an NSEC value that is 125.94% higher and 4.99% lower, respectively. This paper presents the fitness curves of the three algorithms in Figure 12 to compare the optimization performance of the three algorithms in various prediction models and experimental stations more effectively. Figure 12 shows that the SCSO algorithm's fitness value is lowest in the parameter optimization of both GRU and ESN, as well as in the experimental results at the Changshui and Lanxi stations. Combining Appendix Table A3 and Figure 12, it is shown that the SCSO algorithm performs a parameter search by simulating the hunting behavior of felines. Compared with the WOA algorithm, simulating the feeding behavior of whales, the GWO algorithm, simulating the collaborative predation behavior of wolf packs, has a faster convergence speed, stronger global search ability, and strong adaptability. This also enables the SCSO algorithm to achieve better prediction results in ESN and GRU parameter optimization.

  • (4) Compared to a single optimized prediction model, the coupled prediction model achieves a higher accuracy. For the Changshui station, the R-value of the SCSO-ESN-GRU-MC model is 0.9097 and the MAE value is 0.6907. Compared with the SCSO-ESN-MC and SCSO-GRU-MC models, the R-value increases by 10.11 and 3.81%, respectively, while the MAE value decreases by 19.32 and 25.99%, respectively; The RMSE and MAPE of the SCSO-ESN-GRU model decrease by 20.16, and 33.03%, and R increases by 17.82% compared to the SCSO-ESN model, respectively. Compared to the SCSO-GRU model, NSEC and R increase by 17.19 and 5.41%, respectively, while RMSE decreases by 4.70%. At the Lanxi station, compared with the SCSO-ESN-MC model and the SCSO-GRU-MC model, the R and NSEC values of the SCSO-ESN-GRU-MC model increase, while the MAE, RMSE, and MAPE values decrease. Compared with the SCSO-ESN model and the SCSO-GRU model, the RMSE value of the SCSO-ESN-GRU model decreases by 4.05 and 5.91%, respectively, while the NSEC value increases by 2.23 and 3.38%, respectively. As can be seen from the above, the data evaluation index values of the MC corrected LSM and the SCSO-ESN-GRU model after weighting are significantly higher than those of the MC corrected single optimized prediction model and the single optimized prediction model (SCSO-ESN, SCSO-GRU). This suggests that the advantages and disadvantages of the two separate prediction models are balanced by the prediction model combined with the least squares approach. In addition, it successfully reduces the volatility of a single optimized prediction model's prediction results, improving the MC correction's data foundation.

  • (5) The SCSO-ESN-GRU-MC model demonstrates encouraging results in all evaluation indicators, and the prediction results of the coupled model based on MC error correction are superior to other comparative models. For example, at the Changshui station, the R and NSEC values of the SCSO-ESN-GRU-MC model are 0.9097 and 0.8102, respectively, which are much higher than the 0.6454 and 0.4083 values of the SCSO-ESN-GRU model. Compared with the SCSO-ESN model, the NSEC value of the SCSO-ESN-MC model increases from 0.0718 to 0.6750 and the MAE value decreases from 1.8914 to 0.8561. At the Lanxi station, the MAPE and RMSE values of the SCSO-ESN-GRU-MC model decrease by 49.51 and 54.04%, respectively. Compared to the SCSO-ESN-GRU model, the NSEC value increases by 7.01%. The R-value of the SCSO-ESN-MC model is 0.9645, which is much higher than the 0.9039 of the SCSO-ESN model. The RMSE value of the SCSO-GRU-MC model is 23.1424, which is lower than the 32.6657 of the SCSO-GRU model. From this, it can be concluded that the MC can effectively reduce error fluctuations and obtain higher accuracy prediction results due to its adaptability and error correction based on error states.

To further demonstrate the content of Appendix Table A3, this paper presents the results of the MAE, MAPE, NSEC, and RMSE evaluation indicators for the 15 models of Changshui station and Lanxi station in Figures 13 and 14, respectively. Figures 13(a) and 14(a) show the radar image of MAE, where the closer the numerical points are to the radar center, the smaller the MAE value of the model. Figures 13(b) and 14(b) display the bar chart of MAPE, where the lower the column, the smaller the MAPE value of the model. In Figures 13(c) and 14(c), the lollipop chart of NSEC shows that the longer the line of ‘lollipop’ and the larger the circular icon, the higher the NSEC value of the model. In Figures 13(d) and 14(d), the rose plot of RMSE shows that a smaller ‘petal’ indicates a smaller RMSE value for the model. From Figure 13, it can be seen that at the Changshui station, the SCSO-ESN-GRU-MC model shows the smallest results in Figure 13(a), 13(b), and 13(d) compared to the other 14 models, while the ‘lollipop’ in Figure 13(c) is the largest among positive values. This indicates that using the LSM to couple the prediction model and correcting errors using the MC plays a significant role. From Figure 14, it can be seen that at the Lanxi station, in addition to the model proposed in this paper, the SCSO-ESN-MC and SCSO-GRU-MC models perform better than the other 12 models in Figure 14(a)–14(d), indicating that using MC to correct errors is the key to improving prediction accuracy. At the same time, it can also be seen that M 5 and M 8 achieve better evaluation results compared to M 6, M 7, M 9, and M 10, respectively. This indicates that the SCSO algorithm performs better in ESN and GRU parameter optimization than GWO and WOA algorithms.
Figure 12

Fitness curves of SCSO, GWO, and WOA at two experiment stations.

Figure 12

Fitness curves of SCSO, GWO, and WOA at two experiment stations.

Close modal
Figure 13

Schematic diagram of the evaluation index results for the 15 models of the Changshui station.

Figure 13

Schematic diagram of the evaluation index results for the 15 models of the Changshui station.

Close modal
Figure 14

Schematic diagram of the evaluation index results for the 15 models of the Lanxi station.

Figure 14

Schematic diagram of the evaluation index results for the 15 models of the Lanxi station.

Close modal

To further analyze the predictive performance of all models, this paper presents scatter density plots of measured and predicted annual runoff during the testing period at two experimental stations, as shown in Appendix Figures A1 and A2. Among them, R2 in Appendix Figures A1 and A2 is the coefficient of certainty, representing the proportion of variation in the dependent variable y that the independent variable x can explain. It can be intuitively seen that as the parameters of the ESN and GRU models are optimized using the SCSO method, and the prediction model is coupled with the LSM and error corrected using the MC, the contour deviation gradually decreases, R2 gradually increases, and the fitting effect gradually improves. The measured and predicted annual runoff values of the SCSO-ESN-GRU-MC model gradually approach. This outcome is in line with Appendix Table A3's model performance findings. It is evident from the comparison above that enhancing prediction accuracy requires the optimization parameters of the SCSO algorithm, the LSM coupled prediction model, and the MC error correction.

Error analysis

This paper plots the prediction error sequences of two experimental stations into cloud rain maps and stacked bar charts, shown in Appendix Figures A3 and A4, to more clearly demonstrate the correction effect of the MC. Appendix Figure A3 shows that (1) a single prediction model's kernel density curve contains several peaks, with peak positions greatly deviating from the 0-scale line, and the dispersed distribution of error data is not concentrated, with many extreme deviation values. This suggests that it is not ideal for a single prediction model to be able to reduce the volatility of error sequences. (2) The peak shapes of the kernel density curves of the SCSO-ESN-GRU-MC, SCSO-ESN-MC, and SCSO-ESN-MC models are steep and narrow. The distribution of scattered points is highly concentrated, and the box plot's median line is substantially near the 0-scale line. These results suggest that the prediction error when employing the MC for error correction is quite low. However, the SCSO-ESN-GRU-MC model has fewer error outliers than the SCSO-ESN-MC and SCSO-GRU-MC models, indicating that coupling the prediction model with the LSM can, to some extent, reduce the impact of errors. Each model's annual prediction error is displayed in Appendix Figure A4 with clarity. Appendix Figure A4 shows that the model suggested in this paper has the lowest prediction error in the majority of years, and the error values fall within an acceptable range. This suggests that the MC is a useful tool for lessening the impact of errors and raising the accuracy of coupled model predictions.

Discussion

To validate the effectiveness of a novel coupled prediction model based on MC correction, this study applies it to two experimental stations located in various watersheds. Although the prediction results of a single prediction model are much lower than those of a coupled model, the ESN and GRU models stand out among the prediction results of other single prediction models due to their stability and efficiency. They also provide a solid basis for subsequent coupled models, as can be seen from the experimental results displayed in Figures 1014, Appendix Figures A1–A4, and Appendix Table A3. However, due to the crucial impact of the parameters of ESN and GRU models on their prediction results, the time-consuming and laborious traditional parameter determination methods, and unsatisfactory results, this paper adopts the SCSO algorithm to optimize the prediction model parameters. In the SCSO algorithm, the strong exploration ability of each cat is utilized to search for new solution spaces, and the balance between exploration and utilization is appropriately adjusted to find a good balance between global search and local optimization, thereby achieving the optimal prediction results of the prediction model. Due to the long period and multiple influencing factors, the annual runoff changes sharply, which greatly tests the prediction model's performance. Although GRU and ESN models can predict annual runoff, the predicted results have different characteristics and deviations. Therefore, coupling the two models using the LSM can obtain diverse model outputs. Such diversity can lower the uncertainty and error of a single model and offer more thorough information and dependable prediction results. The problem of prediction accuracy not reaching the ideal state due to errors caused by internal structural issues in the model has not yet been resolved despite the combined prediction model producing good prediction results. Therefore, this paper adopts an MC model with a state interval structure for error correction. It is evident from the outcomes of the two experimental stations that the MC model's post-correction prediction accuracy has greatly increased. At the same time, to objectively and rationally analyze the predictive performance of the model proposed in this article, we also selected the research results of hydrological stations using the same research area in previous literature. Compared with the ESMD-SE-CEEMDAN-LSTM model in Wang's article (Wang et al. 2021), the MAE, RMSE, and MAPE of the proposed model for Changshui Station are 0.691, 0.955, and 16.184, respectively, which are significantly better than the ESMD-SE-CEEMDAN-LSTM model's 1.703, 1.827, and 61.345. The model proposed in this article has better predictive performance and more accurate prediction results, proving the superiority of the method. As can be seen from the discussion above, the SCSO-optimized coupled prediction model plus MC error correction combine to produce prediction results that are closer to the original values and with satisfactory prediction accuracy.

This study proposes a coupling model, named SCSO-ESN-GRU-MC, based on SCSO algorithm, ESN and GRU models, the least squares coupling method, and MC error correction to address the issue of great difficulty and low prediction accuracy in annual runoff prediction. First, a correlation analysis is conducted on the data of multiple factors related to runoff to determine the input of the prediction model. Second, the SCSO-ESN and SCSO-GRU models are built by optimizing the parameters of ESN and GRU using the SCSO optimization algorithm. Then, the selected factors, such as average relative humidity, daily precision ≥0.1 mm days, and precision, are used as inputs for the SCSO-ESN and SCSO-GRU models for annual runoff prediction. The initial prediction results are then obtained by coupling the SCSO-ESN and SCSO-GRU model prediction results using the LSM. The final result is then obtained by applying MC error correction to the original prediction results. The following important conclusions are drawn through experiments on the annual runoff data of Changshui station and Lanxi station:

  • (1) The prediction model in this study produced good prediction results by utilizing an echo state structure with fixed weights, an ESN network with high adaptability, and a GRU network with an effective gating mechanism and capacity to capture long-term dependencies.

  • (2) The SCSO algorithm can adapt to uncertain factors in data, solve the problem of parameter determination in ESN and GRU models by dynamically searching and adjusting parameters, and improve the robustness and generalization ability of prediction models.

  • (3) By using the LSM to complement the advantages of ESN and GRU models in predicting the original annual runoff series, it is possible to more comprehensively capture different time scales and patterns in the time series and improve the modeling ability.

  • (4) Experimental results have shown that using a MC with a unique state interval structure for error correction achieves the best results in all evaluation indicators, indicating that it can effectively reduce errors and significantly improve the prediction accuracy of the coupled model.

This study highlights that the accuracy of predicting annual runoff can be greatly increased by employing a coupled neural network method based on MC correction. The research results demonstrate that this coupled model fully utilizes the searchability of optimization algorithms, the generalization characteristics of neural network models, and the correction ability of MC correction errors. This comprehensive strategy has practicality and efficiency and provides a new approach for multifactor input annual runoff prediction models.

The authors are grateful for the support of the special project for collaborative innovation of science and technology in 2021 (No: 202121206).

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Adnan
R. M.
,
Yuan
X.
,
Kisi
O.
,
Adnan
M.
&
Mehmood
A.
2018
Stream flow forecasting of poorly gauged mountainous watershed by least square support vector machine, fuzzy genetic algorithm and M5 model tree using climatic data from nearby station
.
Water Resources Management
32
(
14
),
4469
4486
.
https://doi.org/10.1007/s11269-018-2033-2
.
Adnan
R. M.
,
Mostafa
R. R.
,
Kisi
O.
,
Yaseen
Z. M.
,
Shahid
S.
&
Zounemat-Kermani
M.
2021
Improving streamflow prediction using a new hybrid ELM model combined with hybrid particle swarm optimization and grey wolf optimization
.
Knowledge-Based Systems
230
,
107379
.
https://doi.org/10.1016/j.knosys.2021.107379
.
Bhattacharjee
N. V.
&
Tollner
E. W.
2016
Improving management of windrow composting systems by modeling runoff water quality dynamics using recurrent neural network
.
Ecological Modelling
339
,
68
76
.
https://doi.org/10.1016/j.ecolmodel.2016.08.011
.
Bian
L.
,
Qin
X.
,
Zhang
C.
,
Guo
P.
&
Wu
H.
2023
Application, interpretability and prediction of machine learning method combined with LSTM and LightGBM – a case study for runoff simulation in an arid area
.
Journal of Hydrology
625
,
130091
.
https://doi.org/10.1016/j.jhydrol.2023.130091
.
Chen
L.
,
Wu
T.
,
Wang
Z.
,
Lin
X.
&
Cai
Y.
2023
A novel hybrid BPNN model based on adaptive evolutionary Artificial Bee Colony Algorithm for water quality index prediction
.
Ecological Indicators
146
,
109882
.
https://doi.org/10.1016/j.ecolind.2023.109882
.
Coulibaly
P.
2010
Reservoir computing approach to Great Lakes water level forecasting
.
Journal of Hydrology
381
(
1
),
76
88
.
https://doi.org/10.1016/j.jhydrol.2009.11.027
.
Duan
X.
2023
Research on prediction of slope displacement based on a weighted combination forecasting model
.
Results in Engineering
18
,
101013
.
https://doi.org/10.1016/j.rineng.2023.101013
.
Fang
W.
,
Zhang
S.
&
Xu
C.
2024
Improving prediction efficiency of Chinese stock index futures intraday price by VIX-Lasso-GRU model
.
Expert Systems with Applications
238
,
121968
.
https://doi.org/10.1016/j.eswa.2023.121968
.
Gao
S.
,
Huang
Y.
,
Zhang
S.
,
Han
J.
,
Wang
G.
,
Zhang
M.
&
Lin
Q.
2020
Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation
.
Journal of Hydrology
589
,
125188
.
https://doi.org/10.1016/j.jhydrol.2020.125188
.
Gao
Y.
,
Hang
Y.
&
Yang
M.
2021
A cooling load prediction method using improved CEEMDAN and Markov Chains correction
.
Journal of Building Engineering
42
,
103041
.
https://doi.org/10.1016/j.jobe.2021.103041
.
Gao
S.
,
Zhang
S.
,
Huang
Y.
,
Han
J.
,
Luo
H.
,
Zhang
Y.
&
Wang
G.
2022
A new seq2seq architecture for hourly runoff prediction using historical rainfall and runoff as input
.
Journal of Hydrology
612
,
128099
.
https://doi.org/10.1016/j.jhydrol.2022.128099
.
Han
H.
&
Morrison
R. R.
2022a
Data-driven approaches for runoff prediction using distributed data
.
Stochastic Environmental Research and Risk Assessment
36
(
8
),
2153
2171
.
https://doi.org/10.1007/s00477-021-01993-3
.
Han
H.
&
Morrison
R. R.
2022b
Improved runoff forecasting performance through error predictions using a deep-learning approach
.
Journal of Hydrology
608
,
127653
.
https://doi.org/10.1016/j.jhydrol.2022.127653
.
He
C.
,
Chen
F.
,
Long
A.
,
Qian
Y.
&
Tang
H.
2023
Improving the precision of monthly runoff prediction using the combined non-stationary methods in an oasis irrigation area
.
Agricultural Water Management
279
,
108161
.
https://doi.org/10.1016/j.agwat.2023.108161
.
Jahandideh-Tehrani
M.
,
Jenkins
G.
&
Helfer
F.
2021
A comparison of particle swarm optimization and genetic algorithm for daily rainfall-runoff modelling: A case study for Southeast Queensland, Australia
.
Optimization and Engineering
22
(
1
),
29
50
.
https://doi.org/10.1007/s11081-020-09538-3
.
Jia
S.
,
Kang
X.
,
Cui
J.
,
Tian
B.
&
Xiao
S.
2022
Hierarchical stochastic optimal scheduling of electric thermal hydrogen integrated energy system considering electric vehicles
.
Energies
https://doi.org/10.3390/en15155509
.
Jiang
P.
&
Chen
J.
2016
Displacement prediction of landslide based on generalized regression neural networks with K-fold cross-validation
.
Neurocomputing
198
,
40
47
.
https://doi.org/10.1016/j.neucom.2015.08.118
.
Li
F.
&
Li
Y.
2023
Robust echo state network with Cauchy loss function and hybrid regularization for noisy time series prediction
.
Applied Soft Computing
146
,
110640
.
https://doi.org/10.1016/j.asoc.2023.110640
.
Li
N.
,
Li
L.
,
Huang
F.
,
Liu
X.
&
Wang
S.
2023
Photovoltaic power prediction method for zero energy consumption buildings based on multi-feature fuzzy clustering and MAOA-ESN
.
Journal of Building Engineering
75
,
106922
.
https://doi.org/10.1016/j.jobe.2023.106922
.
Liu
H.
,
Tian
H.-q.
&
Li
Y.-f.
2012
Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction
.
Applied Energy
98
,
415
424
.
https://doi.org/10.1016/j.apenergy.2012.04.001
.
Liu
J.
,
Meng
X.
,
Ma
Y.
&
Liu
X.
2020
Introduce canopy temperature to evaluate actual evapotranspiration of green peppers using optimized ENN models
.
Journal of Hydrology
590
,
125437
.
https://doi.org/10.1016/j.jhydrol.2020.125437
.
Liu
Y.
,
Ji
Y.
,
Liu
D.
,
Fu
Q.
,
Li
T.
,
Hou
R.
,
Li
Q.
,
Cui
S.
&
Li
M.
2021
A new method for runoff prediction error correction based on LS-SVM and a 4D copula joint distribution
.
Journal of Hydrology
598
,
126223
.
https://doi.org/10.1016/j.jhydrol.2021.126223
.
Liu
J.
,
Chen
L.
,
Xu
W.
,
Feng
M.
,
Han
Y.
,
Xia
T.
&
Geng
Z.
2023
Novel production prediction model of gasoline production processes for energy saving and economic increasing based on AM-GRU integrating the UMAP algorithm
.
Energy
262
,
125536
.
https://doi.org/10.1016/j.energy.2022.125536
.
Meng
D.
,
Chen
X.
,
Wei
Z.
&
Ji
S.
2023
Prediction of flexural and uniaxial compressive strengths of sea ice with optimized recurrent neural network
.
Ocean Engineering
288
,
115921
.
https://doi.org/10.1016/j.oceaneng.2023.115921
.
Min
X.
,
Hao
B.
,
Sheng
Y.
,
Huang
Y.
&
Qin
J.
2023
Transfer performance of gated recurrent unit model for runoff prediction based on the comprehensive spatiotemporal similarity of catchments
.
Journal of Environmental Management
330
,
117182
.
https://doi.org/10.1016/j.jenvman.2022.117182
.
Niu
W.-j.
,
Feng
Z.-k.
,
Zeng
M.
,
Feng
B.-f.
,
Min
Y.-w.
,
Cheng
C.-t.
&
Zhou
J.-z.
2019
Forecasting reservoir monthly runoff via ensemble empirical mode decomposition and extreme learning machine optimized by an improved gravitational search algorithm
.
Applied Soft Computing
82
,
105589
.
https://doi.org/10.1016/j.asoc.2019.105589
.
Parisouj
P.
,
Mohebzadeh
H.
&
Lee
T.
2020
Employing machine learning algorithms for streamflow prediction: A case study of four river basins with different climatic zones in the United States
.
Water Resources Management
34
(
13
),
4113
4131
.
https://doi.org/10.1007/s11269-020-02659-5
.
Partington
D.
,
Brunner
P.
,
Simmons
C. T.
,
Werner
A. D.
,
Therrien
R.
,
Maier
H. R.
&
Dandy
G. C.
2012
Evaluation of outputs from automated baseflow separation methods against simulated baseflow from a physically based, surface water-groundwater flow model
.
Journal of Hydrology
458–459
,
28
39
.
https://doi.org/10.1016/j.jhydrol.2012.06.029
.
Qiao
X.
,
Peng
T.
,
Sun
N.
,
Zhang
C.
,
Liu
Q.
,
Zhang
Y.
,
Wang
Y.
&
Shahzad Nazir
M.
2023
Metaheuristic evolutionary deep learning model based on temporal convolutional network, improved Aquila optimizer and random forest for rainfall-runoff simulation and multi-step runoff prediction
.
Expert Systems with Applications
229
,
120616
.
https://doi.org/10.1016/j.eswa.2023.120616
.
Qin
Y.
,
Li
B.
,
Sun
X.
,
Chen
Y.
&
Shi
X.
2019
Nonlinear response of runoff to atmospheric freezing level height variation based on hybrid prediction models
.
Hydrological Sciences Journal
64
(
13
),
1556
1572
.
https://doi.org/10.1080/02626667.2019.1662023
.
Ribeiro
V. H. A.
,
Reynoso-Meza
G.
&
Siqueira
H. V.
2020
Multi-objective ensembles of echo state networks and extreme learning machines for streamflow series forecasting
.
Engineering Applications of Artificial Intelligence
95
,
103910
.
https://doi.org/10.1016/j.engappai.2020.103910
.
Rohani
A.
,
Taki
M.
&
Abdollahpour
M.
2018
A novel soft computing model (Gaussian process regression with K-fold cross validation) for daily and monthly solar radiation forecasting (Part: I)
.
Renewable Energy
115
,
411
422
.
https://doi.org/10.1016/j.renene.2017.08.061
.
Seyyedabbasi
A.
&
Kiani
F.
2023
Sand cat swarm optimization: A nature-inspired algorithm to solve global optimization problems
.
Engineering with Computers
39
(
4
),
2627
2651
.
https://doi.org/10.1007/s00366-022-01604-x
.
Song
C.
,
Chen
X.
,
Wu
P.
&
Jin
H.
2021
Combining time varying filtering based empirical mode decomposition and machine learning to predict precipitation from nonlinear series
.
Journal of Hydrology
603
,
126914
.
https://doi.org/10.1016/j.jhydrol.2021.126914
.
Tavakol Aghaei
V.
,
SeyyedAbbasi
A.
,
Rasheed
J.
&
Abu-Mahfouz
A. M.
2023
Sand cat swarm optimization-based feedback controller design for nonlinear systems
.
Heliyon
9
(
3
),
e13885
.
https://doi.org/10.1016/j.heliyon.2023.e13885
.
Tennant
C.
,
Larsen
L.
,
Bellugi
D.
,
Moges
E.
,
Zhang
L.
&
Ma
H.
2020
The utility of information flow in formulating discharge forecast models: A case study from an arid snow-dominated catchment
.
Water Resources Research
56
(
8
),
e2019WR024908
.
https://doi.org/10.1029/2019WR024908
.
Wang
W.-C.
,
Chau
K.-W.
,
Cheng
C.-T.
&
Qiu
L.
2009
A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series
.
Journal of Hydrology
374
(
3
),
294
306
.
https://doi.org/10.1016/j.jhydrol.2009.06.019
.
Wang
W. C.
,
Xu
D. M.
,
Chau
K. W.
&
Chen
S.
2013
Improved annual rainfall-runoff forecasting using PSO-SVM model based on EEMD
.
Journal of Hydroinformatics
1377
1390
.
https://doi.org/10.2166/hydro.2013.134
.
Wang
W.-c.
,
Du
Y.-j.
,
Chau
K.-w.
,
Xu
D.-m.
,
Liu
C.-j.
&
Ma
Q.
2021
An ensemble hybrid forecasting model for annual runoff based on sample entropy, secondary decomposition, and long short-term memory neural network
.
Water Resources Management
35
(
14
),
4695
4726
.
https://doi.org/10.1007/s11269-021-02920-5
.
Wang
W.-c.
,
Cheng
Q.
,
Chau
K.-w.
,
Hu
H.
,
Zang
H.-f.
&
Xu
D.-m.
2023
An enhanced monthly runoff time series prediction using extreme learning machine optimized by salp swarm algorithm based on time varying filtering based empirical mode decomposition
.
Journal of Hydrology
620
,
129460
.
https://doi.org/10.1016/j.jhydrol.2023.129460
.
Wang
J.
,
Qian
Y.
,
Zhang
L.
,
Wang
K.
&
Zhang
H.
2024
A novel wind power forecasting system integrating time series refining, nonlinear multi-objective optimized deep learning and linear error correction
.
Energy Conversion and Management
299
,
117818
.
https://doi.org/10.1016/j.enconman.2023.117818
.
Wei
X.
&
Xun
Y.
2019
Evaluation of the effective forecast and decision horizon in optimal hydropower generation considering medium-range precipitation forecasts
.
Water Science and Technology: Water Supply
19
(
7
),
2147
2155
.
https://doi.org/10.2166/ws.2019.095
.
Wei
Z.
,
Han
X.
&
Li
J.
2022
State of health assessment for echelon utilization batteries based on deep neural network learning with error correction
.
Journal of Energy Storage
51
,
104428
.
https://doi.org/10.1016/j.est.2022.104428
.
Wei
X.
,
Wang
G.
,
Schmalz
B.
,
Hagan
D. F. T.
&
Duan
Z.
2023
Evaluation of transformer model and self-attention mechanism in the Yangtze River basin runoff prediction
.
Journal of Hydrology: Regional Studies
47
,
101438
.
https://doi.org/10.1016/j.ejrh.2023.101438
.
Wen
S.
,
Wang
H.
,
Qian
J.
&
Men
X.
2023
A novel combined model based on echo state network optimized by whale optimization algorithm for blast furnace gas prediction
.
Energy
279
,
128048
.
https://doi.org/10.1016/j.energy.2023.128048
.
Wu
J.
,
Dong
J.
,
Wang
Z.
,
Hu
Y.
&
Dou
W.
2023
A novel hybrid model based on deep learning and error correction for crude oil futures prices forecast
.
Resources Policy
83
,
103602
.
https://doi.org/10.1016/j.resourpol.2023.103602
.
Xu
Y.
,
Hu
C.
,
Wu
Q.
,
Jian
S.
,
Li
Z.
,
Chen
Y.
,
Zhang
G.
,
Zhang
Z.
&
Wang
S.
2022
Research on particle swarm optimization in LSTM neural networks for rainfall-runoff simulation
.
Journal of Hydrology
608
,
127553
.
https://doi.org/10.1016/j.jhydrol.2022.127553
.
Xu
D.-m.
,
Hu
X.-x.
,
Wang
W.-c.
,
Chau
K.-w.
&
Zang
H.-f.
2023
An enhanced monthly runoff forecasting using least squares support vector machine based on Harris hawks optimization and secondary decomposition
.
Earth Science Informatics
16
(
3
),
2089
2109
.
https://doi.org/10.1007/s12145-023-01018-3
.
Xu
D.-m.
,
Hu
X.-x.
,
Wang
W.-c.
,
Chau
K.-w.
,
Zang
H.-f.
&
Wang
J.
2024
A new hybrid model for monthly runoff prediction using ELMAN neural network based on decomposition-integration structure with local error correction method
.
Expert Systems with Applications
238
,
121719
.
https://doi.org/10.1016/j.eswa.2023.121719
.
Yang
H.
,
Zhao
J.
&
Li
G.
2023
A novel hybrid prediction model for PM2.5 concentration based on decomposition ensemble and error correction
.
Environmental Science and Pollution Research
30
(
15
),
44893
44913
.
https://doi.org/10.1007/s11356-023-25238-8
.
Yoon
H.
,
Jun
S.-C.
,
Hyun
Y.
,
Bae
G.-O.
&
Lee
K.-K.
2011
A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer
.
Journal of Hydrology
396
(
1
),
128
138
.
https://doi.org/10.1016/j.jhydrol.2010.11.002
.
Zhang
Y.
,
Han
J.
,
Pan
G.
,
Xu
Y.
&
Wang
F.
2021
A multi-stage predicting methodology based on data decomposition and error correction for ultra-short-term wind energy prediction
.
Journal of Cleaner Production
292
,
125981
.
https://doi.org/10.1016/j.jclepro.2021.125981
.
Zhou
T.
,
Yang
X.
,
Ren
H.
,
Li
C.
&
Han
J.
2023
The prediction of ship motion attitude in seaway based on BSO-VMD-GRU combination model
.
Ocean Engineering
288
,
115977
.
https://doi.org/10.1016/j.oceaneng.2023.115977
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data