ABSTRACT
Accurate forecasting of increasingly unpredictable river runoff is essential for effective water resource management in the face of climate change and human activities. This study uses four machine learning models, namely long short-term memory neural networks (LSTM), support vector machine (SVM), random forest (RF), and artificial neural network (ANN), to improve runoff forecasting accuracy and to explore the effectiveness of combined forecasting models. Three combined forecasting models (empirical mode decomposition (EMD)–LSTM, variational mode decomposition (VMD)–LSTM, and wavelet analysis (WA)–LSTM) are developed by coupling the EMD, VMD, and WA preprocessing techniques with the LSTM modeling method. These models use signal decomposition techniques to analyze 41 years of runoff data from the Huanren station (1980–2020). The findings reveal that the LSTM model outperforms the other three individual machine learning models when forecasting days with high runoff. Among the three combined decomposition models, the VMD–LSTM model demonstrates the best overall performance during the validation period, achieving root mean square error, Nash–Sutcliffe efficiency coefficient, and bias values of 52.14 m3/s, 0.96, and −0.002, respectively. The combination of LSTM with signal decomposition techniques shows promising potential for enhancing runoff prediction accuracy, with practical implications for water resource management and flood control strategies.
HIGHLIGHTS
LSTM excels in preserving the integrity of rainfall and runoff data, leading to superior performance.
The LSTM model demonstrates a significant advantage in forecasting high-flow days compared to other single models.
The VMD–LSTM model achieves the best overall performance by effectively filtering out noise from the data.
INTRODUCTION
Runoff forecasting is closely related to watershed hydrology and flood management, as it plays a crucial role in understanding and managing water resources within a watershed. Watershed characteristics, such as size, shape, geology, and river network, determine how precipitation is distributed across the watershed and how it is converted into runoff (Abed-Elmdoust et al. 2016; Sarker et al. 2019; Gao et al. 2022; Singhal et al. 2024). With the ability to predict dynamic changes in river runoff, decision-makers can make informed choices about how to distribute and utilize water resources within a basin. This includes ensuring an equitable allocation of water for various purposes, such as agriculture, industry, and domestic use. It also enables them to plan for potential water shortages or excesses and take proactive measures to address these challenges. Furthermore, river runoff forecasting is essential for improving flood prevention and mitigation capabilities (Hu et al. 2020).
With the continuous advancement of artificial intelligence technology, an increasing number of machine learning models have been employed in the field of runoff forecasting. Some commonly used models include long short-term memory neural networks (LSTM), support vector machines (SVMs), random forest (RF) models, and artificial neural networks (ANNs). For instance, Lin & Cheng (2006) introduced a radial basis kernel function in the SVM modeling process and obtained better results than the ANN. Wang et al. (2009) compared the performance of the auto-regressive moving average model, ANN, adaptive network-based fuzzy inference system, genetic programming, and SVM models in predicting monthly runoff time series, with SVM demonstrating superior performance in long-term runoff prediction. Li et al. (2020) developed RF and SVM models for annual, flood, and dry period average incoming runoff in the Longjiang Reservoir in Yunnan and achieved high overall accuracy. However, the models' prediction accuracy at local extreme flows could be improved.
In further research, Lai et al. (2015) constructed a flood risk evaluation model based on the RF intelligent algorithm in the Dongjiang River basin. The model required minimal parameters and did not necessitate the setting of indicator weights and grading criteria, providing a simple implementation process for flood risk evaluation. Mou et al. (2009) focused on the Urumqi River source No. 1 glacier area and developed a feed-forward ANN model for runoff forecasting in the alpine cold region. The model not only yielded better simulation and prediction results but also provided insight into the network structure characteristics for studying glacier ablation runoff patterns. Zhu et al. (2005) tested an ANN model using water level station data located in the Beijiang River, Zhujiang Delta, demonstrating that good forecasting results could be achieved with an appropriate selection of input layer unit data and forecasting periods. Wang et al. (2022) utilized an ant-lion optimization algorithm-based LSTM network to study an earth-rock dam in southwest China, producing predictions that aligned well with real-world engineering. Yin et al. (2019) compared the LSTM model with the Xin'anjiang model for a gauging station in the Jinjiang River basin and found that the LSTM model outperformed the Xin'anjiang model across different forecasting periods. The aforementioned studies showcase the unique advantages of LSTM models in hydrological forecasting. However, most of these studies focus on single models and limited optimization of model parameters, without preprocessing of input data.
Runoff time series are complex and non-linear, making direct prediction challenging (Peng et al. 2017). Therefore, data preprocessing methods based on signal decomposition, such as empirical mode decomposition (EMD), wavelet analysis (WA), and variational mode decomposition (VMD), have been increasingly used in hydrological research. For instance, Napolitano et al. (2011) built an EMD–ANN using monitoring data from a gauging station in Maryland, USA and found that the combined forecasting model had higher accuracy compared to a single ANN model. Tayyab et al. (2019) developed a WA–ANN for monthly flow prediction in the Jinsha River Basin and demonstrated that WA improved the accuracy of flow prediction. Qi et al. (2022) combined VMD with LSTM models to construct a VMD–LSTM forecasting model for monthly runoff in the upper reaches of the Yellow River basin, achieving satisfactory prediction results. Liu & Wang (2022) proposed an EMD–LSTM model based on deep learning, which outperformed traditional LSTM models and showed greater adaptability in flood prevention and mitigation.
While machine learning models have been widely studied in hydrological prediction, their integration with signal decomposition methods, such as EMD, has not been extensively explored in the realm of hydrological forecasting. Therefore, our study aims to address this gap by examining the collective effectiveness of these models and signal decomposition techniques, providing valuable insights into their collaborative potential for enhancing the precision and reliability of hydrological forecasts. Based on the above background, this study investigates the runoff of the Huanren station between 1980 and 2020 using four single machine learning models (LSTM, SVM, RF, and ANN) and three combined models (EMD–LSTM, VMD–LSTM, and WA–LSTM). The comparison and analysis of the different models sheds light on the accuracy and effectiveness of each model in predicting runoff, which is of great significance to water resource management and flood control strategies in the Hun River basin.
STUDY AREA AND DATA
Study area
The Hun River is the largest right tributary of the Yalu River, originating from the southern foot of the Laogang Mountain Range in the Changbai Mountain System. It flows for a total length of 432 km, passing through the provinces of Jilin and Liaoning in a northeast-to-southwest direction. The section of the river above the Huanren Hydropower Station dam site spans 247 km, covering a controlled basin area of 10,400 km2. Within the watershed, there are 10 rainfall stations: Sanchazi, Badaojiang, Balishao, Tonghua, Sankeshu, Badaogou, Dongcun, Yezhugou, Huadian, and Huanren.
The Hun River basin is situated at the northern edge of the storm center in northeast China. It is characterized by mountainous terrain, steep hills, and abundant vegetation. The region experiences a temperate monsoon climate, with an average annual precipitation of 860 mm and an average runoff coefficient of 0.52. About 70% of the rainfall is concentrated between June and September, so significant floods typically occur from late July to mid-August. Due to the basin's undulating topography, the river's steep slope, and the limited storage capacity of the riverbed, heavy rainfall often leads to rapid rises and falls in water levels, with 80% of the total flood volume occurring within a 3-day period. Winter in the basin usually begins in November and ends in late March or early April and is characterized by snowfall. In late March or early April, the snow cover and the frozen rivers and streams begin to thaw, leading to spring flooding.
Data collection
Location of the Huanren Hydropower Station basin and rainfall stations.
METHODOLOGY
Forecasting models
Long short-term memory neural network
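For reference, the gate structure of a standard LSTM cell is commonly written as below. This is the textbook formulation, offered as a sketch and not necessarily the exact configuration used in this study; the symbols (input $x_t$, hidden state $h_t$, cell state $c_t$, weights $W$, biases $b$) are notation assumed here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ denotes element-wise multiplication. The additive update of $c_t$ is what lets information persist over long sequences.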








Support vector machine
The SVM was formally proposed by Vapnik (1995). In its basic form, it is a binary classification model: a linear classifier with the maximum margin in the feature space, whose training reduces to solving a convex quadratic programming problem. For the basic principles and related formulas of the SVM model, refer to Wang et al. (2003). In recent years, the application of SVM in hydrological work has been increasing, and some scholars have successfully applied SVM in groundwater prediction and flood forecasting (Wang et al. 2003; Swastik et al. 2019).
RF model
RF is an ensemble learning algorithm proposed by Breiman (2001). It employs bootstrap resampling to draw multiple samples from the original data and constructs a decision tree for each sampled dataset. The predictions of all trees are then combined through voting (or averaging, in regression tasks) to obtain the final result. For the basic principles and related formulas of the RF model, refer to Lai et al. (2015). RF is known for its robustness to outliers and noise, and it is less prone to overfitting. As a result, RF has been widely applied in various fields, such as genetic engineering, rock engineering, and hydraulic engineering (Chen & Ishwaran 2012; Dong et al. 2013; Arash et al. 2023).
Artificial neural networks
An ANN is an information processing system that simulates the structure and function of the brain's neural networks using mathematical models (Hyo-Jin et al. 2024). For the basic principles and related formulas of the ANN model, see Zhu et al. (2005). Unlike traditional methods, an ANN does not require a predefined model structure or an optimization algorithm for specific model parameters. As a result, it has been widely utilized in solving various practical problems, including river runoff forecasting, rainfall-runoff simulation, and water quality parameter forecasting (Shang et al. 1995; Li et al. 2002).
Decomposition methods
Empirical mode decomposition
EMD is a signal decomposition technique that breaks down a signal into physically meaningful intrinsic mode functions (IMFs) based on specific conditions (Masoud et al. 2023). The calculation steps of EMD are as follows:
① Identify all the local extreme points of the original signal, x(t), and fit its upper envelope, xmax(t), and lower envelope, xmin(t), using the cubic spline interpolation function.
② Calculate the average envelope, m(t), by taking the average of the upper and lower envelopes. Compute the difference sequence, h(t), between the original signal, x(t), and the average envelope: h(t) = x(t) − m(t).
③ Check if h(t) satisfies two conditions:
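(a) Over the whole series, the number of extreme points and the number of zero crossings of h(t) are equal or differ by at most one.
(b) At any point, the mean value of the upper and lower envelopes of h(t), defined by its local maxima and minima, is zero.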
If these conditions are not satisfied, repeat the above steps with h(t) as the new signal until the conditions are met. Once the conditions are satisfied, h(t) becomes the first IMF, I1(t).
④ Subtract I1(t) from the original signal and calculate the residual term, r1(t): r1(t) = x(t) − I1(t). Consider r1(t) as the new original signal and repeat the above steps to decompose the remaining IMFs until a monotonic residual term is obtained.
Following these steps, EMD decomposes the original signal into a set of IMFs, which represent different frequency components of the signal.
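The sifting procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study. The function names (`spline_eval`, `envelope`, `sift_once`, `emd`) are hypothetical, the envelopes use a natural cubic spline with the series end points added as knots, a fixed number of sifting passes stands in for the IMF check in step ③, and the loop stops when the residual has fewer than two interior extrema as a proxy for a monotonic residual.

```python
import math

def spline_eval(xs, ys, ts):
    """Natural cubic spline through (xs, ys), evaluated at each t in sorted ts."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives; natural boundary keeps the ends at 0
    if n >= 2:
        b = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        d = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
             for i in range(1, n)]
        for k in range(1, n - 1):          # forward elimination (Thomas algorithm)
            w = h[k] / b[k - 1]
            b[k] -= w * h[k]
            d[k] -= w * d[k - 1]
        m[n - 1] = d[n - 2] / b[n - 2]     # back substitution
        for k in range(n - 3, -1, -1):
            m[k + 1] = (d[k] - h[k + 1] * m[k + 2]) / b[k]
    out, i = [], 0
    for t in ts:
        while i < n - 1 and t > xs[i + 1]:
            i += 1
        u, v = xs[i + 1] - t, t - xs[i]
        out.append((m[i] * u ** 3 + m[i + 1] * v ** 3) / (6.0 * h[i])
                   + (ys[i] / h[i] - m[i] * h[i] / 6.0) * u
                   + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * v)
    return out

def envelope(x, is_extremum):
    """Step 1: spline through local extrema; end points are added as knots (assumption)."""
    last = len(x) - 1
    knots = ([0] + [i for i in range(1, last) if is_extremum(x[i - 1], x[i], x[i + 1])]
             + [last])
    return spline_eval(knots, [x[i] for i in knots], range(len(x)))

def sift_once(x):
    """Step 2: h(t) = x(t) - m(t), with m(t) the mean of the two envelopes."""
    upper = envelope(x, lambda a, b, c: b > a and b >= c)
    lower = envelope(x, lambda a, b, c: b < a and b <= c)
    return [xi - (u + l) / 2.0 for xi, u, l in zip(x, upper, lower)]

def count_extrema(x):
    return sum(1 for i in range(1, len(x) - 1)
               if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0)

def emd(x, max_imfs=5, sift_iters=10):
    """Steps 3-4: extract IMFs one by one; returns (imfs, residual)."""
    imfs, residual = [], list(x)
    while len(imfs) < max_imfs and count_extrema(residual) >= 2:
        h = residual
        for _ in range(sift_iters):  # fixed pass count replaces the IMF test (assumption)
            h = sift_once(h)
        imfs.append(h)
        residual = [r - hi for r, hi in zip(residual, h)]
    return imfs, residual

# Synthetic example: fast oscillation + slow oscillation + weak trend.
t = [i / 10.0 for i in range(400)]
signal = [math.sin(2.0 * ti) + 0.5 * math.sin(0.3 * ti) + 0.01 * ti for ti in t]
imfs, residual = emd(signal)
recon = [sum(c[i] for c in imfs) + residual[i] for i in range(len(signal))]
print(len(imfs), max(abs(a - b) for a, b in zip(recon, signal)))
```

By construction, the IMFs and the final residual sum back to the original series, which is the property that allows each component to be forecast separately and the forecasts to be recombined.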
Variational mode decomposition
VMD was proposed by Dragomiretskiy & Zosso (2014). It aims to decompose a signal into K components by determining the optimal center frequency and finite bandwidth through a search and solution process. This allows for a complete dissection of the signal's frequency domain and the separation of individual components. VMD effectively addresses the issue of mode mixing in EMD when dealing with non-linear and non-stationary signals. Furthermore, it is advantageous in handling noise-sensitive problems because VMD essentially consists of multiple adaptive Wiener filter groups.
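For orientation, the constrained variational problem solved by VMD in Dragomiretskiy & Zosso (2014) can be written as below. The notation ($f$ for the input signal, $u_k$ for the modes, $\omega_k$ for the center frequencies) is assumed here and may differ from the symbols used elsewhere in this paper.

```latex
\min_{\{u_k\},\{\omega_k\}} \; \sum_{k=1}^{K}
\left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
e^{-j\omega_k t} \right\|_2^2
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)
```

Here $\delta$ is the Dirac distribution and $*$ denotes convolution. Each term measures the bandwidth of one mode after it is shifted to baseband around its center frequency $\omega_k$, and the constraint requires the $K$ modes to reconstruct the signal.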
















Wavelet analysis
WA is a powerful method that allows for localized analysis in both the time and frequency domains, overcoming the limitations of the Fourier transform (Lana et al. 2023). In practical scenarios, the useful signal typically has low and relatively stable frequencies, while noise frequencies are often high. The wavelet transform decomposes the original signal using a bank of low-pass and high-pass filters, which successively split the signal into a low-frequency approximation signal (caN) and high-frequency detail signals (cd1, … , cdN) at different resolutions. This decomposition greatly enhances the local signal information. By reconstructing the signal from these low-frequency and high-frequency components, a denoised signal can be obtained, which improves the noise immunity of the model.
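The filter-bank idea can be illustrated with a small pure-Python sketch. Note the assumptions: it uses the Haar wavelet (equivalent to db1) because its two-tap filters are easy to write out, whereas the study selects db4; the function names are hypothetical; and the input length must be divisible by 2^N for an N-level decomposition.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(x):
    """One filter-bank level: low-pass (approximation) and high-pass (detail) halves."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one level by interleaving the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return x

def haar_decompose(x, level):
    """Return [caN, cdN, ..., cd1] for an N-level decomposition."""
    details, approx = [], list(x)
    for _ in range(level):
        approx, d = haar_step(approx)
        details.append(d)
    return [approx] + details[::-1]

def haar_reconstruct(coeffs):
    """Rebuild the signal from [caN, cdN, ..., cd1]."""
    approx = coeffs[0]
    for d in coeffs[1:]:
        approx = haar_inverse_step(approx, d)
    return approx

# Slow oscillation plus an alternating high-frequency disturbance (synthetic data).
series = [math.sin(i / 4.0) + 0.2 * (-1) ** i for i in range(16)]
coeffs = haar_decompose(series, level=2)
rebuilt = haar_reconstruct(coeffs)

# Crude denoising: discard the finest detail coefficients (cd1) before reconstruction.
denoised = haar_reconstruct(coeffs[:-1] + [[0.0] * len(coeffs[-1])])
print([len(c) for c in coeffs])
```

With all coefficients kept, the reconstruction reproduces the input. Zeroing cd1 removes the fastest fluctuations, which is the simplest form of the denoising described above.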





Model evaluation metrics



RMSE is a metric that quantifies the deviation between predicted and observed values. It ranges over [0, +∞), with 0 indicating the best fit of the model. The NSE coefficient is used to assess the goodness of fit of hydrological model simulation results. Its value ranges over (−∞, 1], with 1 indicating perfect agreement between the simulated results and the observed data. BIAS is used to assess the discrepancy between the total water volume in the simulation results and the observed values. It takes values in the range (−∞, +∞), with 0 being the ideal value. A positive value suggests that the simulated water volume is too high overall, while a negative value indicates that it is too low.
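The three metrics can be sketched in Python as follows. RMSE and NSE use their standard definitions. The BIAS form, the relative error of total simulated volume, is an assumption chosen to match the description above (dimensionless, positive when the simulated volume is too high); the function names are illustrative.

```python
import math

def rmse(obs, sim):
    """Root mean square error, in the units of the data (e.g. m3/s)."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed-mean predictor."""
    mean_obs = sum(obs) / len(obs)
    residual_ss = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance_ss = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual_ss / variance_ss

def bias(obs, sim):
    """Relative error of total simulated volume (assumed definition)."""
    return (sum(sim) - sum(obs)) / sum(obs)

# Made-up flows for illustration only; the simulation overestimates every day by 10%.
observed = [100.0, 200.0, 300.0, 400.0]
simulated = [110.0, 220.0, 330.0, 440.0]
print(rmse(observed, simulated), nse(observed, simulated), bias(observed, simulated))
```

A simulation that always returns the observed mean gives an NSE of 0, which is why values close to 1 are read as a strong improvement over that baseline.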
RESULTS AND DISCUSSION
Results of input dimension selection
Single model prediction results and analysis
Evaluation indicators for the four single models
Models | Evaluation indicators | Training period | Validation period |
---|---|---|---|
LSTM | RMSE (m3/s) | 48.71 | 70.11 |
LSTM | NSE | 0.96 | 0.92 |
LSTM | BIAS | 0.13 | 0.14 |
SVM | RMSE (m3/s) | 72.08 | 72.45 |
SVM | NSE | 0.93 | 0.90 |
SVM | BIAS | −0.05 | −0.04 |
RF | RMSE (m3/s) | 66.60 | 78.47 |
RF | NSE | 0.94 | 0.90 |
RF | BIAS | −0.03 | −0.06 |
ANN | RMSE (m3/s) | 71.56 | 75.22 |
ANN | NSE | 0.93 | 0.90 |
ANN | BIAS | 0.002 | 0.006 |
During the validation period, the LSTM model also performs best in terms of RMSE and NSE, with values of 70.11 m3/s and 0.92, respectively. The corresponding RMSE and NSE values are 72.45 m3/s and 0.90 for SVM, 78.47 m3/s and 0.90 for RF, and 75.22 m3/s and 0.90 for ANN. ANN achieves the highest accuracy in terms of BIAS, with a value of 0.006. The BIAS values for the LSTM, SVM, and RF models are 0.14, −0.04, and −0.06, respectively. Based on the principles of the models, the LSTM model has an advantage in handling long time series data due to its gated structure. It effectively preserves the integrity of rainfall and runoff data, leading to better results in terms of RMSE and NSE during both the training and validation periods. However, the BIAS metric shows slightly lower accuracy for the LSTM model. This could be attributed to factors such as small rainfall and runoff values during the non-flood season, non-stationarity and non-linearity of the runoff sequence, and noise from human activities, which can adversely affect the prediction results and impact the BIAS metric.
In this study, we specifically identify days with daily runoff volumes of 2,000 m³/s and above as high-flow days during the validation period. To evaluate the prediction accuracy of each model on these high-flow days, we introduce the BIAS metric, as shown in Table 2. The results clearly indicate that the LSTM model outperforms the other models in forecasting high-flow days. To further enhance the prediction accuracy of the LSTM model, three signal decomposition methods, namely EMD, VMD, and WA, are introduced in this study. These methods are utilized to preprocess the data for the long runoff series and investigate the prediction accuracy of combined models, namely EMD–LSTM, VMD–LSTM, and WA–LSTM decomposition models.
Daily forecast accuracy of high flows for each model
Metric | LSTM | SVM | RF | ANN |
---|---|---|---|---|
BIAS | −0.0266 | −0.1427 | −0.1852 | −0.1562 |
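A high-flow evaluation of this kind amounts to restricting the metric to days whose observed flow reaches the threshold. The sketch below assumes BIAS is the relative error of total volume and uses made-up flows; the helper name `high_flow_bias` is hypothetical, and only the 2,000 m3/s threshold comes from the text.

```python
def bias(obs, sim):
    """Relative error of total simulated volume (assumed definition of BIAS)."""
    return (sum(sim) - sum(obs)) / sum(obs)

def high_flow_bias(obs, sim, threshold=2000.0):
    """BIAS computed only on days whose observed flow is at or above the threshold."""
    pairs = [(o, s) for o, s in zip(obs, sim) if o >= threshold]
    if not pairs:
        raise ValueError("no days at or above the threshold")
    return bias([o for o, _ in pairs], [s for _, s in pairs])

# Illustrative daily flows in m3/s (not data from the study).
observed = [500.0, 2500.0, 3000.0, 1500.0]
simulated = [600.0, 2400.0, 2900.0, 1400.0]
print(round(high_flow_bias(observed, simulated), 4))
```

Selecting days by the observed flow, not the simulated one, keeps the set of evaluated days identical across models, so their BIAS values are directly comparable. A negative value, as in the table above, means peak volumes are underestimated.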
Decomposition results
Decomposition results of runoff series based on EMD
Decomposition results of runoff series based on VMD
Center frequencies corresponding to different K-values
Number of modes (K) | Center frequency (Hz): mode 1 | Mode 2 | Mode 3 | Mode 4 | Mode 5 | Mode 6 | Mode 7 | Mode 8 |
---|---|---|---|---|---|---|---|---|
1 | 4 | | | | | | | |
2 | 4 | 92 | | | | | | |
3 | 3 | 33 | 102 | | | | | |
4 | 3 | 33 | 101 | 213 | | | | |
5 | 2 | 31 | 91 | 140 | 229 | | | |
6 | 2 | 31 | 88 | 135 | 202 | 253 | | |
7 | 2 | 30 | 86 | 134 | 200 | 249 | 346 | |
8 | 2 | 24 | 56 | 99 | 140 | 203 | 251 | 349 |
WA-based decomposition results of runoff sequences
In this study, the selection of the wavelet basis function was a crucial step in the analysis of the data. Among the commonly used orthogonal wavelet bases, db1, db2, db3, and db4 were considered, and db4 was chosen. The db4 wavelet is known for its balanced nature, exhibiting good local properties and a relatively smooth frequency response, which makes it suitable for a wide range of signal-processing tasks, including signal denoising and compression. It has also been widely employed and validated in previous studies and applications. It was therefore deemed the most appropriate choice for capturing the features and patterns within the runoff and rainfall time series.
Table of accuracy indicators for different decomposition scales
Scale of decomposition | Evaluation indicators | Training period | Validation period |
---|---|---|---|
J = 1 | RMSE (m3/s) | 54.05 | 54.47 |
J = 1 | NSE | 0.96 | 0.95 |
J = 1 | BIAS | −0.17 | −0.19 |
J = 2 | RMSE (m3/s) | 84.22 | 87.38 |
J = 2 | NSE | 0.91 | 0.87 |
J = 2 | BIAS | 0.59 | 0.65 |
J = 3 | RMSE (m3/s) | 75.46 | 77.27 |
J = 3 | NSE | 0.93 | 0.90 |
J = 3 | BIAS | 0.48 | 0.54 |
WA-based decomposition of runoff sequences: (a) J = 1, (b) J = 2, (c) J = 3.
Based on the aforementioned plots, it is evident that for this study, when employing the db4 wavelet basis function, the optimal decomposition scale is 1. The combined model prediction results in the validation period demonstrate that the RMSE for this scale is 54.47 m3/s, which is 37.66 and 29.51% lower than that of scales J = 2 and J = 3, respectively. Furthermore, the NSE for this scale is 0.95, representing a 9.20 and 5.56% improvement compared to scales J = 2 and J = 3, respectively. The absolute value of the BIAS is 0.19, indicating a reduction of 70.77 and 64.81% in comparison to scales J = 2 and J = 3, respectively.
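The percentages quoted above follow from a relative-change calculation on the validation-period values reported for each decomposition scale. A minimal Python check is given here; the helper name `relative_change` and the dictionary layout are illustrative choices, not part of the study.

```python
def relative_change(reference, value):
    """Percentage change of `value` relative to `reference`."""
    return (value - reference) / reference * 100.0

# Validation-period WA-LSTM metrics by decomposition scale J, as reported in the text.
rmse = {1: 54.47, 2: 87.38, 3: 77.27}
nse = {1: 0.95, 2: 0.87, 3: 0.90}
abs_bias = {1: 0.19, 2: 0.65, 3: 0.54}

changes = {}
for j in (2, 3):
    changes[j] = {
        "rmse": relative_change(rmse[j], rmse[1]),          # negative = reduction
        "nse": relative_change(nse[j], nse[1]),             # positive = improvement
        "abs_bias": relative_change(abs_bias[j], abs_bias[1]),
    }
    print(f"J = 1 vs J = {j}: "
          f"RMSE {changes[j]['rmse']:+.2f}%, "
          f"NSE {changes[j]['nse']:+.2f}%, "
          f"|BIAS| {changes[j]['abs_bias']:+.2f}%")
```

Each change is expressed relative to the coarser scale (J = 2 or J = 3), so a negative RMSE or |BIAS| change is a reduction achieved by J = 1.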
Results and analysis of the combined forecast models
Accuracy evaluation indicators for decomposition combination models
Models | Evaluation indicators | Training period | Validation period |
---|---|---|---|
LSTM | RMSE (m3/s) | 48.71 | 70.11 |
LSTM | NSE | 0.96 | 0.92 |
LSTM | BIAS | 0.13 | 0.14 |
EMD–LSTM | RMSE (m3/s) | 48.63 | 76.19 |
EMD–LSTM | NSE | 0.96 | 0.90 |
EMD–LSTM | BIAS | 0.09 | 0.06 |
VMD–LSTM | RMSE (m3/s) | 53.97 | 52.14 |
VMD–LSTM | NSE | 0.96 | 0.96 |
VMD–LSTM | BIAS | −0.002 | −0.002 |
WA–LSTM | RMSE (m3/s) | 54.05 | 54.47 |
WA–LSTM | NSE | 0.96 | 0.95 |
WA–LSTM | BIAS | −0.17 | −0.19 |
Comparison of predicted and measured values in the validation period.
The findings of this study have significant implications for environmental protection and climate change adaptation. By improving runoff prediction through advanced machine learning and signal decomposition, we enhance water resource management and flood forecasting. Accurate predictions are vital for addressing climate change impacts, such as flooding and water scarcity (Sarker 2022). These improved models can assist policymakers and engineers in designing resilient infrastructure. Additionally, our approach aids in monitoring watershed health and protecting ecosystems at risk from climate change. Ultimately, our findings support informed decision-making on land use and water conservation, promoting sustainability and environmental stewardship.
CONCLUSIONS
In this study, four single machine learning models (LSTM, SVM, RF, and ANN) and three combined models (EMD–LSTM, VMD–LSTM, and WA–LSTM) were utilized to investigate the runoff of the Huanren station spanning from 1980 to 2020. The training set comprised data from 1980 to 2016, while the validation set encompassed data from 2017 to 2020. The LSTM model's applicability and accuracy were examined, and signal decomposition methods were employed to preprocess the runoff series in an attempt to enhance forecasting accuracy. The following conclusions were drawn:
(1) Among the four single models, the LSTM model demonstrated superior performance in terms of RMSE and NSE during both the training and validation periods, achieving 48.71 m3/s and 0.96 in the training period and 70.11 m3/s and 0.92 in the validation period. This can be attributed to the LSTM model's gate structure, which overcomes the vanishing and exploding gradient problems of conventional recurrent neural networks (RNNs) and effectively preserves the integrity of rainfall and runoff information in long time series.
(2) To evaluate the prediction accuracy of the four single models specifically on high-flow days, BIAS was introduced as an accuracy evaluation metric. The LSTM model exhibited an absolute BIAS value of 0.0266, which was 81.36, 85.64, and 82.97% lower compared to the SVM, RF, and ANN, respectively. This demonstrates the significant advantage of the LSTM model in forecasting high-flow days.
(3) Among the three combined models (EMD–LSTM, VMD–LSTM, and WA–LSTM), the VMD–LSTM model demonstrated the best overall performance, with RMSE, NSE, and BIAS values of 52.14 m3/s, 0.96, and −0.002, respectively, during the validation period. Notably, the VMD–LSTM model exhibited improved RMSE metrics during the validation period compared to the training period, indicating its ability to avoid overfitting issues and minimize generalization errors in the runoff prediction process.
(4) Comparing the VMD–LSTM model with the single LSTM model, it was observed that all three evaluation metrics displayed improved accuracy during the validation period. The VMD–LSTM model achieved a 98.57% reduction in the absolute value of BIAS compared to the single LSTM model. This highlights the effectiveness of the VMD method in identifying and filtering out noise from the original runoff sequence, thereby enhancing the smoothness and predictability of runoff sequences.
The results indicate that the LSTM model offers distinct advantages over the other three machine learning models in runoff forecasting, and the VMD–LSTM model significantly improves prediction accuracy. However, it should be noted that only the previous day's rainfall and runoff are used as model inputs in this study. Further research could include other input variables, such as humidity, temperature, and a range of climate indices, which represent initial catchment conditions and the climate. Additionally, other methods such as EMD and singular spectrum decomposition can also be incorporated with the LSTM model to develop combined prediction models for further enhancing the accuracy of runoff predictions.
ACKNOWLEDGEMENTS
This study is supported by the National Key Research and Development Program of China (No. 2023YFC3006602) and the National Natural Science Foundation of China (Grant Nos. 52209028 and 51709108).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.