ABSTRACT
Runoff forecasting is crucial for water resources management and demands precise models. This study proposes a runoff forecasting model that combines variational mode decomposition (VMD), a convolutional neural network (CNN), and a long short-term memory network (LSTM) with an attention mechanism (AM) to enhance the accuracy and stability of runoff forecasting. VMD significantly reduced the volatility of the runoff sequence, and the AM focused on extracting the most critical information from the features. The VMD–CNN–AM–LSTM model, using a two-stage decomposition forecasting framework, was used to predict daily runoff at the Jianli (JL) hydrological station, in the section from Yichang to JL, from 1 October 2006 to 30 October 2022. This model outperformed the models without VMD and/or AM in forecasting runoff, with a root mean square error of 646.160, a mean absolute error of 424.124, a mean absolute percentage error of 2.54%, and an R2 of 0.9933. Model stability was assessed using bias-variance decomposition, which showed the model to be significantly more stable than the model without VMD and AM. The combination of VMD and AM optimises runoff forecasting at the target station by utilising upstream stations’ runoff. This improves the accuracy and stability of the model, providing technical support for water resources planning and management.
HIGHLIGHTS
A VMD–CNN–attention mechanism (AM)–long short-term memory network (LSTM) hybrid model is developed that boosts the precision and stability of short-term river runoff forecasts.
The model captures river runoff's spatio-temporal dynamics through VMD and AM integration.
The model's enhanced predictive stability is verified by bias-variance analysis.
INTRODUCTION
Runoff is a crucial factor influencing water supply for domestic, industrial, and agricultural use and plays a pivotal role in the socio-economic development of a region. The ability to regulate natural runoff is closely linked to the development of industrial and agricultural production, shaping human life and economic progress (Huang et al. 2015). Effective short-term runoff forecasting is essential for informed decision-making in water resource management, providing valuable insights for flood and drought mitigation (Xiao et al. 2022). However, short-term runoff forecasting requires estimation based on past runoff values, taking into account time constraints and various influencing factors. This process has the characteristics of uncertainty, conditionality, and multi-programme (Zhang & Yan 2023). Therefore, the construction of a short-term runoff model that considers spatial and temporal characteristics is crucial for accurate model forecasting.
While time series models such as autoregressive (AR), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) models have traditionally been applied in runoff forecasting (Mohammadi et al. 2006; Li et al. 2015; Zurey et al. 2020), their reliance on linear assumptions limits their effectiveness for non-stationary, nonlinear time series data (Fathian 2021). To address this limitation, data-driven models have gained prominence in runoff forecasting. These models include convolutional neural networks (CNNs), long short-term memory networks (LSTMs), decision trees, and support vector machines (Erdal & Karakurt 2013; Huang et al. 2014; Zakizadeh et al. 2020; Li et al. 2021). Each data-driven model has its own characteristics. For instance, CNNs are mainly intended to extract local features from continuous data (Guo et al. 2018), whereas LSTM models are specifically designed to capture temporal dependencies (Yu et al. 2019). The feature extraction capabilities of a single data-driven model for time series are therefore limited. Additionally, the forecasting accuracy of these machine learning methods is constrained by data quality, particularly in the presence of irregular gaps and noise in real-world load data (Deng et al. 2020; Zhou & Kang 2023).
Hybrid methods combining different algorithms, such as the attention mechanism (AM) with LSTM, have been proposed to improve forecasting accuracy (Ding et al. 2020). The AM assigns different weights to different implicit states, thereby amplifying the influence of crucial information and improving the accuracy of load forecasts (Ding et al. 2020). Another way to improve forecasting accuracy is to decompose the sequence prior to prediction. Common decomposition methods include wavelet transform, empirical mode decomposition (EMD), and variational mode decomposition (VMD). Xiao & Wang (2021) utilized an EMD approach for load sequence decomposition and then integrated the obtained eigenmodal components into a hybrid neural network for prediction. In comparison, Li et al. (2021) and Zuo et al. (2020a) employed VMD to decompose signals for runoff forecasting, effectively avoiding mode aliasing, simplifying the model, and enhancing prediction accuracy. However, after performing sequence decomposition, these methods tend to use only the decomposed sequence itself to make predictions and do not take into account the influence of external factors (e.g., runoff from upstream stations) on the forecasted runoff.
Accurately predicting runoff involves capturing both spatial and temporal variations. Traditional models often focus on temporal aspects, neglecting spatial dependencies, which are crucial in regions with complex hydrological dynamics. This study addresses this gap by incorporating upstream runoff stations as key spatio-temporal inputs into the VMD–CNN–AM–LSTM hybrid model. The inclusion of these upstream stations allows the model to better account for spatial interactions in runoff processes, thereby improving the accuracy and stability of short-term forecasts. This study focuses specifically on the middle reaches of the Yangtze River, spanning from Yichang (YC) to Jianli (JL). The application of this hybrid model to runoff forecasting in the JL hydrological station demonstrates its potential in providing valuable technical support for water resources management planning in the region.
METHODOLOGY
VMD–CNN–AM–LSTM runoff forecasting model framework
The runoff forecasting model is broken down into four steps:
Step 1: Dataset division. The original time series is divided into a training set and a validation set at a ratio of 9:1. A higher training ratio helps capture complex temporal dependencies, especially when dealing with long-term data sequences (Joseph 2022). The 9:1 split was chosen to maximize the data available for training, as runoff forecasting can be highly sensitive to the amount of training data.
Step 2: Decomposition. The training set of the target station is pre-decomposed in order to optimise the parameters of the signal decomposition algorithm. The validation set Vi is then appended to the training set to form the additional set Ai, which is decomposed using VMD. Based on the decomposition results of the additional set, additional sample sets are generated, and the last sample in each additional sample set is taken as the validation sample.
Step 3: Forecasting using the CNN–AM–LSTM model. Upstream runoff is introduced as an influencing factor. The decomposed modes are forecasted separately, and their results are linearly combined to obtain the final prediction of runoff at the target station. The input data are normalised using z-score normalisation.
Step 4: Evaluating the forecast result of the model. Several metrics have been chosen to thoroughly evaluate the performance, stability, and accuracy of the VMD–CNN–AM–LSTM model and other models.
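The data flow of Steps 1 to 3 can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: validation points are appended one at a time before re-decomposition, `toy_decompose` is a hypothetical stand-in for VMD (a trailing moving average plus a residual), the lag window of 3 is illustrative, and the linear combination of mode forecasts is taken to be a plain sum.

```python
from statistics import mean, pstdev


def chronological_split(series, train_ratio=0.9):
    """Step 1: split in time order (no shuffling) into training and validation parts."""
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def zscore_params(values):
    """Mean and standard deviation, estimated on the training data only."""
    return mean(values), pstdev(values)


def zscore(values, mu, sigma):
    """Apply z-score normalisation with previously fitted parameters."""
    return [(v - mu) / sigma for v in values]


def toy_decompose(series, window=3):
    """Hypothetical placeholder for VMD: a trailing-average mode plus a residual mode."""
    trend = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        trend.append(sum(chunk) / len(chunk))
    residual = [s - t for s, t in zip(series, trend)]
    return [trend, residual]


def validation_samples(train, val, decompose, lags=3):
    """Step 2: append validation points one by one, re-decompose, keep the last sample."""
    samples = []
    for i in range(len(val)):
        appended = train + val[: i + 1]  # additional set A_i
        modes = decompose(appended)
        # Per mode: the `lags` values before the last point are inputs, the last is the target.
        samples.append([(m[-lags - 1:-1], m[-1]) for m in modes])
    return samples


def combine(mode_forecasts):
    """Step 3: combine per-mode forecasts (assumed here to be a plain sum)."""
    return sum(mode_forecasts)


runoff = [float(v) for v in range(100, 120)]  # 20 made-up daily values
train, val = chronological_split(runoff)
mu, sigma = zscore_params(train)
train_z = zscore(train, mu, sigma)
samples = validation_samples(train, val, toy_decompose)
```

Re-decomposing the appended series for every validation point keeps future values out of the decomposition that produces each validation sample, which is the purpose of a two-stage framework.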
Variational mode decomposition
Variational mode decomposition (VMD) is an adaptive and non-recursive signal decomposition method (Dragomiretskiy & Zosso 2014). It is based on the concepts of Wiener filtering, the Hilbert transform, and frequency mixing. The method decomposes the signal into K sub-signals of small relative amplitude in different frequency bands. The core idea of the algorithm is to construct a variational problem and solve for its optimum. This effectively separates low- and high-frequency signals, reducing the complexity, non-linearity, and non-stationarity of the time series. The specific steps are as follows:
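- (1) Establish the constrained variational problem. To ensure that the estimated bandwidth of each mode is minimal and that the sum of all modes equals the original signal, the original signal f(t) is assumed to be decomposed into K mode components with different frequency characteristics, which gives the constrained variational expression

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where $u_k(t)$ is the kth mode component of the signal decomposition; $\omega_k$ is the frequency centre of the kth mode component; $\delta(t)$ indicates the Dirac distribution; and $*$ is the convolution operator.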
The decomposition effect of VMD is influenced by the number of modes K, the penalty factor α, the noise tolerance τ, and the convergence tolerance ε. If K is too small, the extraction of intrinsic mode functions (IMFs) from the original signal may be ineffective, while larger values of K may result in redundant IMF information. Smaller values of α may lead to a larger bandwidth, redundancy of information expression, and increased additional noise. Conversely, larger values of α may lead to a smaller bandwidth and loss of effective information. Equation (3) shows that the Lagrange multiplier λ ensures optimal convergence of VMD with appropriate values of τ > 0 for low-noise signals, whereas for high-noise signals λ prevents VMD from converging when τ > 0. To avoid this shortcoming, τ can be set to zero, but this results in some error when reconstructing the decomposed signal by summation. The value of ε affects the reconstruction error of VMD.
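For reference, the roles of α, τ, and ε can be read off the standard VMD iteration of Dragomiretskiy & Zosso (2014), written here in the frequency domain. This is a sketch of the textbook formulation and may differ in notation and equation numbering from the equations of this paper; hats denote Fourier transforms, n is the iteration index, λ is the Lagrange multiplier, and the sum over i ≠ k uses the most recently updated modes.

```latex
\begin{aligned}
\hat{u}_k^{n+1}(\omega) &= \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}{1 + 2\alpha\,(\omega - \omega_k^{n})^2}, \\
\omega_k^{n+1} &= \frac{\int_0^{\infty} \omega\,\lvert \hat{u}_k^{n+1}(\omega) \rvert^2 \, d\omega}{\int_0^{\infty} \lvert \hat{u}_k^{n+1}(\omega) \rvert^2 \, d\omega}, \\
\hat{\lambda}^{n+1}(\omega) &= \hat{\lambda}^{n}(\omega) + \tau \Big( \hat{f}(\omega) - \sum_{k} \hat{u}_k^{n+1}(\omega) \Big), \\
\text{stop when} \quad & \sum_{k} \frac{\lVert \hat{u}_k^{n+1} - \hat{u}_k^{n} \rVert_2^2}{\lVert \hat{u}_k^{n} \rVert_2^2} < \varepsilon .
\end{aligned}
```

In this form α sets the width of the filter around each centre frequency, τ is the step size of the multiplier update (with τ = 0 the multiplier is never updated, so exact reconstruction is not enforced), and ε is the stopping threshold.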
Attention mechanism
In predicting runoff, input features vary in their degree of influence on the results. The AM calculates the probability distributions of different input features, focusing on the important ones and disregarding irrelevant information, thereby enhancing the model's ability to utilize key features (Bahdanau et al. 2014). This study employs a soft AM to distribute weight values of attention between 0 and 1. A two-dimensional matrix with m rows and n columns is formed for m input features and n time nodes.
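A minimal sketch of soft attention over the m × n input matrix is given below. It assumes one relevance score per input feature, normalised with a softmax; in a trained network these scores would come from a learned layer, whereas here they are supplied by hand, and the function names are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax: outputs lie in (0, 1) and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(features, scores):
    """Weight an m x n matrix (m input features, n time nodes) row by row.

    `scores` holds one unnormalised relevance score per feature (assumed given).
    """
    weights = softmax(scores)
    weighted = [[w * x for x in row] for w, row in zip(weights, features)]
    return weights, weighted


# Two made-up features observed at three time nodes.
features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
weights, weighted = soft_attention(features, scores=[2.0, 0.0])
```

The feature with the higher score receives the larger weight, and no feature is discarded outright, which is what distinguishes soft attention from a hard selection.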
CNN–AM–LSTM model
The VMD–CNN–AM–LSTM model not only decomposes runoff time series data using VMD to handle non-stationary and nonlinear characteristics but also integrates upstream runoff data to capture spatial dependencies. The CNN component further enhances this by extracting localized features from the input data. The AM then assigns weights to critical features, ensuring that both spatial (runoff from upstream stations) and temporal (lagged runoff values) dependencies are effectively captured in the prediction process.
STUDY AREA AND DATA
The hydrological dynamics in this segment are complex, with numerous inflow and outflow processes in the mainstream. This complexity poses challenges for traditional runoff forecasting methods.
There are four hydrological stations along the YC–JL section: YC, Zhicheng (ZC), Shashi (SS), and JL. The analysis covers the period from 1 October 2006 to 30 October 2022, which coincides with the completion of the Three Gorges Project. The total number of data points is 5,145. To ensure accuracy, we compared the flow series and found similar hydrological conditions from YC to JL. Therefore, the measured runoff at each station did not need to be adjusted before processing.
Our previous studies have indicated that the travel time for runoff from YC to JL is approximately 3 days (Zhou & Kang 2023). To enhance the accuracy of our runoff forecasting model, we therefore consider the runoff of the preceding 3 days at the three upstream stations (YC, ZC, and SS) when predicting runoff at the JL station. This is because the characteristics of upstream runoff can affect the magnitude and timing of the daily runoff curve at the downstream station, which makes the data more uncertain.
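The construction of the upstream inputs can be sketched as follows. This is an illustration under assumptions: the function name, the dictionary layout, and the runoff values are hypothetical, and the decomposed modes of the JL series itself, which the model also uses, are omitted for brevity.

```python
def build_samples(upstream, target, lags=3):
    """Pair the previous `lags` days at each upstream station with the target-day runoff.

    upstream: dict mapping station name -> list of daily runoff values
    target:   list of daily runoff values at the downstream station
    """
    samples = []
    for t in range(lags, len(target)):
        x = {name: series[t - lags:t] for name, series in upstream.items()}
        samples.append((x, target[t]))
    return samples


# Made-up daily runoff values for five consecutive days.
upstream = {
    "YC": [10.0, 11.0, 12.0, 13.0, 14.0],
    "ZC": [20.0, 21.0, 22.0, 23.0, 24.0],
    "SS": [30.0, 31.0, 32.0, 33.0, 34.0],
}
jl = [40.0, 41.0, 42.0, 43.0, 44.0]
samples = build_samples(upstream, jl)
```

With a 3-day lag, the first usable target is day 4, so a series of N days yields N − 3 samples.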
RESULTS
VMD decomposition
The VMD is parameterised using the training set. In general, selecting α = 2,000, τ = 0, and ε = 1 × 10⁻⁷ can provide optimal denoising and effective separation of intrinsic mode functions (IMFs). The only parameter that requires adjustment is the number of mode components, denoted as K. The primary distinction between different modes lies in their centre frequencies. Thus, the appropriate K-value is determined by analyzing the centre frequency distribution across various mode numbers. Table 1 displays the centre frequencies at different mode numbers. Notably, the first three centre frequencies for K = 4 (0.000230, 0.014072, and 0.039768 Hz) closely resemble those for K = 3 (0.000243, 0.016477, and 0.049812 Hz).
Table 1. Centre frequencies (Hz) of each mode for different mode numbers K
K | Mode 1 | Mode 2 | Mode 3 | Mode 4 | Mode 5 | Mode 6 |
---|---|---|---|---|---|---|
2 | 0.000276 | 0.022084 | | | | |
3 | 0.000243 | 0.016477 | 0.049812 | | | |
4 | 0.000230 | 0.014072 | 0.039768 | 0.072638 | | |
5 | 0.000132 | 0.00561 | 0.021272 | 0.0479908 | 0.087900 | |
6 | 0.0000710 | 0.004108625 | 0.019258536 | 0.043177531 | 0.072183613 | 0.12828097 |
To further determine the value of K, the correlation between adjacent mode components was analysed for different numbers of modes. Table 2 displays the correlation coefficients between adjacent modes. When K does not exceed 3, the correlation coefficients between adjacent modes decrease sequentially, which indicates that the mode decomposition is normal; once K reaches 4, the correlation coefficient between adjacent modes first decreases and then increases, which indicates that the mode components begin to overlap. Therefore, for the VMD decomposition in this study, the value of K was set to 3.
Table 2. Correlation coefficients of adjacent modes (Cij denotes the correlation between modes i and j)
K | C12 | C23 | C34 | C45 | C56 |
---|---|---|---|---|---|
2 | 0.13216 | | | | |
3 | 0.175878 | 0.111295 | | | |
4 | 0.231503 | 0.138274 | 0.151127 | | |
5 | 0.278946 | 0.052338 | 0.105346 | 0.129991 | |
6 | 0.242677 | 0.051622 | 0.116159 | 0.14685 | 0.113728 |
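The rule used to choose K from the adjacent-mode correlations can be sketched in a few lines. The helper names are illustrative, and the stopping rule (take the largest K before the coefficients stop decreasing) is one reading of the criterion described in the text; the coefficients are those of Table 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacent_correlations(modes):
    """Correlation between each pair of neighbouring modes (C12, C23, ...)."""
    return [pearson(a, b) for a, b in zip(modes, modes[1:])]


def is_decreasing(values):
    """True if every coefficient is smaller than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def select_k(adjacent_corr):
    """Largest K reached before the adjacent-mode correlations stop decreasing."""
    best = None
    for k in sorted(adjacent_corr):
        if not is_decreasing(adjacent_corr[k]):
            break
        best = k
    return best


# Correlation coefficients of adjacent modes, copied from Table 2.
table2 = {
    2: [0.13216],
    3: [0.175878, 0.111295],
    4: [0.231503, 0.138274, 0.151127],
    5: [0.278946, 0.052338, 0.105346, 0.129991],
    6: [0.242677, 0.051622, 0.116159, 0.14685, 0.113728],
}
chosen_k = select_k(table2)
```

Applied to Table 2, the rule stops at K = 4, where C34 exceeds C23, and returns K = 3.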
Effectiveness testing of AM
Table 3. Forecasting performance of the different models
Model | RMSE | MAE | R2 | MAPE (%) |
---|---|---|---|---|
CNN–AM–LSTM | 895.6738 | 518.2583 | 0.9884 | 2.87 |
CNN–LSTM | 922.8554 | 571.1692 | 0.9886 | 3.13 |
LSTM | 852.9706 | 565.6301 | 0.9876 | 3.60 |
CNN | 842.4297 | 557.9231 | 0.9890 | 3.56 |
VMD–CNN–AM–LSTM | 646.1602 | 424.1244 | 0.9933 | 2.54 |
VMD–CNN–LSTM | 790.7234 | 490.6196 | 0.9896 | 2.93 |
VMD–LSTM | 744.045 | 520.9984 | 0.9912 | 3.29 |
VMD–CNN | 808.1079 | 483.4139 | 0.9913 | 2.68 |
Figure 6. Prediction results of the VMD–CNN–AM–LSTM, VMD–CNN–LSTM, VMD–LSTM, and VMD–CNN models.
The forecasting results of the CNN and LSTM models are slightly superior to those of the CNN–LSTM model. This could be attributable to the fact that the CNN is primarily employed to extract the local features of the sequence data, while the LSTM is used to capture the time dependence. Information may be lost when the output of the CNN is used as input to the LSTM: because the CNN downsamples and extracts features from the input sequence, detailed information may be lost if there is excessive noise, and this detail may be crucial for the temporal modelling of the LSTM. Therefore, reducing the impact of noise can improve the model's simulation accuracy.
Effectiveness testing of VMD
To validate the efficacy of VMD decomposition in enhancing the accuracy of simulation, four models were constructed: VMD–CNN–AM–LSTM, VMD–CNN–LSTM, VMD–LSTM, and VMD–CNN. Figure 6 illustrates the forecasting results. It can be seen that all four models predicted the runoff trend more accurately and that the data accuracy was enhanced by decomposition in comparison to Figure 5.
Table 3 lists the overall predictive performance of each model. The assessment criteria indicate that the VMD–CNN–AM–LSTM model exhibits the highest accuracy, with all indices showing considerable improvement and a MAPE of 2.54%. This value is 0.33 percentage points lower than that of the CNN–AM–LSTM model, which does not use VMD decomposition (2.87%). The results indicate that the hybrid of VMD and AM effectively enhances the prediction accuracy in the peak and valley areas of the daily runoff curve in comparison to a hybrid without VMD. This is attributed to VMD, which reduces the complexity of the runoff sequence and enhances the prediction model's ability to capture stochastic variations. Furthermore, the AM makes effective use of the characteristics of the decomposed sequence, which highlights the significance of the hybrid of VMD and AM in handling the complexity of the influencing factors.
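For reference, the four accuracy indices in Table 3 can be computed as in the sketch below. It assumes the usual textbook definitions, with R2 taken as the coefficient of determination (the study may instead use the squared correlation coefficient), and the observed and predicted values are made up for illustration.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error in percent; assumes no zero observations."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Coefficient of determination (an assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted runoff values.
obs = [100.0, 200.0, 300.0, 400.0]
pred = [110.0, 190.0, 310.0, 390.0]
scores = {
    "RMSE": rmse(obs, pred),
    "MAE": mae(obs, pred),
    "MAPE": mape(obs, pred),
    "R2": r2(obs, pred),
}
```

RMSE and MAE carry the units of runoff, whereas MAPE and R2 are dimensionless, which is why the MAPE differences in Table 3 are best read in percentage points.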
Forecasting stability
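The stability assessment relies on a bias-variance (BV) decomposition of the forecast error. A minimal sketch of such a decomposition is given below, under the assumption that the model is trained several times and each run produces a forecast series aligned with the observations; the study's exact procedure may differ, and the function name and numbers are illustrative only.

```python
def bias_variance(runs, obs):
    """Decompose the mean squared error over repeated runs into bias^2 and variance.

    runs: list of forecast series, one per training run, each aligned with `obs`
    obs:  observed values
    """
    n_runs = len(runs)
    n = len(obs)
    # Average forecast across runs at each time step.
    mean_pred = [sum(run[t] for run in runs) / n_runs for t in range(n)]
    # Squared bias: how far the average forecast is from the observations.
    bias_sq = sum((mean_pred[t] - obs[t]) ** 2 for t in range(n)) / n
    # Variance: how much individual runs scatter around the average forecast.
    variance = sum(
        sum((run[t] - mean_pred[t]) ** 2 for run in runs) / n_runs
        for t in range(n)
    ) / n
    return bias_sq, variance


# Two made-up runs against two observations.
obs = [10.0, 20.0]
runs = [[11.0, 21.0], [13.0, 23.0]]
bias_sq, variance = bias_variance(runs, obs)
mean_mse = sum(
    sum((run[t] - obs[t]) ** 2 for t in range(len(obs))) / len(obs) for run in runs
) / len(runs)
```

The two terms add up to the mean squared error averaged over runs, so a model with lower variance at comparable bias is the more stable one in this sense.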
DISCUSSION
Overall performance
Short-term runoff forecasting models play an important role in flood control and water supply planning. This study introduces VMD and AM to the field of short-term runoff forecasting and demonstrates the potential of the VMD–CNN–AM–LSTM model, especially in accurately predicting peak runoff areas. Our proposed VMD–CNN–AM–LSTM model has shown promising results in runoff forecasting, outperforming other models in terms of RMSE, MAE, R2, and MAPE. This suggests that the integration of VMD and AM has contributed to the improved accuracy and stability of the model.
Comparison with previous studies
Previous studies on runoff forecasting have employed various hybrid models, including VMD–LSTM–gradient boosting decision tree (GBDT) (Sun et al. 2022) and ensemble empirical mode decomposition (EEMD)–LSTM (Zuo et al. 2020b). For instance, Sun et al. applied the VMD–LSTM–GBDT model, achieving an R2 of 0.989, while Zuo et al. reported a performance of 0.9366 using the EEMD–LSTM–GBDT model. In comparison, our VMD–CNN–AM–LSTM model achieved an R2 of 0.9933, demonstrating superior predictive accuracy. This improvement is attributed to the integration of the AM, which enhances feature extraction and the model's ability to focus on crucial temporal dependencies in the runoff data.
The hybrid models used in previous studies, such as EMD–LSTM or wavelet-based models (Li et al. 2021; Xiao & Wang 2021), focused on sequence decomposition but did not incorporate spatial dependencies or attention mechanisms. In contrast, our VMD–CNN–AM–LSTM model benefits from the CNN's ability to capture local spatial features, while the AM dynamically assigns weights to significant inputs, enhancing temporal feature selection. This hybrid integration allows our model to reduce noise more effectively and capture complex patterns, outperforming models that rely solely on temporal decomposition (e.g., VMD or EMD).
While most previous studies have focused primarily on temporal aspects of runoff forecasting, our model integrates both spatial and temporal dimensions by incorporating runoff data from upstream stations. This approach allows the VMD–CNN–AM–LSTM model to better capture the spatio-temporal complexities of runoff, as evidenced by the model's superior performance across all evaluation metrics. For example, the BV decomposition analysis shows that our model offers greater stability compared to models like VMD–LSTM–GBDT (Sun et al. 2022), making it particularly suitable for regions with complex hydrological dynamics.
Our study advances the field of runoff forecasting by introducing a hybrid model that combines VMD, CNN, AM, and LSTM. This comprehensive approach addresses both temporal and spatial complexities, offering improved accuracy and stability. The integration of VMD and AM, in particular, has proven to be a significant advancement, enabling more reliable forecasting, especially in peak runoff periods, as compared to earlier models that focus solely on temporal decomposition or lack attention mechanisms.
Influence of sub-models
It is important to note that increasing the complexity of a model does not always lead to improved performance. The selection of model complexity should be guided by the characteristics of the given dataset (Ng et al. 2023). An analysis of the effectiveness of AM revealed that the CNN–AM–LSTM model did not exhibit superior accuracy when compared to the CNN–LSTM model. Incorporating AM does not necessarily guarantee improved accuracy in runoff prediction. The weight assignment of the AM may introduce additional errors, particularly in regions with complex hydrological processes, increased uncertainty during flood seasons, and significant time series noise, where it may struggle to accurately simulate peak conditions.
When comparing the CNN–LSTM model to the CNN and LSTM models, it was found to be more stable, although its simulation accuracy was slightly lower than that of the individual CNN and LSTM models. This discrepancy can be attributed to the fact that CNN is primarily designed to extract localized features from sequential data (Guo et al. 2018), while LSTM is specifically designed to capture temporal dependencies (Yu et al. 2019). The use of the output of the CNN as the input for the LSTM may result in the loss of critical information. The CNN downsamples the input sequences and extracts features, which may omit detailed information necessary for the LSTM's temporal modelling. Consequently, reducing noise can effectively improve the accuracy of our model's simulations.
In this study, we integrated upstream hydrological data in addition to AM and VMD, thereby enhancing the physical reliability of runoff predictions for the target stations. The incorporation of historical station data and upstream hydrological data has facilitated a deeper understanding of the correlations between runoff and temporal rate changes across different time frequencies, as noted by Song et al. (2017). Furthermore, the inclusion of upstream hydrological data in our predictions enhances operational feasibility, as highlighted by Chen et al. (2020). Ahmed et al. (2022) conducted research on the use of meteorological data, such as rainfall and sunlight, to predict runoff decomposition data in rivers, which yielded positive results. However, short-term river forecasting can be challenging due to the collection and processing of meteorological data. The correlations between meteorological elements and runoff may vary across different basins, which can significantly impact model accuracy (Nguyen-Huy et al. 2017; Ahmed et al. 2021). Therefore, our study focuses on the use of upstream runoff data for predictions, which offers significant advantages in terms of data processing and collection.
Although this study does not include a full quantitative sensitivity analysis, it is clear from the model's design that certain input factors significantly influence its performance. The upstream runoff data, particularly from the YC, ZC, and SS stations, is crucial for the model's accuracy, as these stations capture spatial dependencies that affect the runoff at the downstream JL station. The timing and magnitude of upstream runoff during flood periods play a decisive role in determining prediction accuracy.
Additionally, the VMD-decomposed components help mitigate noise in the runoff data, improving the model's stability. Variations in the quality of these components could potentially affect the model's predictive ability, particularly during periods of low flow when noise is more prevalent. The AM incorporated in the model further enhances performance by assigning appropriate weights to the most relevant features, ensuring that the model remains resilient to fluctuations in input quality.
Limitations and prospects
While our model shows promising results, it is important to acknowledge its limitations. The accuracy of predictions may be influenced by uncertainties in input data, and the model's performance may vary under different hydrological conditions. Additionally, the selection of K in VMD introduces some subjectivity, and further research could explore robust methods for determining this parameter. During the analysis of the runoff data, we observed significant autocorrelation, especially during high-flow periods. This autocorrelation suggests that past runoff values have a strong influence on future values, which the VMD–CNN–AM–LSTM model captures effectively in short-term predictions. However, our focus remains on immediate, short-term forecasting for operational water resource management. For future studies, exploring longer lead times and the impact of autocorrelation on extended predictions could provide additional insights into the temporal dynamics of runoff. To further improve runoff forecasting models, future research could focus on incorporating additional environmental factors, exploring the impact of climate change, and refining the AM. Furthermore, an investigation into the transferability of the model to different regions and the integration of real-time data could contribute to its practical application.
CONCLUSION
This study introduces a hybrid VMD–CNN–AM–LSTM model that incorporates runoff sequences from upstream hydrological stations for short-term runoff prediction. The model's simulation accuracy and stability are assessed. The key findings are as follows:
(1) VMD preprocessing reduces data randomness and non-stationarity, enhancing predictive accuracy. Leveraging the CNN for feature extraction, coupled with the AM emphasizing key features, not only captures vital information efficiently but also improves training efficiency and reduces training time.
(2) The incorporation of upstream station runoff features enables the model to accommodate runoff fluctuations due to temporal changes, thereby enhancing prediction accuracy across various runoff periods.
(3) The evaluation of the model using RMSE, MAE, R2, MAPE, and BV demonstrates that it outperforms conventional methods and hybrid models in terms of prediction accuracy and stability.
This model, which incorporates VMD, CNN, AM, and LSTM, is designed to capture the spatial and temporal characteristics of runoff, thereby enhancing the accuracy and stability of forecasting. The application of this hybrid model to runoff forecasting at the JL hydrological station provides evidence of its potential to provide valuable technical support for water resources management planning in the region.
CONSENT TO PARTICIPATE
No human participant is involved in this study.
CONSENT FOR PUBLICATION
All authors have read and agreed to publish the manuscript in this version.
AUTHOR CONTRIBUTIONS
H.C. and L.K. conceptualized the study; L.K. acquired the funding; H.C. and W.Z. developed the methodology; Y.W., J.Y., and R.Q. prepared the visualizations; H.C. wrote the original draft; L.K. and L.Z. reviewed and edited the manuscript.
FUNDING
This work was supported by the National Key Research and Development Program of China (Grant No. 2022YFC3002704) and China Yangzi Power Co., Ltd (Z242302051).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.