ABSTRACT
The imperative for a reliable and accurate flood forecasting procedure stems from the hazardous nature of the disaster. In response, researchers are increasingly turning to innovative approaches, particularly machine learning models, which offer enhanced accuracy compared to traditional methods. However, a notable gap exists in the literature concerning studies focused on the South Asian tropical region, which possesses distinct climate characteristics. This study investigates the applicability and behavior of long short-term memory (LSTM) and transformer models in flood simulation in the Mahaweli catchment in Sri Lanka, which is mostly affected by the Northeast Monsoon. The importance of different input variables to the prediction was also a key focus of this study. Input features for the models included observed rainfall data collected from three nearby rain gauges, as well as historical water level data from the target river gauge. Results showed that, for both architectures, past water level data had a greater impact on the output than the other input features, such as rainfall. All models performed satisfactorily in simulating daily water levels, with Nash–Sutcliffe Efficiency (NSE) values greater than 0.77, while the transformer encoder model outperformed the encoder–decoder models.
HIGHLIGHTS
Past water level data had the highest impact on the output among all the input features.
Using multiple input features that correlate strongly with the target, together with past observations of the target itself, is recommended.
Switching the inputs between the encoder and the decoder and comparing the resulting accuracies is recommended when using transformer encoder–decoder models.
INTRODUCTION
Natural catastrophes, such as hurricanes, earthquakes, and floods, lead to significant economic, ecological, and social damages and casualties. Among them, floods stand out as a phenomenon with severe effects, impacting about 109 million people throughout the world between 1995 and 2015 (Hirabayashi et al. 2013; Alfieri et al. 2017). Floods accounted for about 43% of all disaster events and 55% of affected people, with lost assets totaling over 636 billion USD (Serinaldi et al. 2018), encouraging researchers worldwide to work on mitigating this disaster (Hallegatte et al. 2017). In South Asian tropical regions, characterized by a rapid seasonal reversal of wind direction accompanied by intense precipitation and the resultant wet summers and dry winters (Xie & Saiki 1999), large-scale floods and droughts can be expected (Parthasarathy & Mooley 1978). One viable option in flood hazard management is a practical and effective flood warning system (Boulange et al. 2021). Early prediction of floods facilitates timely management of hydro-junction operations and fast evacuation of individuals from flood-affected regions, leading to a reduction in socioeconomic losses (Zhang et al. 2022).
A significant challenge in advancing flood forecasting technology is the limited availability of field data. Flood prediction approaches can usually be divided into two categories: physically based models and data-driven models. Physical models (Pierini et al. 2014; Mourato et al. 2021) often require a substantial amount of both hydrological and geomorphological data for calibration and validation, and such data might not always be readily accessible. Furthermore, the model parameters must be carefully tested and evaluated, because they are regionally dependent and can be challenging to estimate.
To overcome these limitations, machine learning-based data-driven models have gained popularity in flood forecasting because of their ability to capture complex nonlinear patterns, cope effectively with limited data (Rahmati & Pourghasemi 2017), and capture spatial information from images (Lee et al. 1990). These models can be implemented solely on available rainfall data and measured discharge data, without the need for detailed catchment characteristics. The artificial neural network (ANN) is a common algorithm for flood simulation because it has outperformed traditional methods on many occasions (Elsafi 2014; Chu et al. 2020; Tamiru & Dinka 2021). Subsequently, recurrent neural networks (RNNs) were introduced for time series forecasting tasks, with the ability to capture essential information from long sequences of data. Long short-term memory (LSTM), a special type of RNN, has gained significant popularity and widespread adoption in hydrologic prediction tasks (Xiang et al. 2020; Fang et al. 2021; Zou et al. 2023; Dtissibe et al. 2024).
To address some limitations of these traditional neural network algorithms, such as low computational speed and difficulty in capturing long-term dependencies, Google introduced a new architecture called the transformer (Vaswani et al. 2017), which is based on an attention mechanism (Bahdanau et al. 2014). Although originally designed for natural language processing (NLP), the transformer model has demonstrated its effectiveness in handling other types of time series data (Wu et al. 2020; Farsani & Pazouki 2021).
In the realm of flood forecasting, there is a scarcity of studies that incorporate the transformer architecture, representing a notable gap in the literature. Moreover, existing research indicates that accuracy comparisons between models, including transformers, vary across regions. For example, Xu et al. (2023) introduced the transfer learning (TL)-transformer framework to enhance flood prediction accuracy in basins with limited data by utilizing models trained on data-rich basins. The research was centered on the middle reaches of the Yellow River. The findings showed that the TL-transformer outperformed other models, such as TOPMODEL, multilayer perceptron (MLP), TL-MLP, LSTM, TL-LSTM, and transformer, at all the target basin stations. However, Wei et al. (2023) reached a different conclusion. They evaluated the performance of transformer (TSF), LSTM, and Gated Recurrent Unit (GRU) models for runoff prediction in the Yangtze River Basin in China. The results showed that the GRU outperformed the other models with fewer parameters, while the TSF faced challenges due to data limitations. Therefore, whether the transformer model outperforms the traditional LSTM model in forecasting flood events remains an open question. This provides the motivation and contribution of the present research.
We examined the 1-day-ahead flood forecasting capabilities of transformer encoder, transformer encoder–decoder, and LSTM models. The lack of studies on deep learning-based flood simulation in the South Asian Tropical Zone further motivated us to conduct this study. This paper builds on our earlier study (Madhushanka et al. 2024), which focused on the lower reach of the Mahaweli catchment in Sri Lanka. Here, in addition to assessing the forecasting of daily water levels, the effects of different input features on the output are thoroughly investigated.
METHODOLOGY
Study area
South Asian tropical climate
South Asia, encompassing Afghanistan, Pakistan, India, Nepal, Bhutan, Bangladesh, Maldives, and Sri Lanka, stands as the world's most populous and agriculture-dependent region. The climate in the South Asian region is characterized by the South Asian Monsoon, a significant seasonal phenomenon marked by dramatic shifts in winds that bring vital rainfall to the area. Unlike regions with consistent precipitation year-round, South Asia experiences distinct dry and wet seasons. The summer monsoon, occurring from June to September, carries moisture-laden winds from the Indian Ocean, resulting in heavy rainfall. Conversely, the winter monsoon, from December to February, brings dry continental winds from the north (Xie & Saiki 1999). Additionally, there are two inter-monsoon seasons: the first inter-monsoon from March to May and the second inter-monsoon from October to November (Wickramagamage 2016).
South Asian countries are particularly susceptible to temperature and precipitation extremes, including floods and droughts, due to the effects of global warming (Naveendrakumar et al. 2019). The frequency of intense precipitation events with the potential for extreme outcomes is projected to rise across various regions in South Asian countries (Christensen et al. 2007) while Central Asia is expected to have less rainfall compared to the past (Donat et al. 2016). Given these factors, there is an urgent need to establish a reliable flood forecasting procedure to prevent future catastrophes.
Mahaweli catchment
The Mahaweli River, the longest river in Sri Lanka, stretches 335 km in length, originating from the central hills of the country as a collection of numerous small creeks. It traverses through the central region of Sri Lanka before reaching its terminus at the southwestern side of Trincomalee Bay, where it merges with the Bay of Bengal. The Mahaweli River Basin (MRB) is the largest river basin in Sri Lanka, covering an area of approximately 10,448 km², which represents about 16% of the country's total land area (Diyabalanage et al. 2016). The runoff from the Mahaweli River contributes to one-seventh of the total runoff of all rivers in Sri Lanka, with an average annual streamflow of 8.8 × 10⁹ m³ (De 1997).
The distribution of rainfall within the MRB is uneven both spatially and temporally due to its topographical features. The MRB can be divided into two main parts based on topography: the Upper Mahaweli Basin (UMB) and the Lower Mahaweli Basin (LMB) (Hewawasam 2010). The UMB, situated in the western part of the central highlands, experiences a total annual precipitation of around 6,000 mm (Zubair 2003). Conversely, most parts of the LMB are classified as dry regions, such as the North Central and Eastern provinces, with mean annual precipitation ranging from about 1,600 to 1,900 mm. The precipitation in the UMB is primarily influenced by the southwest monsoon, while the precipitation in the LMB is affected by the northeast monsoon (NEM), owing to the intricate terrain and monsoon patterns in Sri Lanka (Shelton & Lin 2019). We selected the LMB as our study area because the climatic data of the region align well with the climatic characteristics of the South Asian Tropical zone. This choice ensures that the case study accurately reflects the conditions typically observed in this region, enhancing the relevance and applicability of our research findings.
The dataset
Table 1 | Descriptive statistics of the daily rainfall and water level data

| Statistic | Aralaganwila rainfall (mm) | Angamedilla rainfall (mm) | Polonnaruwa Agri rainfall (mm) | Manampitiya water level (m) |
| --- | --- | --- | --- | --- |
| Count | 10,319 | 10,319 | 10,319 | 10,319 |
| Mean | 4.94584 | 4.820227 | 4.20901 | 33.382384 |
| Std | 15.08369 | 15.45192 | 13.48675 | 0.643105 |
| Min | 0 | 0 | 0 | 32.196 |
| Max | 225.8 | 222 | 184 | 37.254333 |
Based on the boxplots presented in Figure 2, we can discern the influence of the NEM on the region. Upon closer examination, it becomes evident that while the maximum precipitation occurs from December to February, there are also instances of extreme rainfall events throughout the year. Although some of these events are classified as outliers, it is not advisable to use preprocessing techniques such as the interquartile range (IQR) (Granata et al. 2022; Luppichini et al. 2022) to remove them, as some of these points might be attributed to unexpected extreme weather conditions. Such occurrences, including sudden storms with heavy rainfall, are common in Sri Lanka due to the influences of the Bay of Bengal. Furthermore, consistent patterns observed in monthly precipitation and water level data indicate a strong correlation between them. This correlation underscores the interdependence of precipitation patterns and water levels in the region, emphasizing the importance of considering both variables in this flood simulation task.
As outlined in our previous paper (Madhushanka et al. 2024), the analysis was conducted using rainfall data from the designated rain gauge stations along with the water level at Manampitiya, based on the Pearson correlation coefficients calculated among the four stations. The other upstream river gauges were excluded because they belong to the UMB, a region with different geographic and climatic conditions. Prior to their use as inputs and labels for the models, the data were normalized using the standard scaling method, which uses the mean and standard deviation of each variable. Subsequently, the dataset was split into a training set and a test set: the first 70% of the data was allocated to the training set, while the remaining portion was reserved for the test set to evaluate model performance.
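As a minimal sketch of this preprocessing, assuming the data sit in a pandas DataFrame (the file and column names below are illustrative, not the actual ones):

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical file and column names; the actual dataset holds daily rainfall at
# Aralaganwila, Angamedilla, and Polonnaruwa Agri plus the Manampitiya water level.
df = pd.read_csv("mahaweli_daily.csv", parse_dates=["date"])
features = ["rf_aralaganwila", "rf_angamedilla", "rf_polonnaruwa", "wl_manampitiya"]

# Chronological 70/30 split: the first 70% of days form the training set.
split = int(len(df) * 0.7)
train, test = df.iloc[:split], df.iloc[split:]

# Standard scaling (zero mean, unit variance per variable); fitting the scaler on
# the training portion only is a common safeguard against test-set leakage.
scaler = StandardScaler()
train_scaled = scaler.fit_transform(train[features])
test_scaled = scaler.transform(test[features])
```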
Utilized models
Long short-term memory
The forget gate of the LSTM cell computes

$$f_t = \sigma\left(w_f x_t + u_f h_{t-1} + b_f\right)$$

Here, $f_t$ represents the resulting vector with values ranging from 0 to 1, $x_t$ is the input at time step $t$, and $h_{t-1}$ denotes the hidden state at time step $t-1$. $\sigma(\cdot)$ denotes the sigmoid function. The parameters $w_f$ and $u_f$ are adjustable weight matrices and $b_f$ is the bias vector of the forget gate. The symbol $\odot$ denotes element-wise multiplication. As in traditional RNNs, the hidden state $h$ is initialized with a zero vector of a predefined length at the first time step.
The candidate cell state and the input gate are given by

$$\tilde{c}_t = \tanh\left(w_c x_t + u_c h_{t-1} + b_c\right), \qquad i_t = \sigma\left(w_i x_t + u_i h_{t-1} + b_i\right)$$

where $\tanh(\cdot)$ denotes the hyperbolic tangent function, and $w_i$, $w_c$, $u_i$, $u_c$, $b_i$, and $b_c$ represent another set of learnable parameters. The second gate, denoted by a green rectangle and known as the input gate or compute gate, determines the extent to which the information from $\tilde{c}_t$ is used to update the cell state in the current time step.
The output gate and the state updates complete the cell:

$$o_t = \sigma\left(w_o x_t + u_o h_{t-1} + b_o\right), \qquad c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad h_t = o_t \odot \tanh(c_t)$$

Here, $o_t$ represents a vector with values ranging from 0 to 1, and $w_o$, $u_o$, and $b_o$ represent a set of learnable parameters specific to the output gate. Combining the results of the previous equations, the new hidden state $h_t$ is calculated from this vector. In particular, the cell state $c_t$ is responsible for learning long-term dependencies effectively: it can retain information unchanged over an extended number of time steps owing to its simple linear interactions with the rest of the LSTM cell.
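To make the gate equations concrete, a single forward step of the cell can be sketched in NumPy as follows (a didactic sketch only; the models in this study were built with TensorFlow, not this hand-rolled version):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, w, u, b):
    """One LSTM time step. w, u, b are dicts holding the learnable parameters
    of the forget (f), input (i), candidate (c), and output (o) gates."""
    f_t = sigmoid(w["f"] @ x_t + u["f"] @ h_prev + b["f"])    # forget gate
    i_t = sigmoid(w["i"] @ x_t + u["i"] @ h_prev + b["i"])    # input gate
    c_hat = np.tanh(w["c"] @ x_t + u["c"] @ h_prev + b["c"])  # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat                          # element-wise cell update
    o_t = sigmoid(w["o"] @ x_t + u["o"] @ h_prev + b["o"])    # output gate
    h_t = o_t * np.tanh(c_t)                                  # new hidden state
    return h_t, c_t
```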
Transformer and attention mechanism
Additionally, positional information of elements in the sequence is provided through static positional encodings using sine and cosine functions, which are added to the original input embeddings. The original transformer consists of an encoder–decoder architecture as depicted in Figure 2. In the decoder, a causal mask is applied when calculating the self-attention, preventing each token from attending to future tokens. In contrast to self-attention, cross-attention, also known as 'encoder–decoder attention,' captures relationships between tokens in different input sequences. The output of the encoder is transformed into the key and value matrices, and the output of the self-attention block of the decoder is transformed into the query matrix, after which the attention is calculated as described before.
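A NumPy sketch of the scaled dot-product attention underlying both variants may help; the projection matrices Wq, Wk, and Wv named in the usage comments are the learned linear maps and appear here only for illustration:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """q: (T_q, d_k), k: (T_k, d_k), v: (T_k, d_v)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])        # query-key similarity
    if mask is not None:                           # False entries are blocked
        scores = np.where(mask, scores, -1e9)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)      # softmax over key positions
    return weights @ v, weights

# Causal mask for decoder self-attention: each step sees only its past.
T = 15
causal = np.tril(np.ones((T, T), dtype=bool))

# Cross-attention: Q from the decoder states, K and V from the encoder output, e.g.
# out, attn = scaled_dot_product_attention(dec_states @ Wq, enc_out @ Wk, enc_out @ Wv)
```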
Evaluation metrics
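For observed values $y_t$, simulated values $\hat{y}_t$, observed mean $\bar{y}$, and $n$ daily records, the four metrics reported in this study (RMSE, MAPE, NSE, and R²) take their conventional forms, with R² here taken as the squared Pearson correlation between observations and simulations:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}, \qquad \mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^2}, \qquad R^2 = \left(\frac{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)\left(\hat{y}_t - \bar{\hat{y}}\right)}{\sqrt{\sum_{t=1}^{n}\left(y_t - \bar{y}\right)^2}\,\sqrt{\sum_{t=1}^{n}\left(\hat{y}_t - \bar{\hat{y}}\right)^2}}\right)^2$$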
Lag correlation
In hydrology, timing misalignment between predicted and observed data is a common issue. This temporal discrepancy can be quantified and analyzed using lag correlation metrics, providing insights into timing errors. The process involves computing a selected error metric (e.g., RMSE), then systematically lagging one of the time series relative to the other and recomputing the metric (Jackson et al. 2019). This approach indicates the time lag at which the similarity between observations and predictions is maximized. Hyndman & Khandakar (2008) utilized lag correlation measures to gain insights into their dataset's ability to capture specific events despite timing variations.
In our case, we used the RMSE value to determine the lag correlation between simulations and observations with a 1-day lag time. We calculated the percentage deviation of RMSE between the original and the lagged prediction series to quantify the lag correlation.
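A minimal sketch of this computation, assuming the observation and prediction series are aligned NumPy arrays:

```python
import numpy as np

def lag_rmse_deviation(obs, pred, lag=1):
    """Percentage deviation of RMSE between the original and the
    lagged prediction series (a minimal sketch)."""
    rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
    original = rmse(obs, pred)
    # Shift the prediction series back by `lag` days: pred[t] is compared
    # with obs[t - lag]. A forecast that merely echoes yesterday's
    # observation aligns almost perfectly after this shift.
    lagged = rmse(obs[:-lag], pred[lag:])
    return 100.0 * (original - lagged) / original
```

A large positive deviation thus signals a strong lag correlation: the forecasts track the previous day's observation rather than anticipating the change.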
Experimental setup
The work was conducted using the Python programming language, with data preprocessing, management, and visualization carried out using libraries such as NumPy, Pandas, Scikit-learn, Matplotlib, and Seaborn. For deep learning tasks, the TensorFlow framework was employed. Training of the models was conducted on the Google Colab platform, which provides a cloud-based environment for running Python code, particularly well suited to machine learning and deep learning tasks. Historical data from the three rain gauges and the target river gauge were used as inputs to forecast the following day's water level using the sliding window method. Hyperparameters for the models were kept the same as in our previous paper (Madhushanka et al. 2024). The transformer architecture was slightly modified from the original implementation presented in Vaswani et al. (2017). Notably, the original input data were used directly without being mapped to an embedding vector, considering the continuity of the data in this regression task. Additionally, the mask of the self-attention layer in the decoder was omitted, thereby allowing time series data to access their successors. Other aspects, such as positional encoding, the number of layers and attention heads, and the dropout rate, were kept consistent with the specifications of the original paper. All the hyperparameters are shown in Table 2, and a training-configuration sketch follows the table. 'Early Stopping' was used as the regularization technique for all the LSTM and transformer models.
Table 2 | Hyperparameters of the models

| Hyperparameter | Value |
| --- | --- |
| Batch size | 32 |
| Sliding window size | 15 |
| Number of LSTM units in the hidden layer | 64 |
| Optimizer | Adam |
| Activation function | ReLU |
| Validation split | 0.1 |
| Learning rate | 0.001 |
| For the transformer: | |
| $d_{model}$ (model dimension; output size of each sublayer) | 64 |
| $d_{ff}$ (number of units in the inner layer of the feedforward block) | 192 |
| Number of layers | 6 |
| Number of heads | 8 |
| Dropout rate | 0.1 |
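A sketch of the sliding-window sample generation and an LSTM training configuration consistent with Table 2 follows; where Table 2 is silent (e.g., the early-stopping patience, the number of epochs, and where the ReLU activation is applied), the choices below are assumptions:

```python
import numpy as np
import tensorflow as tf

def make_windows(series, window=15):
    """Sliding-window samples: 15 past days in, the next day's water level out
    (the water level is assumed to be the last column of the scaled array)."""
    X = np.array([series[i : i + window] for i in range(len(series) - window)])
    y = series[window:, -1]
    return X, y

X_train, y_train = make_windows(train_scaled)  # from the preprocessing sketch above

# LSTM model per Table 2; applying the ReLU inside the LSTM layer is our reading.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss="mse")
model.fit(
    X_train, y_train, batch_size=32, validation_split=0.1, epochs=100,
    # Early stopping on validation loss; the patience value is an assumption.
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)],
)
```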
As the first task, we studied the contribution of each input feature to the final output by considering the following input combinations (a sketch of the case construction follows the list). Three LSTMs and three transformer encoders were utilized for this task, with identical model architectures except for the input layer.
i. Case 1 – Past water levels and rainfall data as inputs
ii. Case 2 – Past water levels as the only input
iii. Case 3 – Past rainfall data as the only input
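The three cases differ only in which scaled columns enter the sliding window; a sketch, assuming the first three columns hold the rain gauges and the last holds the water level:

```python
import numpy as np

# Column split assumed: first three columns are the rain gauges, last is water level.
rain, wl = train_scaled[:, :3], train_scaled[:, 3:]

def window_inputs(feats, window=15):
    return np.array([feats[i : i + window] for i in range(len(feats) - window)])

X_case1 = window_inputs(np.hstack([wl, rain]))  # Case 1: water levels + rainfall
X_case2 = window_inputs(wl)                     # Case 2: water levels only
X_case3 = window_inputs(rain)                   # Case 3: rainfall only
y = wl[15:, 0]  # next-day water level, the common label for all three cases
```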
RESULTS AND DISCUSSION
Impact of input features on simulation of daily water level
According to the results (blue and green bars) in Figure 7, Case 3 has the highest error and Case 2 the lowest, while the error of Case 1 is close to that of Case 2, for both the LSTM and transformer algorithms. For the LSTM, RMSE improved by 47% from Case 3 to Case 1 and by a further 12% from Case 1 to Case 2; for the transformer, the corresponding improvements were 39 and 9%, showing similar behavior. The large improvement from Case 3 to Case 1 and the comparatively small improvement from Case 1 to Case 2 indicate that the past water level data have the highest impact on the output among all the input features.
Considering the LSTM models (blue and orange bars), there is a large reduction of RMSE between the actual and the lagged scenarios in Cases 1 and 2, while Case 3 shows no lag correlation. The RMSE of the LSTM dropped by 76% in Case 1 and 51% in Case 2, while the transformer encoder (green and red bars) showed reductions of 67 and 46%, respectively, with Case 1 exhibiting the largest percentage reduction. These percentages quantify the effect of lagging the prediction series by 1 day. The error reduction likely arises when the models assign a high weight to the final time step of the input series generated by the sliding window method; this explains why Case 1 shows the highest RMSE reduction while Case 3, which used only rainfall as input, shows no reduction when the prediction series is lagged by 1 day. A low lag correlation is preferable, since a high lag correlation suggests that the prediction series is essentially the observation series shifted by a time offset. Case 2 showed the lowest RMSE as well as a low lag correlation compared to Case 1. Based on these results, we used past rainfall and water level data for the subsequent models.
Model performance
As shown in Figures 8–11, all multivariate models performed relatively well in simulating the average water levels of the Mahaweli River. However, they differ greatly when it comes to simulating both upward and downward peaks in streamflow. It should be noted that both transformer encoder–decoder models significantly underestimated the peak water levels, mostly for daily water levels exceeding 35.5 m, while clear overestimations were observed for low water levels, especially for water levels less than 33 m. Among the models tested, LSTM and transformer encoder showed superior performance in simulating the peak water levels, while the other two models performed poorly in this regard.
Table 3 presents the evaluation indices used to compare the performance of the developed models. Four evaluation metrics (RMSE, MAPE, NSE, and R²) were used to measure the forecasting capability of the models. During the training period, RMSE ranged from 18.09 to 23.59 cm, MAPE varied from 0.282 to 0.392%, and R² varied from 0.8743 to 0.9185. All models had an NSE greater than 0.86. The LSTM achieved an NSE of 0.9183 and an RMSE of 18.09 cm, with an R² of 0.9185 and a MAPE of 0.282%. For the transformer encoder, those values were 0.9010, 19.90 cm, 0.9014, and 0.314%, respectively. Both transformer encoder–decoder models performed worse than the others.
Table 3 | Evaluation metrics of the developed models (Enc → WL, Dec → RF denotes water level (WL) fed to the encoder and rainfall (RF) fed to the decoder, and vice versa)

| Period | Model | RMSE (cm) | MAPE (%) | NSE | R² |
| --- | --- | --- | --- | --- | --- |
| Training | LSTM | 18.09 | 0.282 | 0.9183 | 0.9185 |
|  | Transformer Encoder | 19.90 | 0.314 | 0.9010 | 0.9014 |
|  | Transformer Enc → WL, Dec → RF | 21.51 | 0.338 | 0.8844 | 0.8928 |
|  | Transformer Enc → RF, Dec → WL | 23.59 | 0.392 | 0.8609 | 0.8743 |
| Testing | LSTM | 27.83 | 0.449 | 0.7971 | 0.8026 |
|  | Transformer Encoder | 27.83 | 0.471 | 0.7972 | 0.7985 |
|  | Transformer Enc → WL, Dec → RF | 29.10 | 0.527 | 0.7782 | 0.7828 |
|  | Transformer Enc → RF, Dec → WL | 29.19 | 0.520 | 0.7768 | 0.7901 |
During the testing period, all models demonstrated satisfactory performance in simulating daily water levels, with NSE values greater than 0.7768, although performance was lower than in the training phase. The LSTM and transformer encoder performed similarly, with RMSE, MAPE, NSE, and R² values of 27.83 cm, 0.449%, 0.7971, and 0.8026 versus 27.83 cm, 0.471%, 0.7972, and 0.7985, respectively. The Transformer Enc → WL, Dec → RF and Enc → RF, Dec → WL models showed similar values, with RMSEs of 29.10 and 29.19 cm and NSEs of 0.7782 and 0.7768, respectively. Overall, the transformer encoder outperformed the encoder–decoder models.
Analysis of attention mechanism in transformer
The plot consists of eight subplots, corresponding to the eight attention heads. Each subplot visualizes the attention weights, where the x-axis represents the rainfall data fed into the encoder, and the y-axis represents the water level data fed into the decoder. The color intensity indicates the magnitude of the attention weights, with darker colors representing lower attention weights and brighter colors representing higher attention weights.
Key observations from the attention plot reveal several important aspects. First, there is a noticeable concentration of attention around 10 January across several heads, particularly Heads 1, 2, 5, and 6. This suggests that the model gives significant importance to the days leading up to and following 10 January, which is consistent with the high water level recorded on that day. Moreover, the attention heads display varying patterns: some (e.g., Heads 3 and 7) show a more diffuse distribution of attention, while others (e.g., Heads 1 and 2) focus more sharply on specific dates. This variability indicates that the transformer model can differentiate the importance of input features over time and thus capture a wide range of dependencies and relationships in the data. For example, certain heads might focus more on the Aralaganwila rainfall data, while others give more weight to Angamedilla or Polonnaruwa Agri.
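Such a head-by-head heatmap can be produced along the following lines (a sketch; `attn` stands in for the cross-attention weights extracted from the trained model):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for the cross-attention weights: shape (heads, decoder steps, encoder
# steps); rows index water level (decoder) days, columns rainfall (encoder) days.
attn = np.random.rand(8, 15, 15)  # replace with the model's attention output

fig, axes = plt.subplots(2, 4, figsize=(16, 7))
for h, ax in enumerate(axes.flat):
    im = ax.imshow(attn[h], aspect="auto", cmap="viridis")  # brighter = higher weight
    ax.set_title(f"Head {h + 1}")
    ax.set_xlabel("Encoder input (rainfall)")
    ax.set_ylabel("Decoder input (water level)")
fig.colorbar(im, ax=axes.ravel().tolist(), shrink=0.8)
plt.show()
```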
Discussion
Overall, the single-encoder model performed best among the transformer models. When a transformer has both an encoder and a decoder, cross-attention takes the key (K) and value (V) matrices from the encoder output and the query (Q) matrix from the decoder in order to relate the two inputs. When there is only an encoder, Q, K, and V are all derived from a single input series, which might explain its higher accuracy. However, if the two input series have different sequence lengths, an encoder–decoder model has to be used for the analysis.
Furthermore, the LSTM and transformer encoder models exhibited broadly similar performance in daily water level forecasting, although many studies have reported higher accuracy for transformers compared to LSTMs (Castangia et al. 2023; Xu et al. 2023). Several factors may explain this, including the presence of outliers and high variance in the dataset (Granata et al. 2022). The box plots in Figure 2 highlight numerous outlier points within the dataset. Despite this, we chose not to apply any denoising techniques, because some outliers may reflect the climatic patterns of the South Asian Tropical Zone.
Additionally, another reason for the suboptimal performance could be the lack of input features, as suggested by Wei et al. (2023). Our dataset included only four variables: data from three rain gauges and one river gauge. Incorporating other variables, such as evapotranspiration, temperature, and wind, could potentially enhance performance.
CONCLUSIONS AND RECOMMENDATIONS
In this study, four machine learning models (LSTM, transformer encoder, and transformer encoder–decoder models 1 and 2) were applied to simulate daily water levels at the Manampitiya river gauge in the Mahaweli catchment. The impacts of different inputs on the output were also examined. According to the results, past water level data had a greater impact on the output than the other input features, such as rainfall. It is recommended to use additional input features that correlate strongly with the target, along with past observations of the output, since this strategy improves model performance and decreases the lag correlation. Further, the LSTM and transformer encoder showed similar accuracy in daily water level forecasting. Although transformer models tend to perform better in streamflow forecasting, the limited number of input features, together with outliers and high variance in the dataset, may have reduced the performance of the transformer encoder model. In this case, switching the inputs between the encoder and decoder yielded similar performance, but the outcome might differ in other settings; therefore, when input features have two different time steps, it is recommended to try both arrangements and compare the accuracies.
For future research, we recommend examining how different input features such as rainfall, temperature, and humidity affect water level forecasting, considering the attention analysis. Specifically, leveraging the attention mechanism of transformer models can help identify the most influential input features, thereby optimizing the data used for training and improving prediction accuracy.
Additionally, exploring hybrid models that combine multiple algorithms, such as integrating LSTM with transformer models, could enhance forecasting accuracy and extend the prediction window. These hybrid models can take advantage of the strengths of each individual algorithm, potentially leading to more robust and reliable predictions. This approach is particularly valuable in addressing the high variability and complexity of hydrological data.
Ultimately, these advancements in model development and data utilization can significantly benefit water management practices. Improved forecasting accuracy supports sustainable decision-making, allowing for more effective mitigation strategies for droughts, floods, and other water-related challenges. This research has the potential to enhance disaster preparedness, optimize resource allocation, and contribute to the overall resilience of communities and infrastructure against climate-related impacts.
ACKNOWLEDGEMENTS
We would like to thank Ranga Rodrigo for valuable advice and learning resources regarding machine learning. We are also grateful to the Department of Irrigation, Sri Lanka, for providing the water level data.
AUTHOR CONTRIBUTIONS
G.W.T.I.M. led the development of research methodology, raw data acquisition, code development, data manipulation, training and evaluation of the models, results analysis and writing of the original draft. M.T.R.J. led the supervision throughout the study, providing financial support and facilitating data acquisition by signing the agreements. R.A.R. contributed to data analysis, code development, and final manuscript writing and provided guidance as well as financial support.
FUNDING SOURCES
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.