Hydrological modelling tools have proven effective in predicting future floods by analysing historical rainfall and inflow data, a task made more pressing by the association between climate change and flood frequency. This study used a historical dataset of monthly inflow and rainfall for the Terengganu River in Malaysia, a river known for highly unpredictable hydrological patterns. The predictive precision and effectiveness of the optimised decision tree (ODT) model, along with the random forest (RF) and gradient boosting tree (GBT) models, were evaluated using several indicators: the correlation coefficient, mean absolute error, percentage of relative error, root mean square error, Nash–Sutcliffe efficiency (NSE), and accuracy rate. The results indicated that the ODT and RF models performed better than the GBT model in predicting monthly inflows. In validation, the ODT, RF, and GBT models achieved average accuracies of 94%, 91%, and 92%, respectively. The R² values were 90.2%, 84.8%, and 96.0%, respectively, and the NSE values ranged from 0.92 to 0.94. The implications of this research extend beyond forecasting monthly inflow rates to other hydro-meteorological variables that depend exclusively on historical input data.

  • Introducing advanced model – optimised decision tree (ODT) for precise monthly inflow prediction, leveraging 50 years of rainfall and inflow data.

  • ODT outperforms gradient boosting tree and random forest models.

  • ODT predicts inflow levels well based on historical rainfall, outperforming other advanced models.

AB: AdaBoost
ANFIS: adaptive neuro-fuzzy inference system
ANN: artificial neural networks
BDTR: boosted decision tree regression
BLR: Bayesian linear regression
CNN: convolutional neural network
DFR: decision forest regression
DT: decision tree
GA: genetic algorithm
GBT: gradient boosting tree
GP: Gaussian process
KNN: k-nearest neighbours
LDA: linear discriminant analysis
LR: logistic regression
LSTM: long short-term memory
MAE: mean absolute error
MAPE: mean absolute percentage error
MLR: multiple linear regression
NB: naive Bayes
NN: neural network
NNR: neural network regression
ODT: optimised decision tree
PSO: particle swarm optimisation
R2: correlation coefficient
%RE: percentage of relative error
RF: random forest
RMSE: root mean square error
SVM: support vector machines
XGBoost: eXtreme gradient boosting

The river is considered to be among the most crucial water resources and is a key component of the global freshwater resource system, which serves various purposes, including the provision of drinking water, support for agricultural practices, and facilitation of industrial activity. However, throughout the prior century, floods have constituted around 40% of natural calamities, resulting in over 19% of overall casualties and affecting more than 48% of the total population impacted (Munawar et al. 2019; Zhang et al. 2021). Over the past few decades, there has been a significant rise in the intensity and magnitude of flood threats, primarily attributed to climate change and various anthropogenic factors (Bubeck & Thieken 2018; Wang et al. 2022c). Furthermore, large-scale floods present a significant threat to human life and result in tens of thousands of fatalities and substantial economic losses each year in regions susceptible to flood (Aerts et al. 2018; Ahmadalipour & Moradkhani 2019). Similarly, river floods are often described as having very high velocities that have prominent effects on people's lives and economies (Ahmadalipour & Moradkhani 2019). Flash floods are natural occurrences that happen when a large volume of water is released quickly over a brief period of time, typically during heavy rainfall lasting only a few minutes or hours. Likewise, the sudden failure of structures is another cause of flash flooding (Wang et al. 2019). Subsequently, developing a river inflow forecast technique is essential for flood control, early warning, and reservoir operation (Cheng et al. 2020; Kilinc & Yurtsever 2022). It also has a prominent role in mitigating the impacts of the deficit on water resource systems. Furthermore, accurate forecasting results in better control of water availability, life protection, improved hydropower generation, and reduced economic losses from early warnings (Kilinc & Yurtsever 2022). 
Consequently, it becomes pivotal to develop forecasting models for river inflow (Ibrahim et al. 2022). However, the accurate prediction of forthcoming floods and the identification of susceptible regions are complex undertakings that necessitate the utilisation of accurate geographical and temporal data, in conjunction with dependable predictive models (Boucher et al. 2020; Chu et al. 2021; Elbeltagi et al. 2022; Zahura & Goodall 2022; Jahangir et al. 2023; Wang & Zai 2023). In the past few years, there has been a focused endeavour to develop accurate prediction models for monthly flood forecasting. Several modelling techniques have been utilised for this purpose, with a special focus given to artificial intelligence models. These models are adept at extracting valuable information from extensive datasets, presenting it in clearly understandable formats, and preserving time and resources (Maier & Dandy 2000). Furthermore, machine learning (ML) techniques are frequently utilised in hydraulic and water structure planning due to their capacity to accurately forecast solutions in non-linear problems (Di Nunno et al. 2021; Elbeltagi et al. 2022; Granata & Di Nunno 2023; Ruma et al. 2023). Numerous studies have been conducted in multiple locations across the globe with the aim of understanding the susceptibility to flooding in areas that have been severely affected by such events. For instance, in India (Chowdhuri et al. 2020; Ramesh & Iqbal 2022), the United States of America (Giovannettone et al. 2020), Japan (Fan & Huang 2020), Bangladesh (Alam et al. 2021), Vietnam (Dang et al. 2024), Iran (Goodarzi et al. 2024), Romania (Zhen & Bărbulescu 2024), China (Chen et al. 2023; Dai et al. 2023; Zhu et al. 2023), South Korea (Lee et al. 2023), Morocco (Nifa et al. 2023), Pakistan (Khan et al. 2023), and Australia (Ahmed et al. 2021). 
ML approaches have frequently been applied within two distinct modelling frameworks: basic single modelling and hybrid combination modelling (Choubin et al. 2019). In recent years, there has been a shift from single-model applications to hybrid ensemble modelling, which has been shown experimentally to yield stronger predictive models and more accurate forecasts of forthcoming events (Tyralis et al. 2021; Granata et al. 2022; Ibrahim et al. 2022; Di Nunno et al. 2023; Ng et al. 2023). Numerous studies have forecast river flood rates from different time series using both single and hybrid ensemble models. For instance, a streamflow prediction study of the Euphrates River proposed a hybrid method combining long short-term memory (LSTM) with a genetic algorithm (Kilinc & Haznedar 2022). A hybrid of an LSTM neural network (NN) and a lion optimiser has been used effectively for monthly runoff forecasting, demonstrating higher accuracy than existing models (Yuan et al. 2018), and an LSTM coupled with the weighted mean of vectors optimizer (INFO) has been used for water temperature prediction (Ikram et al. 2023). The Relevance Vector Machine tuned with Improved Manta-Ray foraging optimization (RVM-IMRFO) model demonstrated substantial improvements in performance indicators such as root mean square error (RMSE), mean absolute error (MAE), correlation coefficient (R2), and Nash–Sutcliffe efficiency (NSE) compared to alternative tuning algorithms (Adnan et al. 2023b). Similarly, a random vector functional link network based on the quantum-based avian navigation optimizer algorithm (RVFL-QANA) was used to estimate potential evapotranspiration from limited climatic data, as reported by Mostafa et al. (2023). 
In addition, Katipoğlu et al. (2024) predicted suspended sediment load in river systems using the support vector machine (SVM)-FFAPSO model. The hybrid adaptive neuro-fuzzy inference system (ANFIS-WCAMFO) model, using nine input combinations of meteorological datasets, is a promising technique owing to its high predictive accuracy and low error in predicting monthly evapotranspiration (Adnan et al. 2021). A 2023 study found that the Extreme Learning Machine-Jellyfish Search Optimizer (ELM-JSO) significantly improved the RMSE of the standalone ELM model, by 13%, for the optimal inputs of temperature, precipitation, and groundwater level during the testing stage. The ELM-JSO had the highest performance in estimating the daily groundwater level, followed by the ELM optimized using the whale optimization algorithm (WOA-ELM), ELM-particle swarm optimisation (PSO), and ELM optimized using the Harris Hawks optimizer (ELM-HHO) (Adnan et al. 2023a). Other approaches applied to streamflow and flood prediction include artificial neural network (ANN) models (Mei et al. 2023; Dang et al. 2024), multi-layer perceptron (MLP) for daily streamflow (Mohammadi et al. 2020), ANFIS (Goodarzi et al. 2024), ANFIS-ABC (Pham et al. 2024), random forest (RF) (Zahura & Goodall 2022; Naganna et al. 2023), gradient boosting tree (GBT) (Ni et al. 2020), SVM (Essam et al. 2022; Dang et al. 2024), and logistic regression (LR) and frequency ratio (Tehrany & Kumar 2018). In 2024, a study investigated multiple linear regression (MLR) and RF to predict monthly streamflow (Xu et al. 2024). Extreme gradient boosting (XGBoost or XGB) has also proven effective for streamflow prediction (Goodarzi et al. 2024). 
Furthermore, a 2024 study proposed hybrid methods such as convolutional neural network (CNN)-LSTM, Sparrow Search Algorithm with Backpropagation Neural Networks (SSA-BP), and ELM optimized using particle swarm optimization (PSO-ELM) for monthly water outflow prediction (Zhen & Bărbulescu 2024). Cai & Yu (2022) applied a hybrid recurrent NN to flood forecasting. Sahana et al. (2020) demonstrated that the SVM exhibited superior performance for flood assessment in the Sundarban biosphere reserve in India. Pham et al. (2020) evaluated a flood-affected area in Vietnam and reported a high level of accuracy for the LR model. Moreover, Shada et al. (2022) proposed an hourly flood forecasting approach using a hybrid wavelet-SVM, and Xu & Peng (2015) and Schulte (2017) proposed novel frameworks for flood prediction. These frameworks employ many methodologies, including fuzzy clustering, K-means clustering, and NNs. Overall, ML techniques such as SVMs, LR, and ANNs have proven to be useful predictors. Yet, decision tree (DT)-based models remain popular. This is supported by numerous studies that have used DT models and consistently found them to be superior predictors (Chen et al. 2020; Tang et al. 2020; Pham et al. 2020, 2021). The intrinsic flexibility of ML models allows the development of improved and more efficient models for the challenges of monitoring and controlling river floods. Predictive ability is further improved by optimisation and ensemble techniques, at minimal additional cost in time, memory, and compute. Consequently, additional types of DT-based models have emerged and are gradually gaining importance, one of which is the RF model (Chen et al. 2020), which uses an ensemble of DTs. 
Another notable model is gradient boosted tree (GBT) (Naganna et al. 2023), which combines multiple weak DTs to create a stronger predictive model. Alternating DTs has also emerged as a distinct type of DT-based model (Janizadeh et al. 2019; Chen et al. 2020). Logistic model trees (Khosravi et al. 2018), naïve Bayes (NB) trees (Chen et al. 2020; Tang et al. 2020), and reduced error pruning trees are other noteworthy models in this category (Khosravi et al. 2018). Although the methods mentioned above have been extensively validated and proven to be effective models, there exists another innovative approach in the field of DTs called optimised decision tree (ODT). However, the utilisation of this method in flood prediction research has not been extensively utilised, making it challenging to confidently establish its efficacy as a reliable classifier. In relation to this issue, the current investigation has focused on developing and motivating a forecasting model to forecast monthly flood rates based on historical rainfall and inflow rates to obtain more accurate estimates and to upgrade the DT model to the ODT model. Moreover, the primary objective of optimising model parameters was to determine the most suitable model parameters from an extensive collection of hydrological data, with the objective of achieving precise forecasts for an extensive variety of flood-related occurrences, make more accurate predictions of flood risks than the single DT, and comparing the usefulness of three ensembles of ML. The authors propose that this model exhibits the capacity to be generalised and implemented in various rivers across the globe. The ODT model and the comparison models implemented in this study are robust statistical methods that can be utilised for classification, prediction, interpretation, and data manipulation. 
Additionally, the standalone DT model has shown encouraging results in forecasting river conditions using several types of covariates, including water quality measurements and weather patterns. The inherent adaptability of the DT model is an important asset for academics and policymakers seeking to understand and effectively govern river ecosystems worldwide (Khosravi et al. 2018). Even when identifying the pattern behaviour of a complex dataset, the DT model showed remarkable performance (Everaert et al. 2016). The proposed ODT model has not previously been used for flood ensemble modelling; in this study, it is used alongside GBT and RF to estimate the monthly inflow rate from rainfall and inflow databases for the Terengganu River in Malaysia.

The Terengganu River is a prominent river in the state of Terengganu, in the north-eastern coastal region of Peninsular Malaysia. It lies between longitudes 102° 30′–103° 09′E and latitudes 4° 40′–5° 20′N. The river originates from the upstream watershed of Kenyir Lake and crosses the state capital of Terengganu (Kuala Terengganu) before reaching the South China Sea, as shown in Figure 1. In addition to Kenyir Lake, the Terengganu River is fed by several significant tributaries, namely the Nerus, Tersat, Berang, and Telemung Rivers. These streams collectively have a catchment area of approximately 5,000 km². The rivers cross various socio-economic regions, including rubber, coconut, and palm oil plantations; aquaculture; commercial and industrial zones; urban areas; nature reserves; and forests. The population is concentrated in the urban areas of Kuala Terengganu and Kuala Berang. The Terengganu basin is generally hot and humid all year round, with average temperatures of 28 to 30°C in the daytime and 22°C at night, and an average humidity of 84.0%/month (Juneng et al. 2007). Terengganu experiences substantial rain during the peak of the wet season, often occurring from September to October, with an average monthly rainfall of approximately 400 mm. In contrast, the period from January to April, known as the dry season, experiences average rainfall of 190 mm. The research area experiences significant and intense rainfall during the wet seasons (The Department of Irrigation and Drainage of Malaysia).
Figure 1

Terengganu River basin Malaysia.

In recent years, Malaysia has reported several flooding incidents. For instance, extreme rainfall events from October to December in 2004, 2014, and 2022 resulted in severe flooding along the east coast of Peninsular Malaysia, with Terengganu being the worst affected state. The rainfall in the storm centre exceeded 600 mm in 2004 and 800 mm in 2014 during the same period, which led to heavy flash floods (Juneng et al. 2007). These events occurred primarily between October and January as a result of heavy rains caused by the northeast monsoon cold surge. From December 2021 to January 2022, Malaysia experienced some of the worst flood disasters in its history (Department of Irrigation and Drainage of Malaysia). As a consequence, approximately 50 people died and over 125,000 were displaced. In addition, the country's disaster management agency, the National Disaster Management Agency Malaysia (NADMA), reported that approximately 6,000 families (20,000 people) were displaced in Terengganu state in 2022. According to historical records, numerous regions of the Terengganu River basin have experienced devastating flash flooding during wet seasons, resulting in significant loss of life and property. A model for forecasting flash floods is therefore essential for risk management and for mitigating the effects of flooding in this region and others. The historical dataset was collected from over 75 manual and automatic monitoring stations along the Terengganu River basin and divided between 66 rainfall and 9 streamflow, temperature, and 5 humidity stations. The monitoring station data used in this study were acquired from the Department of Irrigation and Drainage (DID) Malaysia. The dataset was selected based on two criteria: (I) no data errors or missing years and (II) rainfall and streamflow records spanning 30 years or more (the rainfall records run from 1973 to 2020). 
Consequently, the most complete and explicit dataset, covering a period of 20 years, was used, with the objective of acquiring a well-balanced dataset consisting of 3,120 monthly rainfall records and 2,160 monthly inflow records, as shown in Figure 2.
Figure 2

Terengganu River basin monthly flow rate 2000 to 2020.

The Thiessen polygon method is used in this study to calculate the area of influence around each station based on its proximity to neighbouring stations. It determines the rainfall contribution of each station, weighted by the area of its polygon. The average monthly rainfall over all stations is then calculated using the following equation, given by Lal & Al-Mashidani (1978):

$$\bar{P} = \frac{P_1 A_1 + P_2 A_2 + \cdots + P_n A_n}{A_1 + A_2 + \cdots + A_n}$$

where $\bar{P}$ is the average monthly rainfall, P1 represents the monthly rainfall values at the first station, and A1 represents the corresponding area of the Thiessen polygon for that station. Similarly, Pn represents the monthly rainfall values at the nth station and An represents the corresponding area of the Thiessen polygon for that station.
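
As an illustration only, the area-weighted average described above can be sketched in a few lines of Python. The station rainfall values and polygon areas below are hypothetical and are not taken from the study's dataset; the function name `thiessen_mean` is likewise an assumption introduced for this sketch.

```python
def thiessen_mean(rainfall, areas):
    """Area-weighted mean rainfall: sum(P_i * A_i) / sum(A_i)."""
    if len(rainfall) != len(areas) or not rainfall:
        raise ValueError("need exactly one polygon area per station")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total polygon area must be positive")
    return sum(p * a for p, a in zip(rainfall, areas)) / total_area


# Hypothetical monthly rainfall (mm) at three stations and the
# areas (km^2) of their Thiessen polygons.
monthly_rain = [400.0, 190.0, 310.0]
polygon_area = [1000.0, 3000.0, 1000.0]

basin_mean = thiessen_mean(monthly_rain, polygon_area)
print(f"Basin-average rainfall: {basin_mean:.1f} mm")
```

A station with a large polygon pulls the basin average towards its own reading, which is why the result here sits closer to 190 mm than a plain arithmetic mean would.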

Forecasting models

The forecasting model employs three distinct ML algorithms: the DT, RF algorithm, and GBT.

The DT algorithm is highly regarded for its ability to handle both classification and regression tasks effectively, making it a valuable statistical tool for predictive modelling and classification purposes due to its ability to represent complex relationships in data (Everaert et al. 2016). This approach is commonly used to assess various consequences, including decision-making processes, event outcome probabilities, and investment risk evaluations, showcasing its versatility and applicability (Ho et al. 2019). The DT method for decision analysis involves utilising a tree-like structure to represent decisions and their potential outcomes, as described by Hu et al. (2016). A DT serves as a visual representation similar to a flowchart, used to systematically build classification or regression models in a structured manner, aiding in understanding complex relationships within the data. This structure entails a process of iteratively dividing the data into subsets based on defined criteria, allowing for the systematic organisation and analysis of information within the DT model. A DT is composed of internal nodes, branches, and leaves. The internal nodes are responsible for evaluating the value of a certain attribute or feature. As a result, each internal node corresponds to a specific attribute or feature and branches out into leaves representing the possible values or outcomes associated with that feature. Edges and branches in a DT represent the outcomes of tests or decisions, connecting to subsequent end nodes, which are also referred to as leaf nodes and are responsible for predicting the final outcomes. Leaf nodes play a crucial role in predicting the final outcomes of target values in DTs, representing class labels or distributions that contribute to the tree's hierarchical structure, resembling a tree-like shape. 
DTs demonstrate versatility by effectively managing both categorical and numerical data, making them suitable for a wide range of prediction tasks across different data formats. As a result, DT has been used in numerous streamflow prediction events, demonstrating promising results, outstanding capacity to apply to different cases, and excellent performance. These situations include flood forecasting (Dang et al. 2024), monthly streamflow (Wang et al. 2022a), and weekly forecast precipitation (Khairudin et al. 2020). A DT is an algorithm that starts at the top and makes decisions by splitting the data into smaller subsets, which may lead to suboptimal solutions. Furthermore, a study conducted in 2017 demonstrated that ODTs have become feasible. It revealed that optimal DTs can provide out-of-sample accuracy ratings that are 1–5% higher than those achieved by earlier heuristics such as the much-used Classification And Regression Trees (CART) algorithm (Bertsimas & Dunn 2017). This paper suggests a method to find the best values for the variables in a DT using optimise parameters (Grid). Optimise parameters (Grid) is a tool that adjusts the settings of an ML model automatically by searching through a grid of values for optimal performance. Hyperparameters are settings of a model that must be decided before training and cannot be learned from the data directly. When an ML model is created, it often comes with several hyperparameters that control its behaviour and performance, such as the maximum size of the tree, the minimum number of instances required in a node for inducing a split, the node splitting criterion, and the amount of pruning. The model's performance can change greatly depending on the chosen hyperparameter values. Finding the best hyperparameter values to improve the model's performance is called hyperparameter tuning or optimisation. 
The ‘optimise parameters (Grid)’ tool in RapidMiner lets the researchers set ranges for hyperparameter values and search through all possible combinations to find the best settings. Each combination is used to train the model with the training data and assess its performance using metrics such as accuracy, F1-score, R2, RMSE, MAE, percentage of relative error (%RE) or Receiver-Operating Characteristic Curve Area Under The Curve (ROC-AUC) on a separate validation dataset.
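
The grid-search idea can be sketched outside RapidMiner as follows. This is a minimal pure-Python stand-in, not the RapidMiner operator or the configuration used in this study: the small regression tree, the two hyperparameters searched (`max_depth`, `min_leaf`), the grid values, and the synthetic rainfall and inflow data are all assumptions made for illustration.

```python
import itertools
import math
import random


def mean(values):
    return sum(values) / len(values)


def build_tree(rows, max_depth, min_leaf, depth=0):
    """Fit a regression tree on (x, y) pairs; leaves are floats, nodes are dicts."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean([y for _, y in rows])
    rows = sorted(rows)
    best = None  # (sum of squared errors, split index)
    for i in range(min_leaf, len(rows) - min_leaf + 1):
        if rows[i - 1][0] == rows[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in rows[:i]]
        right = [y for _, y in rows[i:]]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, i)
    if best is None:
        return mean([y for _, y in rows])
    i = best[1]
    return {
        "threshold": (rows[i - 1][0] + rows[i][0]) / 2,
        "low": build_tree(rows[:i], max_depth, min_leaf, depth + 1),
        "high": build_tree(rows[i:], max_depth, min_leaf, depth + 1),
    }


def predict(tree, x):
    while isinstance(tree, dict):
        tree = tree["low"] if x <= tree["threshold"] else tree["high"]
    return tree


def rmse(tree, rows):
    return math.sqrt(mean([(predict(tree, x) - y) ** 2 for x, y in rows]))


def grid_search(train, valid, grid):
    """Train one tree per hyperparameter combination; keep the lowest validation RMSE."""
    best = None
    for max_depth, min_leaf in itertools.product(grid["max_depth"], grid["min_leaf"]):
        score = rmse(build_tree(train, max_depth, min_leaf), valid)
        if best is None or score < best[0]:
            best = (score, {"max_depth": max_depth, "min_leaf": min_leaf})
    return best


# Synthetic (rainfall, inflow) pairs, purely illustrative.
rng = random.Random(0)
data = []
for _ in range(120):
    rain = rng.uniform(0, 500)
    data.append((rain, 0.5 * rain + 20 * math.sin(rain / 50) + rng.gauss(0, 10)))
train, valid = data[:96], data[96:]

grid = {"max_depth": [1, 2, 4, 6], "min_leaf": [1, 5, 10]}
best_rmse, best_params = grid_search(train, valid, grid)
print(best_params, round(best_rmse, 2))
```

Every combination in the grid is trained and scored, so the cost grows with the product of the grid sizes; this is the trade-off of exhaustive search compared with random or adaptive tuning.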

This study implemented the ODT method and compared its performance with that of the GBT, RF, and standalone DT algorithms. The results indicate that the ODT exhibits better accuracy and offers high confidence levels in solving prediction challenges. In recent times, significant efforts have been made to enhance the efficiency of the ODT method, resulting in its widespread implementation for predicting future river inflow. The RF is a supervised learning technique used for both regression and classification (Sahour et al. 2021). RF combines an ensemble of weak DT classifiers to form a more robust classifier (Cutler 2010; Goldstein et al. 2011).

The RF technique produces a collection of trees, and the final classification is determined by letting these trees vote for the most popular class (Cutler 2010; Hou et al. 2021). At each node of each tree, a random subset of features is considered, whereas a conventional CART DT considers all available features. This gives the RF algorithm its first source of randomness. In addition, the training data for each tree are drawn from the complete training set by randomly selecting a specified number of samples with replacement, which randomises the training samples. A tree is then grown on this new training set using random feature selection. Every tree is grown to its full extent without pruning. This dual randomness mitigates overfitting and thereby improves both the accuracy and the generalisation capacity of the model. The RF is widely used in various fields, including hydrology, owing to its high accuracy, its capability with large datasets, and its robustness against noisy data (Schoppa et al. 2020). Researchers have applied RF in numerous streamflow prediction studies, with encouraging results and remarkable adaptability to diverse scenarios. These include flood prediction in Vietnam (Dang et al. 2024), streamflow forecasting at one- and three-day lead times in India (Naganna et al. 2023), streamflow forecasting one, two, and three days ahead in China (Wang et al. 2022b), streamflow forecasting up to 7 days ahead in the UK (Di Nunno et al. 2023), monthly streamflow prediction in China (Xu et al. 2024), daily streamflow forecasting in China (Shen et al. 2022), and monthly streamflow forecasting in the USA (Wang et al. 2022a). Hybrid RF models have also been reported to outperform the standalone RF model in streamflow prediction. For example, in a 2021 study forecasting streamflow 2, 3, and 4 months ahead, a hybrid model (RF-MLR) delivered more precise predictions than the standalone model (Abbasi et al. 2021).
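
The two sources of randomness described above (bootstrap sampling of records and random selection of candidate features) can be sketched as follows. This is a deliberately simplified illustration, not the RF implementation used in this study: each "tree" is a single-split stump, predictions are averaged as in regression, and the feature values and inflow relationship are synthetic assumptions.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) records with replacement from the full training set."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, features):
    """Best single split (lowest squared error) among the candidate features."""
    best = None
    for f in features:
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # every candidate feature is constant in this sample
        m = sum(y for _, y in rows) / len(rows)
        return (features[0], 0.0, m, m)
    return best[1:]


def fit_forest(rows, n_trees, n_candidate_features, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)                               # random records
        candidates = rng.sample(range(n_features), n_candidate_features)  # random features
        forest.append(fit_stump(sample, candidates))
    return forest


def predict(forest, x):
    """Average the predictions of all stumps (regression-style aggregation)."""
    votes = [lm if x[f] <= t else rm for f, t, lm, rm in forest]
    return sum(votes) / len(votes)


# Synthetic records: (rainfall mm, temperature C, humidity %) -> inflow.
data_rng = random.Random(1)
rows = []
for _ in range(80):
    rain = data_rng.uniform(0, 500)
    temp = data_rng.uniform(22, 30)
    hum = data_rng.uniform(70, 95)
    rows.append(((rain, temp, hum), 0.6 * rain + data_rng.gauss(0, 15)))

forest = fit_forest(rows, n_trees=25, n_candidate_features=2)
low = predict(forest, (50.0, 26.0, 84.0))
high = predict(forest, (450.0, 26.0, 84.0))
print(round(low, 1), round(high, 1))
```

Because each stump sees a different resample and a different feature subset, individual stumps disagree; averaging them is what smooths out the variance of any single tree.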

The GBT model, introduced by Friedman (2001), combines regression and classification tree models and is built by generating a sequence of DTs. The gradient boosting algorithm uses the boosting technique to successively add regression trees to the model without altering the parameters of the trees already fitted, in order to minimise the loss or error (Naganna et al. 2020; Hasan et al. 2023). Boosting trees has been found to improve accuracy but can also reduce speed and human interpretability. Initially, the algorithm is set up with a constant value. Subsequently, the pseudo-residuals are computed and, after scaling, fitted by the base learner, which is a regression tree. Finally, the multiplier is determined by solving an optimisation problem on the loss function, and the GBT forecast is produced by a model consisting of multiple trees. The basic gradient boosting approach is further adjusted through regularisation and constraints on the trees. In order to mitigate overfitting and improve the sensitivity of GBT to uncertainty, it uses a technique known as shrinkage (Biau et al. 2019). The GBT has been used in both long- and short-term streamflow forecasting and has shown promising results, for example in daily streamflow forecasting (Naganna et al. 2023) and monthly streamflow forecasting in the USA (Wang et al. 2022a).
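
The sequence of steps described above (constant initial model, pseudo-residuals, a regression tree fitted to them, and a shrunken update) can be sketched for squared-error loss as follows. This is an illustrative simplification, not the GBT configuration used in this study: the base learner is a one-split stump, the number of trees and the shrinkage value are arbitrary, and the data are synthetic. For squared-error loss the leaf means are already the optimal multipliers, so no separate line search appears.

```python
import math
import random


def fit_stump(xs, residuals):
    """One-split regression tree fitted to the current pseudo-residuals."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [r for _, r in pairs[:i]]
        right = [r for _, r in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    if best is None:  # all x identical: fall back to a constant
        m = sum(residuals) / len(residuals)
        return (pairs[0][0], m, m)
    return best[1:]


def gradient_boost(xs, ys, n_trees=50, shrinkage=0.1):
    f0 = sum(ys) / len(ys)            # step 1: constant initial model
    preds = [f0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # step 2: pseudo-residuals (negative gradient of squared error)
        residuals = [y - p for y, p in zip(ys, preds)]
        # step 3: fit the base learner to the residuals
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        # step 4: shrunken additive update
        preds = [p + shrinkage * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return f0, stumps


def predict(model, x, shrinkage=0.1):
    f0, stumps = model
    return f0 + sum(shrinkage * (lm if x <= t else rm) for t, lm, rm in stumps)


def mse(values, targets):
    return sum((v - t) ** 2 for v, t in zip(values, targets)) / len(targets)


# Synthetic rainfall -> inflow data, purely illustrative.
rng = random.Random(0)
xs = [rng.uniform(0, 500) for _ in range(100)]
ys = [0.5 * x + 20 * math.sin(x / 50) + rng.gauss(0, 10) for x in xs]

model = gradient_boost(xs, ys)
initial_mse = mse([model[0]] * len(ys), ys)
final_mse = mse([predict(model, x) for x in xs], ys)
print(round(initial_mse, 1), round(final_mse, 1))
```

A smaller shrinkage value makes each tree contribute less, so more trees are needed, but the fitted model tends to be less sensitive to any single tree.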

Methodology

The data-processing procedures of the three models are illustrated in Figure 3 using a flowchart. In order to evaluate and compare the performance of the ODT model, GBT and RF models were mainly implemented. The procedures consist of three main phases: phase I, data collection; phase II, data pre-processing as mentioned; and phase III, predictive model development.
Figure 3

Modelling steps.


The data collection phase began with collecting the historical dataset of monthly rainfall, temperature, humidity, and inflow rates (5,280 records) from observation and meteorological stations along the Terengganu River basin from 2000 to 2020. Phase II, data pre-processing, involves preparing and classifying the dataset to improve the quality of the input data and make it suitable for the subsequent phase. During this phase, the model is also trained using the initial dataset along with the test results obtained. In Phase III, the prediction model is constructed by methodically structuring the entire dataset to make it usable by the model. RapidMiner software was used in this phase to construct the three models and to implement the optimise parameters (Grid) algorithm. Finally, performance is evaluated using statistical measures such as R2, RMSE, MAE, %RE, and NSE, along with the accuracy derived from the validation process, and the results of the ODT model are compared with those of the GBT and RF models for validation purposes.

Model configuration

In this study, predictive modelling for estimating the monthly inflow rate has been investigated based on historical monthly rainfall, temperature, humidity, and river inflow rates of the Terengganu River in Malaysia. The monthly rainfall, temperature, humidity, and inflow data for all the selected stations were categorised before being fed into the model. The goal of categorising and preparing the data is to achieve results within the expected range while maintaining the original data distribution. It is crucial to ensure that the modelling process is not affected by overfitting. To mitigate overfitting, the dataset was partitioned into two distinct groups, an approach selected for its simplicity and resilience. The individual station data, along with the corresponding rainfall data, are then fed into the proposed ODT, RF, and GBT models for training and validating model performance. In addition to the approach used in the present study, many other forms of cross-validation exist, such as the hold-out method and the leave-one-out method (Arlot & Celisse 2010). As with all supervised model development, data must be separated into two groups: 'training' and 'validation'. Usually, most of the original data, ranging from 70 to 90%, is used for training, and the rest is reserved for validation. In this study, datasets containing 5,280 records from 2000 to 2020 were used to train and validate the models. The diagram in Figure 3 illustrates the different stages involved in predictive data mining modelling.
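
A two-group partition of this kind can be sketched as follows. The 80% training fraction, the random shuffling, and the seed are assumptions for illustration; the exact split used in this study is not stated here, and integer indices stand in for the 5,280 records.

```python
import random


def holdout_split(records, train_fraction=0.8, seed=42):
    """Shuffle a copy of the records and cut it into training and validation sets."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder for the 5,280 monthly records (indices stand in for real rows).
records = list(range(5280))
train, valid = holdout_split(records, train_fraction=0.8)
print(len(train), len(valid))
```

For time-ordered data, a chronological cut (earlier years for training, later years for validation) is a common alternative to shuffling, since it avoids training on months that follow the validation period.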

Model performance and evaluation

There are numerous metrics for evaluating the performance and accuracy of ML algorithms. To quantify the predictive performance of ODT, GBT, and RF for inflow forecasting, several statistical measures have been used: R2, RMSE, MAE, %RE, and NSE, together with accuracy, for assessing both the developed model and the comparison models. Because accuracy alone does not provide adequate detail about the inflow rate, relying only on the accuracy rate may not be appropriate. Accordingly, several measures have been implemented to evaluate performance, as given in Equations (1)–(6).

Performance indicators to evaluate the model are as follows:
(1)
(2)
(3)
(4)
(5)
(6)
where Ī is the mean of the actual discharges, If is the actual value, I0 is the forecast value, and n is the number of samples.
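
As a minimal sketch of how such indicators can be computed, the Python function below evaluates RMSE, MAE, R2, NSE, %RE, and MAPE for paired series of actual and forecast inflows. The function name, the dictionary keys, and the sample values are illustrative only. R2 is taken here as the squared Pearson correlation and %RE as the mean signed relative error; both are assumptions, and the exact forms used in Equations (1)–(6) may differ.

```python
import math


def inflow_metrics(actual, forecast):
    """Return common error indicators for paired actual/forecast inflows.

    Assumes all actual values are non-zero and not all identical.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    errors = [f - a for a, f in zip(actual, forecast)]

    ss_res = sum(e * e for e in errors)              # sum of squared errors
    ss_tot = sum((a - mean_a) ** 2 for a in actual)  # spread of the observations

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    nse = 1.0 - ss_res / ss_tot                      # Nash-Sutcliffe efficiency

    # R2 as the squared Pearson correlation (assumed definition).
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    r2 = cov * cov / (ss_tot * var_f)

    # %RE as the mean signed relative error (assumed definition).
    re_pct = 100.0 * sum(e / a for a, e in zip(actual, errors)) / n
    # MAPE: mean absolute percentage error.
    mape = 100.0 * sum(abs(e / a) for a, e in zip(actual, errors)) / n

    return {"RMSE": rmse, "MAE": mae, "R2": r2, "NSE": nse,
            "RE%": re_pct, "MAPE": mape}


# Hypothetical monthly inflows, for illustration only.
sample_actual = [10.0, 20.0, 30.0, 40.0]
sample_forecast = [12.0, 18.0, 33.0, 39.0]
sample_metrics = inflow_metrics(sample_actual, sample_forecast)
```

For these four sample points the function gives an MAE of 2.0 and an NSE of 0.964; a forecast identical to the actual series would give an NSE of exactly 1.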

RMSE measures the average magnitude of the errors between predicted and actual values, providing a way to assess the model's accuracy. When RMSE values are lower, the model is more accurate in predicting outcomes. Both RMSE and MAE are employed to quantify the disparities between predicted and actual inflow values. Moreover, a high positive R2 value, known as the coefficient of determination, indicates strong model performance. In addition, the model is accurate when the NSE values are close to 1. Table 1 shows general performance ratings.

Table 1

General performance ratings

R2 | Performance rating | NSE | Performance rating | MAE | Performance rating
0.75 < R2 ≤ 1 | Very good | 0.75 < NSE ≤ 1 | Very good | MER < 10 | Highly accurate
0.65 < R2 ≤ 0.75 | Good | 0.65 < NSE ≤ 0.75 | Good | 11 < MER ≤ 20 | Good
0.5 < R2 ≤ 0.65 | Satisfactory | 0.5 < NSE ≤ 0.65 | Satisfactory | 21 < MER ≤ 50 | Reasonable
R2 ≤ 0.5 | Unsatisfactory | NSE ≤ 0.5 | Unsatisfactory | 51 + | Inaccurate
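
The R2 and NSE bands of Table 1 can be expressed as a small lookup, sketched below in Python. The function name is hypothetical; the thresholds are those listed in the table.

```python
def rate_r2_or_nse(value):
    """Map an R2 or NSE value to the rating bands listed in Table 1."""
    if value > 1.0:
        raise ValueError("R2 and NSE cannot exceed 1")
    if value > 0.75:
        return "Very good"
    if value > 0.65:
        return "Good"
    if value > 0.5:
        return "Satisfactory"
    return "Unsatisfactory"


# Example: an NSE of 0.718 falls in the 0.65 < NSE <= 0.75 band.
example_rating = rate_r2_or_nse(0.718)
```

Because the upper bound of each band is inclusive, a value of exactly 0.75 is rated "Good" rather than "Very good".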

Three regression models were used to evaluate the river inflow for each station in the Terengganu River basin: ODT, which builds a tree-like model for decision-making; GBT, which builds trees sequentially so that each tree corrects the errors of the previous ones; and RF, which is known for its ensemble learning approach. Forecasting the inflow for each station helps prevent the model from memorising the data (overfitting) and enhances the model's performance by providing insights into future trends and patterns. There is no standard rule for dividing the data, which allows the partitioning strategy to be adapted to the specific characteristics of the dataset and the modelling objectives. Therefore, to build an optimised data-driven prediction model, the data partitioning strategy must be selected carefully during model development and evaluation to ensure the model's accuracy and generalisability. Several researchers have used varying proportions of data for the training and testing sets to explore how different splits affect model performance and to assess robustness across diverse datasets. Ridwan et al. (2021) used four regression models, namely Bayesian linear regression (BLR), decision forest regression (DFR), boosted decision tree regression (BDTR), and neural network regression (NNR), to predict the rainfall rate in Tasik Kenyir, Terengganu. The data were divided into 80–90% for training and 20–10% for testing so that the models were trained on a substantial portion of the data while a meaningful portion was reserved for evaluating their generalisation ability.

To determine the optimal data partitioning strategy, this study evaluated three approaches. In approach #1, 75% of the dataset is used for training, while the rest is allocated for validation. Approaches #2 and #3 use larger training shares (80–90%) and correspondingly smaller testing shares (20–10%). After testing all models with the different split ratios, performance was clearly best with approach #1 (75% training, 25% validation), which offers the highest accuracy rate and is the most suitable for all three models. Even though approaches #2 and #3 use more data in the training phase than approach #1, the models were less accurate with them, which indicates that increasing the training portion can be counterproductive. Therefore, selecting the optimal training-to-testing ratio is essential to achieve the most accurate simulation model. Figure 4 shows the distribution of the river inflow dataset during the training and testing periods for the three training approaches.
Figure 4

Data splitting during training and testing phases in the three different training approaches.
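
The partitioning step can be sketched in Python as follows. The sketch assumes a chronological split, in which the earliest records form the training set; the study does not state whether its split was chronological or random, and the function name and the 240-record series are hypothetical.

```python
def chronological_split(records, train_fraction):
    """Split time-ordered records into training and validation parts.

    The first `train_fraction` of the records is used for training and
    the remainder for validation, so no future data leaks into training.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = round(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Hypothetical series of 240 monthly records (20 years x 12 months).
monthly_records = list(range(240))

# Training shares of 75%, 80%, and 90%.
splits = {
    fraction: chronological_split(monthly_records, fraction)
    for fraction in (0.75, 0.80, 0.90)
}
```

With 240 records, the three training shares leave 60, 48, and 24 records for validation, respectively.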

The three models designed for river inflow forecasting were developed so that they could be compared fairly. The proposed model was assessed on its ability to predict river inflow values accurately from historical data, and its performance metrics were compared with those of the other models. This section analyses how accurate and reliable the model is and how the chosen training method affects the predicted results. The reliability of the methods was assessed with various statistical indices, including RMSE, MAE, and R2, during both the model training and testing phases, as listed in Table 2.

Table 2

Comparison of the performance indicator values (ACC, RMSE, MAE, R2, %RE, NSE, and MAPE) for the ODT, GBT, and RF methods

| Model | Station number | ACC % | RMSE | MAE | R2 | %RE | NSE | %MAPE |
| ODT |  | 98 | 5.358 | 3.099 | 0.960 | 0.1 | 0.961 | 8% |
| ODT |  | 98 | 4.267 | 2.867 | 0.943 | 0.0 | 0.944 | 10.6% |
| ODT |  | 96.50 | 1.0 | 0.436 | 0.960 | 0.0 | 0.960 | 13.8% |
| ODT |  | 94 | 25.267 | 12.425 | 0.951 | 40.9 | 0.960 | 18% |
| ODT |  | 94.12 | 4.949 | 2.171 | 0.910 | 0.1 | 0.911 | 10% |
| ODT |  | 78 | 42.764 | 25.319 | 0.840 | 0.1 | 0.891 | 9.% |
| ODT |  | 95.8 | 7.133 | 4.629 | 0.984 | 0.1 | 0.984 | 9.6% |
| ODT |  | 92.85 | 0.651 | 0.411 | 0.759 | 0.3 | 0.755 | 11.6% |
| ODT |  | 94.48 | 31.455 | 25.829 | 0.815 | 0.1 | 0.874 | 22% |
| GBT |  | 92.55 | 12.0 | 7.987 | 0.869 | 45 | 0.803 | 34% |
| GBT |  | 94.44 | 13.506 | 6.698 | 0.811 | 0.0 | 0.675 | 33.5% |
| GBT |  | 91.61 | 2.962 | 1.645 | 0.868 | 0.6 | 0.71 | 35.6% |
| GBT |  | 95.10 | 73.99 | 41.69 | 0.861 | 0.1 | 0.779 | 34% |
| GBT |  | 94.44 | 8.478 | 5.179 | 0.877 | 0.0 | 0.738 | 43% |
| GBT |  | 83.82 | 70.7 | 27.966 | 0.679 | 0.1 | 0.566 | 19.9% |
| GBT |  | 93.06 | 27.5 | 6.116 | 0.904 | 0.1 | 0.764 | 34.5% |
| GBT |  | 91.61 | 0.669 | 0.406 | 0.898 | 0.3 | 0.744 | 34% |
| GBT |  | 93.10 | 0.915 | 13.821 | 0.873 | 6.8 | 0.68 | 45% |
| RF |  | 92.02 | 5.833 | 3.788 | 0.969 | 24.6 | 0.954 | 14% |
| RF |  | 95.83 | 5.245 | 3.271 | 0.964 | 11.7 | 0.951 | 17.8% |
| RF |  | 96.50 | 1.100 | 0.634 | 0.976 | 1.9 | 0.96 | 16% |
| RF |  | 93.01 | 30.9 | 18.303 | 0.960 | 165.8 | 0.947 | 15% |
| RF |  | 93.06 | 3.386 | 2.039 | 0.969 | 14.9 | 0.958 | 16% |
| RF |  | 70.43 | 30.9 | 18.3 | 0.923 | 114 | 0.841 | 9.9% |
| RF |  | 95.83 | 10.256 | 6.116 | 0.974 | 22.7 | 0.967 | 17.8% |
| RF |  | 90.85 | 0.298 | 0.162 | 0.959 | 1.26 | 0.949 | 20% |
| RF |  | 94.48 | 21.339 | 13.28 | 0.951 | 22.3 | 0.942 | 22% |
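
Model-level averages can be obtained directly from the per-station rows of Table 2. The Python sketch below averages the R2 column for each model; the values are copied from the table, while the variable names are illustrative only.

```python
from statistics import mean

# R2 values per station, copied from Table 2 in row order.
r2_by_model = {
    "ODT": [0.960, 0.943, 0.960, 0.951, 0.910, 0.840, 0.984, 0.759, 0.815],
    "GBT": [0.869, 0.811, 0.868, 0.861, 0.877, 0.679, 0.904, 0.898, 0.873],
    "RF": [0.969, 0.964, 0.976, 0.960, 0.969, 0.923, 0.974, 0.959, 0.951],
}

# Simple arithmetic mean over the nine stations of each model.
average_r2 = {model: mean(values) for model, values in r2_by_model.items()}
```

The averages come out at roughly 0.90 for ODT, 0.85 for GBT, and 0.96 for RF.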

Equations (1)–(6) were applied to the ODT, GBT, and RF models for the nine streamflow stations on the Terengganu River. The models were built from the historical rainfall and streamflow dataset for 2000 to 2020 and trained with approach #1; the results are presented in Table 2. The RF model achieved the highest average R2, 0.96, compared with 0.90 for the ODT model and 0.84 for the GBT model. The ODT model has an average relative error (RE%) of 0.4%, indicating little difference between the actual and predicted values. In contrast, the RF model has an average RE% of 42.1%, which is considerable and potentially unacceptable in certain situations, and the RE% of the GBT model was 5.48% greater than that of the ODT model. Overall, the GBT model shows slightly lower predictive performance than the RF and ODT models, although the disparity is minor and the approaches may produce almost indistinguishable outcomes. In terms of accuracy (ACC), the ODT model outperforms the GBT and RF models, with an average of 94% compared with 92 and 91%, respectively. In other words, 94% of the 2,160 recorded river inflow values were forecast accurately, giving a trained model with an error rate of only 6%.

These outcomes compare favourably with those of earlier studies with comparable aims. Munawar et al. (2021) employed a CNN to map floods and achieved an average accuracy of 84%. Lopez-Fuentes et al. (2017) used CNN models to predict floods and landslides, achieving an average accuracy of 83.96%. Flood mapping with texture features and RF on RGB images reached an accuracy of 87.5% (Feng et al. 2015). Elkhrachy (2015) employed the analytical hierarchy process to determine the relative influence weights of flood-causing factors, achieving an accuracy of 84.4%. A flood susceptibility mapping system combined with GIS, which used different kernel types with SVM classifiers, reached an accuracy of 84.97% (Tehrany et al. 2015). Lee et al. (2017) reported that the RF model achieved an accuracy of 78.78% for regression and 79.18% for classification, while the boosted tree model achieved a validation accuracy of 77.55% for regression and 77.26% for classification. Ridwan et al. (2021) employed multiple models to forecast rainfall in Tasik Kenyir, Terengganu, and reported rainfall prediction coefficients ranging from 0.5 to 0.9, with the highest values for daily (0.97), weekly (0.98), 10-day (0.98), and monthly (0.99) predictions. In that study, the DFR model achieved an MAE of 0.094 and an RMSE of 0.156; the BDTR model, an MAE of 0.064 and an RMSE of 0.117; the NNR model, an MAE of 0.389 and an RMSE of 0.672; and the BLR model, an MAE of 0.417 and an RMSE of 0.674.

In addition, a study conducted in 2024 utilised RF and MLP models, with climatic data as input, to forecast monthly streamflow. The RF model had significant efficacy in capturing the variability of low flow, as indicated by R2, NSE, and RMSE values of 0.90, 0.89, and 4.53, respectively. In contrast, the MLR model exhibited somewhat lower predictive accuracy than the RF model, as indicated by R2, NSE, and RMSE values of 0.63, 0.60, and 8.02, respectively. Both models show a high level of efficacy in integrating all climate factors. Nevertheless, the unrefined models slightly outperformed the trained model, indicating overfitting and validating the reasonableness of the variables (Xu et al. 2024).

In this investigation, the ODT, GBT, and RF models achieved average NSE values of 0.92, 0.718, and 0.94, respectively. The NSE values of the ODT and RF models (0.92 and 0.94) fall in the 'very good' range of Table 1, whereas the NSE of the GBT model (0.718) falls in the 'good' range (0.65 < NSE ≤ 0.75). The corresponding MAE values were 8.58, 12.4, and 7.3. By the criteria in Table 1, the ODT and RF models are therefore highly accurate in terms of MAE (below 10), while the GBT model falls in the 11–20 band, rated good. Overall, all objective functions achieved ratings between very good and good, indicating satisfactory performance. Table 2 also presents the mean absolute percentage error (MAPE) for the ODT, RF, and GBT models. MAPE measures the average percentage error of a model's predictions. The maximum MAPE value obtained was 35.6%. MAPE values are classified as follows: below 10% is considered great; between 10 and 20%, good; between 20 and 50%, acceptable; and above 50%, inaccurate (Moreno et al. 2013; Shrestha et al. 2021). For the ODT model, three stations have MAPE values below 10%, rated great, and the remaining stations have values below 20%, rated good. The RF model values fall between great and good, while the GBT model values fall in the acceptable range.
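
The MAPE categories described above can be written as a simple classifier, sketched below in Python. The function name is hypothetical, and assigning the boundary values of 20% and 50% to the lower category is an assumption, since the text does not say which side they belong to.

```python
def classify_mape(mape_percent):
    """Classify a MAPE value (in percent) into the categories in the text."""
    if mape_percent < 0:
        raise ValueError("MAPE cannot be negative")
    if mape_percent < 10:
        return "great"
    if mape_percent <= 20:   # boundary handling is an assumption
        return "good"
    if mape_percent <= 50:   # boundary handling is an assumption
        return "acceptable"
    return "inaccurate"


# Example: a MAPE of 35.6% falls in the acceptable range.
example_category = classify_mape(35.6)
```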

Figure 5(a)–5(d) compares the actual streamflow at the first station with the streamflow predicted by the three models (ODT, GBT, and RF) over a period of 20 years. According to the plots, the river reached its highest level and overflowed at the end of 2014, when the majority of rivers in Terengganu exceeded their danger thresholds. On December 20, 2014, the Hydrology Division Flood Operations Room of the Terengganu DID reported that the water level of Sungai Dungun at Kuala Jengai was 21.6 m, exceeding the danger level of 21 m by 0.6 m. Similarly, Sungai Dungun at Jerangau Bridge recorded a water level of 13.42 m, surpassing the danger level of 12.5 m by 0.92 m. Figure 5(a) compares the values predicted by the ODT model with the actual values and shows a strong association between them. Figure 5(b) shows the GBT model, whose predictions only partially follow the actual inflow data. In particular, the GBT model matched the dangerous inflow levels of December 2014 less closely than the ODT model did, which is consistent with the statistical performance indicators in Table 2. Figure 5(c) shows the correlation between the predicted and actual values for the RF model; the plot shows a visible difference between the ODT and RF predictions. Overall, there is a reliable correlation between the predicted and actual values, which supports the predictive capability of the ODT and RF models. Figure 5(d) compares the predictions of all three models with the actual values. Consistent with the statistical indicators in Table 2, the ODT and RF predictions track the actual values more closely than the GBT predictions in most cases, indicating their superior performance.
Figure 5

(a) Comparison of predicted and actual inflow using ODT. (b) Comparison of predicted and actual inflow using GBT. (c) Comparison of predicted and observed inflow using RF. (d) Plot of a comparison between the ODT, GBT, and RF simulated flows vs the actual monthly streamflow over a period of 20 years (2000–2020).

Figure 6 illustrates the results of the scatter plot diagram for all streamflow stations analysed in this study. The predictions were generated with three distinct predictive models: ODT, RF, and GBT. The scatter plot shows that the ODT, GBT, and RF models yield consistent and comparable results across all river inflow stations. Table 2 provides the correlation indicators, which quantify the agreement between the actual and predicted values. The correlation coefficient ranges from −1.0 to 1.0 and measures the degree of linear relationship between the observed and predicted data: a coefficient of −1.0 signifies a perfect negative correlation, while a coefficient of 1.0 denotes a perfect positive correlation. Values outside this range indicate an error in the calculation. The R2 values, which reflect the agreement between the actual monthly inflow and the inflow predicted by the ODT, GBT, and RF models, are 90.2, 84.8, and 96.0%, respectively, indicating a strong positive correlation and underscoring the models' robustness in predicting inflow patterns. Examining the consistency between predicted and actual inflow values during testing enables a detailed comparative analysis of the results. These models demonstrated better predictive performance, as measured by R2, than the models employed in preceding studies. The data presented in Table 2 highlight the accuracy and reliability of the ODT, GBT, and RF models in forecasting river inflow.
Figure 6

The Taylor diagram results for the ODT model along with RF and GBT models.


Recent studies have further corroborated the efficacy of ML models in streamflow forecasting. For instance, Dang et al. (2024) investigated nine models for predicting floods: AdaBoost (AB), DT, Gaussian process (GP), K-nearest neighbours, linear discriminant analysis, NB, NN, RF, and SVM. Three of these models, GP, RF, and NN, outperformed the rest, with R2 values of 0.997, 0.996, and 0.995, respectively. Similarly, Xu et al. (2024) used RF and MLP models, with climatic data as input, to forecast monthly streamflow and obtained R2 values of 0.90 and 0.63, respectively. Li et al. (2023) employed MLR and RF models to forecast the monthly water deficit index. Both models performed well at all 44 sites, and the RF model generally outperformed the MLR model, with a higher coefficient of determination (R2 > 0.8) at 38 locations. These studies support the claim that the models used in this study exhibit superior performance in terms of R2 values.

The implications of these findings are significant for hydrological modelling and water resource management. The high R2 values suggest that the ODT, GBT, and RF models are highly effective tools for predicting streamflow, which is critical for planning and managing water resources in various hydrological contexts. These models' predictive capabilities can inform decision-making processes, enhance the precision of hydrological forecasts, and ultimately contribute to more effective and sustainable water resource management practices. Figure 6 shows the Taylor diagram results for the ODT model along with RF and GBT models for all the stations.
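
A Taylor diagram summarises three statistics for each model: the standard deviation of the simulated series, its correlation with the observations, and the centred root-mean-square difference. The Python sketch below computes these quantities; the function name and the sample values are hypothetical and are not taken from the study's data.

```python
import math


def taylor_statistics(observed, simulated):
    """Return (std_obs, std_sim, correlation, centred RMS difference)."""
    if len(observed) != len(simulated) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    dev_o = [o - mean_o for o in observed]    # anomalies of the observations
    dev_s = [s - mean_s for s in simulated]   # anomalies of the simulation

    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_s = math.sqrt(sum(d * d for d in dev_s) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_s)) / (n * std_o * std_s)
    # Centred RMS difference: RMS of the difference between the anomalies.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_s)) / n)
    return std_o, std_s, corr, crmsd


# Hypothetical observed and simulated inflows, for illustration only.
std_o, std_s, corr, crmsd = taylor_statistics(
    [10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0]
)
```

The three statistics are linked by the law-of-cosines relation crmsd² = std_o² + std_s² − 2·std_o·std_s·corr, which is what allows them to be drawn on a single two-dimensional diagram.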

Performance of the ODT, RF, and GBT models on streamflow prediction

In this study, two of the three models, ODT and RF, demonstrated excellent to good performance in monthly streamflow prediction, with RF outperforming ODT in certain cases, while GBT exhibited a satisfactory level of performance. These findings support the conclusion of Yan et al. (2022) that RF outperformed other models in streamflow prediction, and they highlight the importance of comparative analyses of model performance. As suggested by Zhang et al. (2018), the superior performance of RF can be attributed to its capability to model complex relationships between input variables and streamflow. GBT, by contrast, fits its trees sequentially to the residual errors of the preceding trees, which can make it more sensitive to noisy data and to hyperparameter settings and may limit its accuracy in complex systems such as streamflow. ODT and RF excel in identifying and utilising non-linear interactions between input variables, thereby enhancing the accuracy of their predictions (Yang et al. 2017). Tyralis et al. (2019) provided a concise analysis that underscored RF's strong predictive abilities in addressing complex hydrological issues, including rainfall–runoff forecasting, streamflow prediction, and groundwater modelling. Additionally, ODT and RF are noted for their superior performance in predicting both low and high flows compared with GBT. This advantage is attributed to their flexible non-linear fitting capabilities, which allow more accurate and adaptable predictions across different sections of the data.

Study limitations and uncertainty analysis

Although this study provides essential insights, several limitations should be recognised. First, a variety of meteorological factors that affect streamflow were initially taken into account, but several of them, such as temperature and humidity, were eliminated from the analysis because they were found to have a minimal impact on the results. In addition, the study did not analyse key climate change-related factors such as glaciers and permafrost owing to a lack of available data. For example, rising temperatures can cause permafrost to break down, which can change the way groundwater and surface water interact and affect soil moisture and streamflow dynamics. Furthermore, the 50-year duration of the meteorological data might not sufficiently capture the long-term patterns and cyclic properties of specific factors, perhaps leading to an underestimation of the relationships between rainfall rates and streamflow. More extensive datasets would improve the strength and reliability of the findings. Although historical rainfall and inflow rates alone were used successfully to forecast streamflow, these parameters may not completely explain the fluctuations in streamflow, which is likely to be influenced by additional factors and their intricate relationships. Therefore, it is recommended that future studies use hydrological models that integrate physical mechanisms to further examine the impacts of climatic conditions on streamflow. Moreover, although human-induced effects on the study area are assumed to be minimal, disregarding their interactions with climatic factors, such as changes in precipitation patterns caused by human-made aerosol emissions (Jiang et al. 2023), could introduce some bias or uncertainty into the evaluations.

This study assessed streamflow forecasting using an ODT model alongside RF and GBT models, based on a historical monthly rainfall and inflow database spanning from 1990 to 2020 for the Terengganu River basin. Rainfall is a well-established contributing factor to natural disasters, such as the significant flood in Terengganu in December 2014. The results indicated that the ODT and RF models provided more accurate predictions of inflow rates from historical rainfall and inflow data than the GBT model: the R2 values for the ODT, GBT, and RF models were 90.2, 84.8, and 96.0%, respectively, and the MAE values were 8.58, 12.4, and 7.3, respectively. The NSE values of the ODT and RF models fell between 0.92 and 0.94.

ODT and RF methods are recommended for inflow prediction and can be utilised for future flood-related references, not only due to their robust prediction capabilities over GBT and previous methods but also because of their transparent model structures. This transparency allows flood management authorities to monitor and customise inputs based on regional needs, as well as provide early warnings for potential floods, thereby preserving lives and property. Future research could focus on employing ML models to address the non-linearity in streamflow models, particularly CNNs for image processing tasks and other advanced hybrid ML models.

The authors would like to express their gratitude to the Higher Institution Centre of Excellence (HICoE), Ministry of Higher Education (MOHE), Malaysia under the project code 2024001HICOE as referenced in JPT(BPKI)1000/016/018/34(5).

Osama A. Abozweita: Investigation, Methodology, Software, Formal analysis, Visualization, Writing - Original draft preparation

Ali Najah Ahmed: Conceptualization, Methodology, Formal analysis, Supervision, Writing - Review & Editing

Lariyah Bte Mohd Sidek: Methodology, Supervision, Validation, Writing - Review & Editing, Resources

Hidayah Bte Basri: Supervision, Methodology, Validation, Writing - Review & Editing, Resources

Mohd Hafiz Bin Zawawi: Methodology, Validation, Writing - Review & Editing, Resources

Yuk Feng Huang: Data Curation, Methodology, Validation, Writing - Review & Editing

Ahmed El-Shafie: Conceptualization, Methodology, Validation, Writing - Review & Editing

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Adnan R. M., Mostafa R. R., Islam A. R. M. T., Kisi O., Kuriqi A. & Heddam S. (2021) Estimating reference evapotranspiration using hybrid adaptive fuzzy inferencing coupled with heuristic algorithms, Computers and Electronics in Agriculture, 191, 106541.
Adnan R. M., Dai H.-L., Mostafa R. R., Islam A. R. M. T., Kisi O., Heddam S. & Zounemat-Kermani M. (2023a) Modelling groundwater level fluctuations by ELM merged advanced metaheuristic algorithms using hydroclimatic data, Geocarto International, 38 (1), 2158951.
Adnan R. M., Mostafa R. R., Dai H.-L., Heddam S., Kuriqi A. & Kisi O. (2023b) Pan evaporation estimation by relevance vector machine tuned with new metaheuristic algorithms using limited climatic data, Engineering Applications of Computational Fluid Mechanics, 17 (1), 2192258.
Aerts J. C. J. H., Botzen W. J., Clarke K. C., Cutter S. L., Hall J. W., Merz B. & Kunreuther H. (2018) Integrating human behaviour dynamics into flood disaster risk assessment, Nature Climate Change, 8 (3), 193–199.
Alam A., Ahmed B. & Sammonds P. (2021) Flash flood susceptibility assessment using the parameters of drainage basin morphometry in SE Bangladesh, Quaternary International, 575, 295–307.
Arlot S. & Celisse A. (2010) A survey of cross-validation procedures for model selection, Statistics Surveys, 4, 40–79.
Bertsimas D. & Dunn J. (2017) Optimal classification trees, Machine Learning, 106, 1039–1082.
Biau G., Cadre B. & Rouvière L. (2019) Accelerated gradient boosting, Machine Learning, 108, 971–992.
Boucher M., Quilty J. & Adamowski J. (2020) Data assimilation for streamflow forecasting using extreme learning machines and multilayer perceptrons, Water Resources Research, 56 (6), e2019WR026226.
Chen W., Li Y., Xue W., Shahabi H., Li S., Hong H., Wang X., Bian H., Zhang S. & Pradhan B. (2020) Modeling flood susceptibility using data-driven approaches of naïve Bayes tree, alternating decision tree, and random forest methods, Science of the Total Environment, 701, 134979.
Cheng M., Fang F., Kinouchi T., Navon I. M. & Pain C. C. (2020) Long lead-time daily and monthly streamflow forecasting using machine learning methods, Journal of Hydrology, 590, 125376.
Choubin B., Moradi E., Golshan M., Adamowski J., Sajedi-Hosseini F. & Mosavi A. (2019) An ensemble prediction of flood susceptibility using multivariate discriminant analysis, classification and regression trees, and support vector machines, Science of the Total Environment, 651, 2087–2096.
Chu H., Wei J., Wu W., Jiang Y., Chu Q. & Meng X. (2021) A classification-based deep belief networks model framework for daily streamflow forecasting, Journal of Hydrology, 595, 125967.
Cutler A. (2010) Random Forests for Regression and Classification. Ovronnaz: Utah State University.
Dai Z., Zhang M., Nedjah N., Xu D. & Ye F. (2023) A hydrological data prediction model based on LSTM with attention mechanism, Water, 15 (4), 670.
Dang T. Q., Tran B. H., Le Q. N., Dang T. D., Tanim A. H., Pham Q. B. & Anh D. T. (2024) Application of machine learning-based surrogate models for urban flood depth modeling in Ho Chi Minh City, Vietnam, Applied Soft Computing, 150, 111031.
Di Nunno F., Granata F., Gargano R. & de Marinis G. (2021) Prediction of spring flows using nonlinear autoregressive exogenous (NARX) neural network models, Environmental Monitoring and Assessment, 193 (6), 350.
Elbeltagi A., Di Nunno F., Kushwaha N. L., De Marinis G. & Granata F. (2022) River flow rate prediction in the Des Moines watershed (Iowa, USA): A machine learning approach, Stochastic Environmental Research and Risk Assessment, 36 (11), 3835–3855.
Essam Y., Huang Y. F., Ng J. L., Birima A. H., Ahmed A. N. & El-Shafie A. (2022) Predicting streamflow in Peninsular Malaysia using support vector machine and deep learning algorithms, Scientific Reports, 12 (1), 3883.
Everaert G., Bennetsen E. & Goethals P. L. M. (2016) An applicability index for reliable and applicable decision trees in water quality modelling, Ecological Informatics, 32, 1–6.
Friedman J. H. (2001) Greedy function approximation: A gradient boosting machine, Annals of Statistics, 29 (5), 1189–1232.
Giovannettone J., Sangameswaran S., Maderia C. & Batten B. (2020) Spatial analysis of flood susceptibility throughout Currituck county, North Carolina, Journal of Hydrologic Engineering, 25 (8), 5020021.
Goldstein B. A., Polley E. C. & Briggs F. B. S. (2011) Random forests for genetic association studies, Statistical Applications in Genetics and Molecular Biology, 10 (1).
Goodarzi M. R., Niazkar M., Barzkar A. & Niknam A. R. R. (2024) Assessment of machine learning models for short-term streamflow estimation: The case of Dez River in Iran, Sustainable Water Resources Management, 10 (1), 33.
Hasan M. H., Ahmed A., Nafee K. M. & Hossen M. A. (2023) Use of machine learning algorithms to assess flood susceptibility in the coastal area of Bangladesh, Ocean & Coastal Management, 236, 106503.
Ho J. Y., Afan H. A., El-Shafie A. H., Koting S. B., Mohd N. S., Jaafar W. Z. B. & El-Shafie A. (2019) Towards a time and cost effective approach to water quality index class prediction, Journal of Hydrology, 575, 148–165. https://doi.org/10.1016/j.jhydrol.2019.05.016.
Hou J., Zhou N., Chen G., Huang M. & Bai G. (2021) Rapid forecasting of urban flood inundation using multiple machine learning models, Natural Hazards, 108 (2), 2335–2356.
Hu Z., Yang J., Wang S. & Yang Q. (2016) ‘A hybrid modified DEA efficient evaluation method in electric power enterprises’, 2016 3rd International conference on informative and cybernetics for computational social systems (ICCSS), Jinzhou, China, 26–29 August 2016. IEEE, pp. 283–287.
Ibrahim K. S. M. H., Huang Y. F., Ahmed A. N., Koo C. H. & El-Shafie A. (2022) A review of the hybrid artificial intelligence and optimization modelling of hydrological streamflow forecasting, Alexandria Engineering Journal, 61 (1), 279–303.
Ikram R. M. A., Mostafa R. R., Chen Z., Parmar K. S., Kisi O. & Zounemat-Kermani M. (2023) Water temperature prediction using improved deep learning methods through reptile search algorithm and weighted mean of vectors optimizer, Journal of Marine Science and Engineering, 11 (2), 259.
Jahangir M. S., You J. & Quilty J. (2023) A quantile-based encoder-decoder framework for multi-step ahead runoff forecasting, Journal of Hydrology, 619, 129269.
Janizadeh S., Avand M., Jaafari A., Phong T. V., Bayat M., Ahmadisharaf E., Prakash I., Pham B. & Lee S. (2019) Prediction success of machine learning methods for flash flood susceptibility mapping in the Tafresh watershed, Iran, Sustainability, 11 (19), 5426.
Jiang
J.
,
Zhou
T.
,
Qian
Y.
,
Li
C.
,
Song
F.
,
Li
H.
&
Chen
Z.
(
2023
)
Precipitation regime changes in High Mountain Asia driven by cleaner air
,
Nature
,
623
(
7987
),
544
549
.
Juneng
L.
,
Tangang
F. T.
&
Reason
C. J. C.
(
2007
)
Numerical case study of an extreme rainfall event during 9–11 December 2004 over the east coast of Peninsular Malaysia
,
Meteorology and Atmospheric Physics
,
98
,
81
98
.
Katipoğlu
O. M.
,
Aktürk
G.
,
Kılınç
H. Ç.
,
Terzioğlu
Z. Ö.
&
Keblouti
M.
(
2024
)
Suspended sediment load prediction in river systems via shuffled frog-leaping algorithm and neural network
,
Earth Science Informatics
,
17
,
1
27
.
Khairudin
N. B. M.
,
Mustapha
N. B.
,
Aris
T. N. B. M.
&
Zolkepli
M. B.
(
2020
) ‘
Comparison of machine learning models for rainfall forecasting
’,
2020 International conference on computer science and its application in agriculture (ICOSICA)
,
Bogor, Indonesia, 16-17 September 2020
IEEE
, pp.
1
5
.
Khan
S.
,
Khan
A. U.
,
Khan
M.
,
Khan
F. A.
,
Khan
S.
&
Khan
J.
(
2023
)
Intercomparison of SWAT and ANN techniques in simulating streamflows in the Astore Basin of the Upper Indus
,
Water Science & Technology
,
88
(
7
),
1847
1862
.
Khosravi
K.
,
Pham
B. T.
,
Chapi
K.
,
Shirzadi
A.
,
Shahabi
H.
,
Revhaug
I. …
&
Tien Bui
D.
(
2018
)
A comparative assessment of decision trees algorithms for flash flood susceptibility modeling at Haraz watershed, Northern Iran
,
Science of the Total Environment
,
627
,
744
755
.
doi:10.1016/j.scitotenv.2018.01.266
.
Lee
S.
,
Kim
J.-C.
,
Jung
H.-S.
,
Lee
M. J.
&
Lee
S.
(
2017
)
Spatial prediction of flood susceptibility using random-forest and boosted-tree models in Seoul metropolitan city, Korea
,
Geomatics, Natural Hazards and Risk
,
8
(
2
),
1185
1203
.
Lee
D.
,
Shin
J.
,
Kim
T.
,
Lee
S.
,
Kim
D.
,
Park
Y.
&
Cha
Y.
(
2023
)
Hybrid model for daily streamflow and phosphorus load prediction
,
Water Science & Technology
,
88
(
4
),
975
990
.
Li
Y.
,
Wei
K.
,
Chen
K.
,
He
J.
,
Zhao
Y.
,
Yang
G.
&
Wang
L.
(
2023
)
Forecasting monthly water deficit based on multi-variable linear regression and random forest models
,
Water
,
15
(
6
),
1075
.
Lopez-Fuentes
L.
,
van de Weijer
J.
,
Bolanos
M.
&
Skinnemoen
H.
(
2017
)
Multi-modal deep learning approach for flood detection
,
MediaEval
,
17
,
13
15
.
Maier
H.
&
Dandy
G.
(
2000
)
Neural networks for the prediction and forecasting of water resources variables: A review of modelling issues and applications
,
Environmental Modelling and Software
,
15
,
101
124
.
doi:10.1016/S1364-8152(99)00007-9
.
Mohammadi
B.
,
Ahmadi
F.
,
Mehdizadeh
S.
,
Guan
Y.
,
Pham
Q. B.
,
Linh
N. T. T.
&
Tri
D. Q.
(
2020
)
Developing novel robust models to improve the accuracy of daily streamflow modeling
,
Water Resources Management
,
34
,
3387
3409
.
Moreno
J. J. M.
,
Pol
A. P.
,
Abad
A. S.
&
Blasco
B. C.
(
2013
)
Using the R-MAPE index as a resistant measure of forecast accuracy
,
Psicothema
,
25
(
4
),
500
506
.
Mostafa
R. R.
,
Kisi
O.
,
Adnan
R. M.
,
Sadeghifar
T.
&
Kuriqi
A.
(
2023
)
Modeling potential evapotranspiration by improved machine learning methods using limited climatic data
,
Water
,
15
(
3
),
486
.
Munawar
H. S.
,
Hammad
A.
,
Ullah
F.
&
Ali
T. H.
(
2019
). ‘
After the flood: A novel application of image processing and machine learning for post-flood disaster management
’,
Proceedings of the 2nd International Conference on Sustainable Development in Civil Engineering (ICSDC 2019)
,
Jamshoro, Pakistan
, pp.
5
7
.
Munawar
H. S.
,
Ullah
F.
,
Qayyum
S.
&
Heravi
A.
(
2021
)
Application of deep learning on UAV-based aerial images for flood detection
,
Smart Cities
,
4
(
3
),
1220
1242
.
Naganna
S. R.
,
Beyaztas
B. H.
,
Bokde
N.
&
Armanuos
A. M.
(
2020
)
On the evaluation of the gradient tree boosting model for groundwater level forecasting
,
Knowledge-Based Engineering and Sciences
,
1
(
01
),
48
57
.
Ng
K. W.
,
Huang
Y. F.
,
Koo
C. H.
,
Chong
K. L.
,
El-Shafie
A.
&
Ahmed
A. N.
(
2023
)
A review of hybrid deep learning applications for streamflow forecasting
,
Journal of Hydrology
,
625
,
130141
.
Ni
L.
,
Wang
D.
,
Wu
J.
,
Wang
Y.
,
Tao
Y.
,
Zhang
J.
&
Liu
J.
(
2020
)
Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model
,
Journal of Hydrology
,
586
,
124901
.
Pham
B. T.
,
Jaafari
A.
,
Phong
T. V.
,
Yen
H. P. H.
,
Tuyen
T. T.
,
Luong
V. V.
&
Foong
L. K.
(
2021
)
Improved flood susceptibility mapping using a best first decision tree integrated with ensemble learning techniques
,
Geoscience Frontiers
,
12
(
3
),
101105
.
doi:10.1016/j.gsf.2020.11.003
.
Ridwan
W. M.
,
Sapitang
M.
,
Aziz
A.
,
Kushiar
K. F.
,
Ahmed
A. N.
&
El-Shafie
A.
(
2021
)
Rainfall forecasting model using machine learning methods: Case study Terengganu, Malaysia
,
Ain Shams Engineering Journal
,
12
(
2
),
1651
1663
.
Sahour
H.
,
Gholami
V.
,
Torkaman
J.
,
Vazifedan
M.
&
Saeedi
S.
(
2021
)
Random forest and extreme gradient boosting algorithms for streamflow modeling using vessel features and tree-rings
,
Environmental Earth Sciences
,
80
,
1
14
.
Schoppa
L.
,
Disse
M.
&
Bachmair
S.
(
2020
)
Evaluating the performance of random forest for large-scale flood discharge simulation
,
Journal of Hydrology
,
590
,
125531
.
Schulte
J. A.
(
2017
)
Sub-ensemble coastal flood forecasting: A case study of Hurricane Sandy
,
Journal of Marine Science and Engineering
,
5
(
4
),
59
.
Shada
B.
,
Chithra
N. R.
&
Thampi
S. G.
(
2022
)
Hourly flood forecasting using hybrid wavelet-SVM
,
Journal of Soft Computing in Civil Engineering
,
6
(
2
),
1
20
.
Shrestha
B. B.
,
Kawasaki
A.
&
Zin
W. W.
(
2021
)
Development of flood damage functions for agricultural crops and their applicability in regions of Asia
,
Journal of Hydrology: Regional Studies
,
36
,
100872
.
Tehrany
M. S.
&
Kumar
L.
(
2018
)
The application of a Dempster–Shafer-based evidential belief function in flood susceptibility mapping and comparison with frequency ratio and logistic regression methods
,
Environmental Earth Sciences
,
77
,
1
24
.
Wang
F.
&
Zai
Y.
(
2023
)
Image segmentation and flow prediction of digital rock with U-net network
,
Advances in Water Resources
,
172
,
104384
.
Wang
X.
,
Kinsland
G.
,
Poudel
D.
&
Fenech
A.
(
2019
)
Urban flood prediction under heavy precipitation
,
Journal of Hydrology
,
577
,
123984
.
Wang
K.
,
Band
S. S.
,
Ameri
R.
,
Biyari
M.
,
Hai
T.
,
Hsu
C.-C. …
&
Mosavi
A.
(
2022a
)
Performance improvement of machine learning models via wavelet theory in estimating monthly river streamflow
,
Engineering Applications of Computational Fluid Mechanics
,
16
(
1
),
1833
1848
.
Wang
S.
,
Peng
H.
,
Hu
Q.
&
Jiang
M.
(
2022c
)
Analysis of runoff generation driving factors based on hydrological model and interpretable machine learning method
,
Journal of Hydrology: Regional Studies
,
42
,
101139
.
Xu
R.
,
Qiu
D.
,
Gao
P.
,
Wu
C.
,
Mu
X.
&
Ismail
M.
(
2024
)
Prediction of streamflow based on the long-term response of streamflow to climatic factors in the source region of the Yellow River
,
Journal of Hydrology: Regional Studies
,
52
,
101681
.
Yan
L.
,
Lei
Q.
,
Jiang
C.
,
Yan
P.
,
Ren
Z.
,
Liu
B.
&
Liu
Z.
(
2022
)
Climate-informed monthly runoff prediction model using machine learning and feature importance analysis
,
Frontiers in Environmental Science
,
10
,
1049840
.
Yang
T.
,
Asanjan
A. A.
,
Welles
E.
,
Gao
X.
,
Sorooshian
S.
&
Liu
X.
(
2017
)
Developing reservoir monthly inflow forecasts using artificial intelligence and climate phenomenon information
,
Water Resources Research
,
53
(
4
),
2786
2812
.
Yuan
X.
,
Chen
C.
,
Lei
X.
,
Yuan
Y.
&
Muhammad Adnan
R.
(
2018
)
Monthly runoff forecasting based on LSTM–ALO model
,
Stochastic Environmental Research and Risk Assessment
,
32
,
2199
2212
.
Zahura
F. T.
&
Goodall
J. L.
(
2022
)
Predicting combined tidal and pluvial flood inundation using a machine learning surrogate model
,
Journal of Hydrology: Regional Studies
,
41
,
101087
.
Zhang
Y.
,
Chiew
F. H. S.
,
Li
M.
&
Post
D.
(
2018
)
Predicting runoff signatures using regression and hydrological modeling approaches
,
Water Resources Research
,
54
(
10
),
7859
7878
.
Zhang
X.
,
Liu
P.
,
Cheng
L.
,
Xie
K.
,
Han
D.
&
Zhou
L.
(
2021
)
The temporal variations in runoff-generation parameters of the Xinanjiang model due to human activities: A case study in the upper Yangtze River Basin, China
,
Journal of Hydrology: Regional Studies
,
37
,
100910
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).