Pakistan is highly prone to devastating floods, as seen in the June 2010 and September 2022 disasters. The 2010 floods affected 20 million people, causing 1,985 fatalities. In 2022, approximately 33 million individuals were impacted, with multiple districts declared as ‘calamity struck’ by the National Disaster Management Authority (NDMA). Since 14 June 2022, these floods have claimed approximately 1,400 lives. Hence, there is an urgent need to develop an accurate and efficient flood risk prediction system for early warning purposes in Pakistan. This research aims to address this need by developing a predictive model using machine learning (ML) techniques, namely k-nearest neighbors (KNN), support vector machine (SVM), Naive Bayes (NB), artificial neural network (ANN), and random forest (RF), for flood risk prediction in the Indus Basin of Pakistan. The performance of each model was evaluated based on accuracy, precision, recall, and F-measure. The findings revealed that SVM outperformed the other models, achieving an accuracy of 82.40%. Consequently, the results of this study can provide valuable insights for organizations to proactively mitigate frequent flood occurrences in Pakistan, aiding preventive actions.

  • ML models KNN, SVM, NB, ANN, and RF were used to predict floods in the Indus River Basin, Pakistan.

  • The dataset consisted of five features: date, precipitation, temperature, monthly discharge, and flood occurrence, covering the years 1985 to 2013.

  • Performance evaluation of the models included metrics such as accuracy, precision, recall, and F-measure.

  • SVM demonstrated the highest accuracy among all the models tested.

Flooding is a common and devastating natural disaster worldwide. Countries prone to floods face significant human casualties, environmental damage, financial losses, and property destruction. Governments are urged to create accurate flood-prone area maps and implement long-term strategies focusing on mitigation, protection, and preparedness to manage flood risks effectively (Serra-Llobet et al. 2013). Flood prediction models play a crucial role in assessing hazards and managing severe flood events (Aljohani et al. 2023). Accurate flood forecasts are highly valuable for water resource management, policy recommendations, and future evacuation modeling (Mosavi et al. 2018a). Improved systems for short-term and long-term prediction of floods and other hydrological events are therefore needed to mitigate damage (Mosavi et al. 2018b). However, predicting when and where a flood will occur is fundamentally complex due to the dynamic nature of climate conditions (Manomy 2020). Current flood prediction models rely on simplified assumptions and are predominantly data-driven (Ighile et al. 2022). Thus, to replicate the complicated mathematical representations of physical processes and basin behavior, such models benefit from specialized methodologies such as event-driven, empirical black box, lumped and distributed, stochastic, deterministic, continuous, and hybrid approaches (Mosavi et al. 2018a).

Flood prediction research has been ongoing for several decades, yet it remains one of the most challenging issues to address. In general, two techniques are used in flood prediction scenarios. The first is physical principle-based modeling (Maspo et al. 2020), which includes models based on the principles of physical processes, such as the rainfall-runoff model, the hydrodynamic model, and the soil and water assessment tool (Fernández-Pato et al. 2016). Although previous research has demonstrated that physical principle-based models can accurately predict floods under a variety of conditions, these models are complicated and computationally expensive. They also need a large number of input parameters that characterize the physical properties of the basin, and the necessary information is not always accessible and can be difficult to gather. In addition, using a physical model requires a solid understanding of hydrology, technical skill, and the capacity to run complicated computations (Kumar et al. 2023).

In addition to physical models, data-driven models have a long history in flood modeling and have recently gained prominence. To give better insight, data-driven techniques of prediction incorporate observed climatic indices and hydro-meteorological characteristics (Chu et al. 2020). Among the standard models, autoregressive moving average (ARMA), multiple linear regression (MLR), and autoregressive integrated moving average (ARIMA) are conventional statistical models that are frequently employed for flood frequency analysis (FFA) approaches utilized in flood forecasting (Ab Razak et al. 2018). When physical principle-based models and statistical techniques are compared, statistical approaches are shown to be more efficient in terms of processing cost and generality; also, more components are required to process physically based models. Traditional statistical approaches, on the other hand, are believed to be less accurate in predicting floods and are unsuitable for short-term flood prediction (Mosavi et al. 2018b).

The shortcomings of the previously described physically based and statistical models have stimulated the use of more sophisticated data-driven methods, such as machine learning (ML). One reason for the appeal of ML models is that they can capture flood nonlinearity using only historical data, without requiring an understanding of the underlying physical processes (Toth et al. 2000). Another is their low computing cost: the ML workflow of model training, testing, and assessment is simple to apply and improve (Plötz 2021). According to Mosavi et al. (2017), ML techniques are useful for flood prediction and have been shown to be more accurate than conventional approaches.

ML is a branch of artificial intelligence (AI) that learns regularities and patterns from data, allowing for easier implementation with low computation costs, as well as quick training, validation, testing, and evaluation, with high performance compared to physical models and relatively less complexity (Mekanik et al. 2013). The development of ML techniques over the past two decades has shown that they are suitable for flood forecasting and can outperform conventional approaches at a reasonable computational cost (Dou et al. 2020). Additionally, the literature has several examples of successful trials using ML-based quantitative precipitation forecasting (QPF) for a variety of lead-time forecasts (Jhong et al. 2022). The accuracy of predictions made using ML models was higher than that of conventional statistical models (Singal et al. 2013). Ortiz-García et al. (2014) demonstrated how ML approaches may be used to efficiently predict complicated hydrological systems such as floods. Many ML methods, including KNN, SVM, NB, ANN, and RF, have been shown to be useful for both short-term and long-term flood forecasting (Taherei Ghazvinei et al. 2018). Furthermore, it was demonstrated that the performance of ML might be enhanced by combining it with other ML approaches, soft computing techniques, numerical simulations, and/or physical models (Karniadakis et al. 2021). These applications produced more robust and efficient models capable of learning complicated flood systems in an adaptive way.

Taking all of these ML models into account, we were motivated to develop an ML-based flood early warning system for Pakistan that takes temperature, precipitation, and discharge into account to predict the possibility of flood occurrence before heavy stream flow and rising water levels lead to flooding. The performance of the ML algorithms, namely KNN, SVM, NB, ANN, and RF, is compared in terms of error and accuracy to identify the optimal approach for predicting floods before they occur. The Flood Early Prediction System is based on historical data from 1985 to 2013.

The main objectives of the research are as follows:

  • Collect temperature, precipitation, discharge, and flood occurrence datasets and merge them into a single dataset.

  • Validate the effectiveness of KNN, SVM, NB, ANN, and RF for flood prediction.

  • Conduct a comparative analysis of KNN, SVM, NB, ANN, and RF in terms of accuracy and error for flood prediction.

Study area description

The Indus River Basin, with a total drainage area of about 1.08 × 10⁶ km², is regarded as one of the largest transboundary river basins in the world. Pakistan, India, China, and Afghanistan account for 56, 26, 10, and 6.7% of the total drainage area, respectively (Ahmad et al. 2018), which makes it a geopolitically complex region, as shown in Figure 1. The basin extends between 32.48–37.07°N and 67.33–81.83°E. The upper Indus Basin, which spans an area of 289,000 km², has an average elevation of 3,750 m above sea level, with elevations ranging from 200 to 8,500 m (Garee et al. 2017). Its mountain ranges contain about 11,000 glaciers, making it one of the most glaciated regions in the world, with about 22,000 km² of glacier surface area (Lutz et al. 2016). The study area is regarded as Pakistan's principal supply of fresh water and contributes significantly to the nation's sustained economic growth (Ishaque et al. 2023).
Figure 1

Overview of the study area showing Pakistan Map, Indus River Basin with streams and Delineated Indus River Basin.


Data collection

This research study is based on the following datasets, whose details are given below.

Digital elevation model

Digital elevation model (DEM) data for the study area have been downloaded from the National Aeronautics and Space Administration (NASA) (https://earthdata.nasa.gov/). The resolution of the Global Digital Elevation Model (GDEM) is 30 m × 30 m. DEM data are used for watershed delineation. The delineated watershed is demonstrated in Figure 1.

Hydroclimatic datasets

Meteorological data, which include precipitation and temperature, were collected from Pakistan Meteorological Department (PMD) from 1985 to 2013. Similarly, hydrological data, which include monthly streamflow, were obtained from Water and Power Development Authority (WAPDA) between 1985 and 2013.
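
One way the meteorological and hydrological records could be combined into a single table is to join them on a shared (year, month) key. The sketch below is illustrative only: the function name `merge_monthly`, the field names (`precip_mm`, `temp_c`, `discharge_m3s`, `flood`), and the sample values are hypothetical and are not taken from the PMD or WAPDA files.

```python
def merge_monthly(met_rows, hydro_rows, flood_months):
    """Join meteorological and streamflow records on (year, month).

    met_rows:     iterable of (year, month, precip_mm, temp_c)
    hydro_rows:   iterable of (year, month, discharge_m3s)
    flood_months: set of (year, month) keys labeled as flood events
    Only months present in BOTH sources are kept (an inner join).
    """
    discharge = {(y, m): q for y, m, q in hydro_rows}
    merged = []
    for y, m, p, t in met_rows:
        key = (y, m)
        if key not in discharge:
            continue  # skip months with no streamflow record
        merged.append({
            "year": y, "month": m,
            "precip_mm": p, "temp_c": t,
            "discharge_m3s": discharge[key],
            "flood": "yes" if key in flood_months else "no",
        })
    return merged


# Made-up example values, for illustration only.
met = [(1985, 7, 120.0, 31.5), (1985, 8, 95.0, 30.2), (1985, 9, 20.0, 28.0)]
hydro = [(1985, 7, 5200.0), (1985, 8, 6100.0)]
rows = merge_monthly(met, hydro, flood_months={(1985, 8)})
```

An inner join is used here so that every merged row has a value for each feature; months missing from either source are dropped rather than filled in.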

Methodology

Preprocessing of data

In the process of building a flood risk prediction model utilizing ML techniques such as KNN (K-nearest neighbors), SVMs, NB (Naive Bayes), ANN (artificial neural networks), and RF (random forest), preparing the data correctly is crucial. This preparation phase includes a vital task known as preprocessing, which is necessary for improving the model's accuracy and reliability.

A significant challenge encountered during preprocessing is dealing with imbalanced data. Imbalanced data occur when the categories (or classes) we want to predict are unevenly represented in the dataset. For example, if we're trying to predict flood events, there might be many more years without floods (‘no’) compared to years with floods (‘yes’). This imbalance can skew the model's performance, as it may become overly good at predicting the majority class but poor at detecting the less common, yet crucial, instances. To tackle this problem, the data are resampled as part of the preprocessing. Resampling aims to adjust the dataset so that each class (in this case, ‘yes’ for years with floods and ‘no’ for years without) is more evenly represented. This could mean increasing the instances of the minority class, decreasing the instances of the majority class, or both, to achieve a balance that allows for more effective training of the ML models.

The initial step in this preprocessing was to label the data based on historical records, categorizing each year as either ‘yes’ (a year with a flood event) or ‘no’ (a year without a flood event). This categorization is essential for the models to learn from past data and make predictions about future flood risks.

Following this labeling and resampling to address data imbalance, the preprocessed data are then used to train and evaluate various ML models. Each model (KNN, SVM, NB, ANN, and RF) learns from the balanced dataset in its way, attempting to predict whether a given year will experience a flood. This training process involves learning the patterns and characteristics of years that led to floods versus those that didn't, aiming to accurately forecast future flood events based on this learned knowledge.
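
Resampling can take several forms, and the text does not specify which one was used. As one possibility, the sketch below performs random oversampling of the minority class until both classes are equally represented. The function name `oversample_minority`, the `flood` field, the sample records, and the fixed seed are assumptions made for illustration.

```python
import random


def oversample_minority(rows, label_key="flood", seed=0):
    """Randomly duplicate minority-class rows until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw extra samples with replacement to reach the majority size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical labeled records: 8 'no' and 2 'yes'.
records = [{"id": i, "flood": "no"} for i in range(8)]
records += [{"id": 100 + i, "flood": "yes"} for i in range(2)]
balanced = oversample_minority(records)
counts = {c: sum(r["flood"] == c for r in balanced) for c in ("yes", "no")}
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.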

ML models

Several ML approaches were adopted in this study for flood risk prediction within Pakistan's Indus River Basin because of their capability to capture complex, nonlinear relationships and intricate patterns in input data. ML's ability to learn patterns directly from data reduces the need for manual feature engineering and scales to large datasets. Models such as SVM, RF, NB, ANN, and KNN can capture variations in hydroclimatic variables over time and space. Furthermore, the demonstrated predictive accuracy of RF models, coupled with their adaptability, underscores their role in generating insights for informed mitigation and adaptation strategies. The research methodology is summarized in the flow chart in Figure 2. Multiple ML models were used to predict floods based on precipitation, temperature, and discharge data. Accuracy, precision, recall, F1 score, and MCC were used to evaluate the prediction performance of the models.
Figure 2

System flow diagram.

K-nearest neighbors

KNN is a non-parametric technique used for classification and regression. For classification, a number of neighbors, K, is chosen; the Euclidean distance between the item and each training sample is calculated to identify its K nearest neighbors; and the item is assigned to the class that holds the majority among those neighbors. For regression, the output is instead the mean of the values of the K nearest neighbors.
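
The majority-vote rule can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study; the function name `knn_predict`, the two-feature samples, and the choice K = 3 are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # most_common(1) returns the label with the highest vote count.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical (precipitation, discharge) pairs, already scaled to [0, 1].
X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.8, 0.9), (0.9, 0.85), (0.85, 0.8)]
y = ["no", "no", "no", "yes", "yes", "yes"]
wet_month = knn_predict(X, y, (0.82, 0.88), k=3)
dry_month = knn_predict(X, y, (0.12, 0.18), k=3)
```

Because Euclidean distance is sensitive to scale, features measured in different units are usually rescaled to a common range before KNN is applied.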

Support vector machine

SVM is a supervised learning algorithm used in flood modeling. It builds models based on statistical learning theory and the structural risk minimization principle. SVMs are reliable and efficient algorithms for flood prediction, offering high generalization and efficiency. They are suitable for both linear and nonlinear classification and have been applied successfully to various flood prediction cases. SVMs are widely recognized and employed by hydrologists in flood prediction due to their performance and generalization ability.

Naive Bayes

The NB classifier is based on Bayes' theorem and assumes independence between predictors. It is simple to construct and performs well, especially with large datasets. Despite its simplicity, it often outperforms more advanced classification algorithms, making it widely used. By applying Bayes' theorem, we can calculate the probability of an event based on the probability of another event that has already occurred.
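
For the flood setting, the rule can be written out explicitly. The notation is introduced here for illustration only: $y \in \{\text{yes}, \text{no}\}$ is the flood label, and $x_1$, $x_2$, $x_3$ stand for precipitation, temperature, and discharge. Under the independence assumption, the posterior probability and the predicted class are

$$
P(y \mid x_1, x_2, x_3) = \frac{P(y)\,\prod_{i=1}^{3} P(x_i \mid y)}{P(x_1, x_2, x_3)},
\qquad
\hat{y} = \arg\max_{y}\; P(y) \prod_{i=1}^{3} P(x_i \mid y).
$$

The denominator is the same for both classes, so it can be dropped when choosing the more probable label.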

Artificial neural network

ANNs simulate biological neural networks using interconnected neuron units. They are popular and effective ML algorithms known for their adaptability and ability to model complex flood processes. ANNs analyze historical data and are considered reliable tools for developing black-box models of rainfall-flood interactions, river flow forecasting, and discharge prediction. They outperform traditional statistical models in terms of accuracy. ANNs provide precise approximations and fault tolerance, making them suitable for modeling complex and nonlinear relationships.

Random forest

RF is an ensemble learning method that combines multiple decision trees, each trained on a random subset of the data, to improve classification results. In hydrology, RF has been used to develop flood hazard risk models and provide advanced flood warnings to users. It has proven effective in modeling flood forecasts and enhancing flood prediction accuracy.

Metrics for model comparison

Each prediction model has its own method for checking and assessing its performance. The evaluation is done to check whether there is any similarity or consistency between the observed outcomes and the expected results, or between the predicted results of multiple models. For this study, accuracy, precision, recall, F-score, and MCC were used as evaluation metrics because they are more widely used by researchers, including those working on flood risk prediction, than alternatives such as root mean square error and model construction time. All of these metrics can be calculated from a confusion matrix, as illustrated in Table 1.

Table 1

Confusion matrix

| | No (prediction) | Yes (prediction) |
|---|---|---|
| No (actual) | True negative (TN) | False positive (FP) |
| Yes (actual) | False negative (FN) | True positive (TP) |

The columns represent the prediction class and the rows show the actual class target. The flood outcome is represented with the label YES, and no-flood is represented with the label NO. Therefore, diagonal elements (TN, TP) in Table 1 show the true predictions and the other elements (FN, FP) reflect the false predictions. For example, there are two outcomes in the flood prediction, which are flood and no-flood. True positive (TP) means correct flood result prediction and true negative (TN) means correct no-flood result prediction while false positive (FP) means incorrect flood result prediction and false negative (FN) means incorrect no-flood result prediction. If a target class is predicted as flood (YES) even though it is a no-flood (NO) target class, this test result is added to the FP in the table. Thus, accuracy in the confusion matrix is defined as in the following equation:
$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}
$$
Precision, also known as positive predictive value, is a metric that quantifies the accuracy of the positive predictions made by a model. It is calculated by dividing the number of correctly classified positive samples by the total number of samples predicted as positive. Precision indicates how many of the predicted positives are TP instances and is a crucial metric for evaluating the reliability of a classification model. It is particularly important in scenarios where the cost of false positives is high, such as in spam email filtering or fraud detection.
$$
\text{Precision} = \frac{TP}{TP + FP} \tag{2}
$$
Recall, also referred to as sensitivity, is a metric that measures the ability of a model to correctly identify positive samples. It is calculated by dividing the number of correctly classified positive samples by the total number of actual positive samples. Recall provides insights into the model's ability to capture TP instances and is a valuable metric for evaluating the performance of a classification model, especially in situations where identifying positive cases is of high importance, such as in medical diagnostics or anomaly detection.
$$
\text{Recall} = \frac{TP}{TP + FN} \tag{3}
$$
The F-measure, also known as the F1 score or F score, is a metric that combines the precision and recall of a model. It is calculated as the harmonic mean of precision and recall. The F-measure provides a single value that represents the balance between precision and recall, making it a useful metric for evaluating the performance of a classification model.
$$
F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4}
$$
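Matthews correlation coefficient (MCC) is a metric that takes into account all four cells of the confusion matrix. It provides a score between −1 and +1, where +1 indicates perfect prediction, 0 indicates performance no better than random guessing, and −1 indicates total disagreement between predictions and observations. MCC is easy to interpret, making it a useful metric for evaluation.
$$
\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{5}
$$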
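
The five metrics in this section can all be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions (precision = TP/(TP + FP), recall = TP/(TP + FN), F1 as the harmonic mean of the two). The function name `classification_metrics` and the example counts are hypothetical; the counts are not results from this study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, F1, and MCC from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Made-up counts for illustration; these are not the study's results.
m = classification_metrics(tp=30, tn=50, fp=10, fn=10)
```

With these counts, accuracy is 0.80 while MCC is about 0.58, which shows how MCC can give a more conservative picture than accuracy when the classes are not equally frequent.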

Performance of ML models

To evaluate prediction accuracy on the preprocessed dataset, five ML techniques (KNN, SVM, NB, ANN, and RF) were employed. The prediction accuracy and other relevant metrics for each method are presented in Tables 2–6. These tables allow for a comparative analysis of the effectiveness of each technique in predicting flood occurrences. The metrics reported are accuracy, precision, recall, F1 score, and MCC (Syed et al. 2021).

Table 2

KNN performance metrics

| Detailed accuracy terms | Value |
|---|---|
| Accuracy | 79.80% |
| Precision | 0.70 |
| Recall | 0.69 |
| F1 score | 0.62 |
| MCC | 0.58 |

Table 3

SVM performance metrics

| Detailed accuracy terms | Value |
|---|---|
| Accuracy | 82.40% |
| Precision | 0.80 |
| Recall | 0.72 |
| F1 score | 0.68 |
| MCC | 0.59 |

Table 4

NB performance metrics

| Detailed accuracy terms | Value |
|---|---|
| Accuracy | 81.30% |
| Precision | 0.79 |
| Recall | 0.71 |
| F1 score | 0.68 |
| MCC | 0.66 |

Table 5

ANN performance metrics

| Detailed accuracy terms | Value |
|---|---|
| Accuracy | 80.10% |
| Precision | 0.70 |
| Recall | 0.70 |
| F1 score | 0.60 |
| MCC | 0.53 |

Table 6

RF performance metrics

| Detailed accuracy terms | Value |
|---|---|
| Accuracy | 81.40% |
| Precision | 0.718 |
| Recall | 0.724 |
| F1 score | 0.721 |
| MCC | 0.657 |

Across the five algorithms (KNN, SVM, NB, ANN, and RF), accuracy ranged from 79.80 to 82.40% and recall from 0.69 to 0.724, meaning that roughly seven in ten actual flood cases were identified. The MCC metric, which ranges between −1 and +1, serves as an indicator of the quality of binary classification: +1 corresponds to perfect classification and 0 to performance no better than chance. The MCC values obtained here (0.53 to 0.66) therefore indicate a moderate positive agreement between predictions and observations. The F1 scores, which combine precision and recall, lie between 0.60 and 0.721 and point to the same conclusion (Chicco & Jurman 2020). These findings indicate that the evaluated algorithms have useful predictive capability and provide insights into their effectiveness in flood prediction tasks.

The results from Tables 2–6 demonstrate that SVM achieves the highest accuracy among the analyzed algorithms, with a value of 82.40%. RF follows closely in second place with an accuracy of 81.40%, while NB, ANN, and KNN exhibit accuracies of 81.30, 80.10, and 79.80%, respectively. These findings indicate that SVM performs the best in accurately predicting flood occurrences in the study area. The study on predicting flood discharge using a hybrid particle swarm optimization (PSO)-SVM algorithm conducted by Samantaray et al. (2023) also demonstrates that SVM performs better than the other models. RF also shows strong performance, albeit slightly lower than SVM. Lee et al. (2017) conducted a study on predicting flood susceptibility using RF. The results showed that the RF model achieved validation accuracies of 78.78% and 79.18% for the regression and classification algorithms, respectively. Another study conducted by Aldiansyah & Wardani (2023) found that cross-validation of RF exhibited strong performance in predicting flood susceptibility, with an area under the curve (AUC) of 0.99, a correlation of 0.97, a true skill statistic (TSS) of 0.90, and a deviance of 0.05. The results provide valuable insights for decision-making and highlight the potential of SVM and RF as effective algorithms for flood prediction in the Indus River Basin.

Table 7 presents the confusion matrix for KNN, SVM, NB, ANN, and RF models, focusing on the target classes of ‘no flood’ (NO) and ‘flood’ (YES). The confusion matrix provides a detailed breakdown of the performance of each model in terms of TP, TN, FP, and FN predictions. This matrix allows for a comprehensive assessment of how well each model classifies instances into the respective flood and no flood categories. By examining the values in the confusion matrix, researchers can gain insights into the strengths and weaknesses of each model's predictions, aiding in the understanding and comparison of their performance in flood prediction tasks.

Table 7

Confusion matrix for ML models

| Model | Actual class | No (prediction) % | Yes (prediction) % |
|---|---|---|---|
| KNN | No | 72.1 | 69.2 |
| KNN | Yes | 27.9 | 30.8 |
| SVM | No | 72.1 | |
| SVM | Yes | 27.9 | 100 |
| NB | No | 76.8 | 50 |
| NB | Yes | 23.2 | 50 |
| ANN | No | 71.4 | 66.7 |
| ANN | Yes | 28.6 | 33.3 |
| RF | No | 79.4 | 50 |
| RF | Yes | 20.6 | 50 |

In Figure 3, the distribution diagram provides a visual representation of the predicted outcomes of various ML models for flood prediction in the Indus River Basin, Pakistan. The diagram showcases the performance of each model in terms of their predicted flood occurrences. Figure 3 illustrates the distinct advantage of the SVM model. It shows that SVM predicts flood occurrences with a higher level of accuracy compared to other ML models.
Figure 3

Distribution diagrams for ML models.

Figure 4 presents violin plots, which visualize the distribution of performance across the ML models. A violin plot combines the characteristics of a box plot and a kernel density plot to show the distribution of performance metrics in a compact manner. Here it compares the performance of the different ML models in flood prediction for the Indus River Basin. Each model is represented by a ‘violin’ shape, which is essentially a mirrored density plot on either side of a central box plot. The width of the violin corresponds to the density of performance scores, with wider areas indicating higher density. The height of the violin represents the range of performance scores, with taller violins indicating a broader range of scores. The box inside the violin typically shows the interquartile range (IQR), median, and possibly other statistical measures. In Figure 4, the violin corresponding to the SVM model is comparatively wider and taller than the other violins, indicating a higher density and a broader range of performance scores. This further reinforces the finding that SVM outperforms the other ML models in flood prediction for the Indus River Basin.
Figure 4

Violin plots showing the distribution of ML models.


This study delves into flood risk prediction within Pakistan's Indus River Basin, employing various ML techniques, including SVMs, RF, NB, ANN, and KNN. The overarching goal is to craft predictive models conducive to flood risk anticipation. The dataset encompasses four crucial features: precipitation, temperature, monthly discharge, and flood occurrence. The results implied that SVM achieves the highest accuracy among the analyzed algorithms, with a value of 82.40%. RF follows closely in second place with an accuracy of 81.40%, while NB, ANN, and KNN exhibit accuracies of 81.30, 80.10, and 79.80%, respectively. These findings indicate that SVM performs the best in accurately predicting flood occurrences in the study area. Such findings hold significant promise for aiding both governmental and non-governmental entities in fortifying preventive measures against the prevalent threat of flooding in Pakistan.

However, it's imperative to acknowledge the study's limitations, chiefly stemming from the validation of ML models solely with data spanning from 1985 to 2013. While the possibility of utilizing more recent data holds promise for bolstering accuracy, this constraint is not deemed substantial.

The study recommends avenues for future exploration. Integrating topographical factors and additional hydrological features, namely floodwater level, stands out as a potential enhancement to flood estimation precision. Furthermore, using more recent data could improve the accuracy of flood prediction. Future research might also incorporate supplementary variables and conduct uncertainty analyses to refine prediction precision. Ultimately, ML models emerge as valuable tools for attaining precise streamflow predictions and deepening our comprehension of hydrological processes, facilitating more effective flood mitigation strategies.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Ab Razak N., Aris A., Ramli M., Looi L. & Juahir H. 2018 Temporal flood incidence forecasting for Segamat River (Malaysia) using autoregressive integrated moving average modelling. Journal of Flood Risk Management 11, S794–S804.

Ahmad I., Zhang F., Tayyab M., Anjum M. N., Zaman M., Liu J., Farid H. U. & Saddique Q. 2018 Spatiotemporal analysis of precipitation variability in annual, seasonal and extreme values over upper Indus River basin. Atmospheric Research 213, 346–360.

Aldiansyah S. & Wardani F. 2023 Evaluation of flood susceptibility prediction based on a resampling method using machine learning. Journal of Water and Climate Change 14 (3), 937–961. doi:10.2166/wcc.2023.494.

Aljohani A., Alkhodre A., Abi Sen A., Rama M., Alzahrani B. & Siddiqui M. S. 2023 Flood prediction using hydrologic and ML-based modeling: A systematic review. International Journal of Advanced Computer Science and Applications 14. doi:10.14569/IJACSA.2023.0141155.

Fernández-Pato J., Caviedes-Voullième D. & García-Navarro P. 2016 Rainfall/runoff simulation with 2D full shallow water equations: Sensitivity analysis and calibration of infiltration parameters. Journal of Hydrology 536, 496–513. https://doi.org/10.1016/j.jhydrol.2016.03.021.

Ishaque W., Mukhtar M. & Tanvir R. 2023 Pakistan's water resource management: Ensuring water security for sustainable development. Frontiers in Environmental Science 11. doi:10.3389/fenvs.2023.1096747.

Jhong B.-C., Lin C.-Y., Jhong Y.-D., Chang H.-K., Chu J.-L. & Fang H.-T. 2022 Assessing the effective spatial characteristics of input features through physics-informed machine learning models in inundation forecasting during typhoons. Hydrological Sciences Journal 67 (10), 1527–1545.

Karniadakis G. E., Kevrekidis I. G., Lu L., Perdikaris P., Wang S. & Yang L. 2021 Physics-informed machine learning. Nature Reviews Physics 3 (6), 422–440.

Lee S., Kim J.-C., Jung H.-S., Lee M. J. & Lee S. 2017 Spatial prediction of flood susceptibility using random-forest and boosted-tree models in Seoul metropolitan city, Korea. Geomatics, Natural Hazards and Risk 8 (2), 1185–1203. doi:10.1080/19475705.2017.1308971.

Lutz A. F., Immerzeel W. W., Kraaijenbrink P. D. A., Shrestha A. B. & Bierkens M. F. P. 2016 Climate change impacts on the upper Indus hydrology: Sources, shifts and extremes. PLoS ONE 11 (11), e0165630. doi:10.1371/journal.pone.0165630.

Manomy K. V. 2020 Flood prediction and tracking trapped. International Journal of Engineering Research and, V9. doi:10.17577/IJERTV9IS060375.

Maspo N.-A., Harun A. N. B., Goto M., Cheros F., Haron N. A. & Nawi M. N. M. 2020 Evaluation of machine learning approach in flood prediction scenarios and its input parameters: A systematic review. In: Paper Presented at the IOP Conference Series: Earth and Environmental Science.

Mosavi A., Rabczuk T. & Varkonyi-Koczy A. R. 2017 Reviewing the novel machine learning tools for materials design. In: Paper Presented at the International Conference on Global Research and Education.

Mosavi A., Bathla Y. & Varkonyi-Koczy A. 2018a Predicting the future using web knowledge: State of the art survey. In: Paper Presented at the Recent Advances in Technology Research and Education: Proceedings of the 16th International Conference on Global Research and Education Inter-Academia 2017, 16.

Samantaray S., Sahoo A. & Agnihotri A. 2023 Prediction of flood discharge using hybrid PSO-SVM algorithm in Barak River Basin. MethodsX 10, 102060. https://doi.org/10.1016/j.mex.2023.102060.

Serra-Llobet A., Tàbara J. D. & Sauri D. 2013 The Tous dam disaster of 1982 and the origins of integrated flood risk management in Spain. Natural Hazards 65, 1981–1998.

Singal A. G., Mukherjee A., Elmunzer B. J., Higgins P. D., Lok A. S., Zhu J., Marrero J. A. & Waljee A. K. 2013 Machine learning algorithms outperform conventional regression models in predicting development of hepatocellular carcinoma. The American Journal of Gastroenterology 108 (11), 1723.

Syed S., Morseth B., Hopstock L. & Horsch A. 2021 A novel algorithm to detect non-wear time from raw accelerometer data using deep convolutional neural networks. Scientific Reports 11. doi:10.1038/s41598-021-87757-z.

Taherei Ghazvinei P., Hassanpour Darvishi H., Mosavi A., Yusof K. B. W., Alizamir M., Shamshirband S. & Chau K. W. 2018 Sugarcane growth prediction based on meteorological parameters using extreme learning machine and artificial neural network. Engineering Applications of Computational Fluid Mechanics 12 (1), 738–749.

Toth E., Brath A. & Montanari A. 2000 Comparison of short-term rainfall prediction models for real-time flood forecasting. Journal of Hydrology 239 (1–4), 132–147.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).