ABSTRACT
Increased frequency and magnitude of flooding pose a significant natural hazard to urban areas worldwide. Mapping flood hazard areas is crucial for mitigating potential damage to human life and property. However, conventional hydrodynamic approaches are hindered by their extensive data requirements and computational expenses. As an alternative solution, this paper explores the use of machine learning (ML) techniques to map flood hazards based on readily available geo-environmental variables. We employed various ML classifiers, including decision tree (DT), random forest (RF), XGBoost (XGB), and k-nearest neighbor (kNN), to assess their performance in flood hazard mapping. Model evaluation was conducted using the area under the receiver operating characteristic curve (AUC) and root mean square error (RMSE). Our results demonstrated promising outcomes, with AUC values of 93% (DT), 97% (RF), 98% (XGB), and 91% (kNN) for the validation dataset. RF and XGB performed slightly better than DT and kNN, and distance to river was the most important factor. The study highlights the potential of ML for urban flood modeling, offering reasonable accuracy and supporting early warning systems. By leveraging available geo-environmental variables, ML techniques provide valuable insights into flood hazard mapping, aiding in effective urban planning and disaster management strategies.
HIGHLIGHTS
ML techniques are feasible for flood mapping in developing countries where the data needed for hydrologic and hydraulic modeling are difficult to obtain.
Distance to rivers, elevation, and distance to drainage emerge as crucial variables for accurate flood inundation modeling in urban areas.
Geo-environmental data analysis enables accurate flood hazard assessments, aiding in effective urban planning and mitigation strategies for flood-prone regions.
INTRODUCTION
In urban areas, pluvial flooding is often caused by inadequate drainage capacity, particularly during high-intensity rainfall events when the underground or surface drainage network becomes overwhelmed and water levels rise above the top of the drains, causing surface flooding (Mark et al. 2004). These floods are typically associated with high-intensity rainfall events (over 30 mm/h), but can also occur with lower-intensity rainfall in areas that are saturated, urbanized or have low permeability (Falconer et al. 2009).
The main factors enabling flooding are physiographic (natural and environmental), while socio-economic factors such as lack of awareness, absence of vital enabling institutions, and sporadic ongoing operation and maintenance schemes exacerbate the problem. The lack of flood control mechanisms and maintenance schemes also contributes to ineffective drainage systems and high flood vulnerability in densely populated areas.
Due to the effects of climate change and urbanization, such as increased runoff and peak flows, the current drainage systems will not be sufficient in the future (Mailhot & Duchesne 2010). These issues are often worsened by the city's physical characteristics, inadequate drainage infrastructure and management, and lack of awareness among residents (Mark et al. 2004). Generally, flooding in urban areas causes diverse problems, including disruption of economic activity, loss of life, loss of property, and public health concerns. If future floods are not properly estimated and managed, they could cause significant damage to city infrastructure, including roads, transportation systems, and socio-economic sectors. This is particularly true for cities like Bahir Dar, where roads are not designed to serve as major drainage systems.
To address this issue, there is a need to manage additional runoff, improve the drainage network's conveyance capacity, and develop flood control and disaster prevention measures. Flood hazard assessments are crucial to understand flood risk and to take proactive measures that will protect urban communities from potential flood damage (Cengiz & Ercanoglu 2022). These assessments provide a foundation for developing flood risk management plans. Hydrodynamic modeling, the conventional method of flood hazard modeling and assessment, is still widely regarded as one of the most effective methods for flood hazard analysis and flood risk forecasting, as it can accurately depict the spatial and temporal variations of floods and their probabilities of occurrence. However, the accuracy of hydrodynamic models relies on detailed hydraulic information and time series flow data to calibrate model parameters (Fenicia et al. 2008), which are often lacking, particularly in urban catchments where data on drainage networks and accurate topographic information are scarce. Because hydrodynamic models are based on numerical simulations, inaccurate input data can lead to error propagation, which severely compromises the usefulness of the modeling results. Model complexity and inherent model uncertainties are additional shortcomings that affect both the results and their effective communication to decision-makers and the wider public. Furthermore, hydrodynamic models that depend on numerical solutions are often time-consuming and poorly suited to real-time forecasting. Consequently, hydrological and hydraulic modeling may not be suitable for assessing and mapping flood hazards in urban areas (Bui et al. 2016), especially in data-scarce regions and in places where computational facilities are not well established.
In contrast, machine learning (ML) algorithms combined with geospatial data have proven versatile and applicable to urban flood hazard assessment in diverse contexts. This approach is especially beneficial when there is limited data available for hydrodynamic modeling. Numerous studies have successfully employed ML in flood analysis and modeling, including random forest classification (Lee et al. 2017), support vector machine (Pham et al. 2018), decision trees (Rahmati et al. 2019), artificial neural networks (Kia et al. 2012), logistic regression (Gholamnia et al. 2020), naive Bayes and naive Bayes tree (Kalantar et al. 2018), deep learning neural networks (DLNNs) (Bui et al. 2020), and linear discriminant analysis (LDA) (Dereli et al. 2021). These ML methods have been widely used for spatial prediction, vulnerability assessment, and mapping of natural hazard susceptibility due to their high accuracy and applicability in data-scarce environments, particularly in developing countries where data availability remains a significant challenge. Yet, the accuracy of ML approaches depends on data availability, data representation, and data sampling frequency, necessitating further investigation of their application in flood studies across different environmental settings.
Therefore, this research aims to test the ability of selected ML algorithms to model flood hazards as an alternative to hydrodynamic models, which require extensive data for parameterization and are often computationally expensive. Thus, the primary objective is to identify flood hazard areas within the urban catchment using ML techniques. The specific objectives include evaluating the accuracy of the decision tree (DT), random forest (RF), XGBoost classifier (XGB), and k-nearest neighbor (kNN) models in assessing urban flood hazards, comparing the flood hazard mapping performance of these models, and determining the significance of different geo-environmental causal factors contributing to urban flooding. Bahir Dar City was used as a case study because it is a rapidly growing city in the Amhara region in Ethiopia that faces a significant risk of frequent flooding disasters. A survey of the city's drainage capacity found that the existing infrastructure is inadequate and frequently clogged with waste, causing flooding even during regular rainfall events, highlighting that it cannot accommodate increased runoff due to climate change and urban expansion (Derseh et al. 2023). Addressing the problem requires wider flood risk management options that range from managing the drainage system to creating public awareness of flood risk management.
MATERIALS AND METHODS
Study area
Bahir Dar, the capital of the Amhara Region in northern Ethiopia, is a populous city situated on the southern shore of Lake Tana. With its average elevation of 1,802 m above mean sea level (amsl), the city encompasses a vast area of approximately 6,500 hectares, excluding the satellite towns under the jurisdiction of the Bahir Dar City Administration.
Meteorological data from the Ethiopian Meteorological Agency reveal that Bahir Dar City has a temperate climate with an annual average minimum temperature of 12.6 °C and an annual average maximum temperature of 27.5 °C. It receives an average annual rainfall of around 1,416 mm, with a significant portion, approximately 60%, occurring during the months of July and August. The annual average evaporation depth in the city is 1,709 mm, with the evaporation rate being highly sensitive to temperature fluctuations, ranging from a minimum of 1.72 mm/day during the rainy season to a maximum of 9.41 mm/day in the dry season (Dessie et al. 2015).
The city's topography is generally characterized by a flat slope of less than 2% in the area near the city center, and in some locations even below the nationally recommended 0.5% open channel slope. Two hills, Dibankie in the west and Abay-Mado in the east, drain surface flow toward the urban area. Lake Tana lies to the north, while the Abay River flood plains traverse the city from north to south, shaping the landscape of the southern suburban region. Groundwater flow from the south and east, originating from an underlying basaltic aquifer that extends up to 70 km southwest to the Sekela Mountains, also plays a significant role in providing water supply for the city (Nigate et al. 2016).
ML modeling approach
A database of 464 flood event samples collected from the city was utilized to generate the datasets for the modeling process. Different ratios (i.e., 10/90, 20/80, 30/70, 40/60, 50/50, 60/40, 70/30, 80/20, and 90/10) were used to divide the datasets into training and testing datasets for the performance assessment of the models. The statistical indicator ‘accuracy’ was employed to evaluate the predictive capability of the models under differing training and testing ratios. The 5-fold cross-validation technique was applied to train the selected models (Bengio & Grandvalet 2003).
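The splitting scheme described above can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' code: the function names, the random seed, and the reading of each ratio as training/testing shares are assumptions, and only the sample count (464), the nine ratios, and the five folds come from the text.

```python
import random

def split_by_ratio(n_samples, train_fraction, seed=0):
    """Shuffle sample indices and split them into training and testing lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]

def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k roughly equal folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

N_SAMPLES = 464  # flood event samples reported in the text

# Sizes of the training and testing sets for each ratio (assumed train/test order).
for pct in range(10, 100, 10):
    train_idx, test_idx = split_by_ratio(N_SAMPLES, pct / 100)
    print(f"{pct}/{100 - pct}: {len(train_idx)} train, {len(test_idx)} test")

# 5-fold cross-validation on the training part of a 70/30 split.
train_idx, test_idx = split_by_ratio(N_SAMPLES, 0.7)
cv_folds = list(k_fold_indices(train_idx, k=5))
```

Each model would then be fitted on the `train` part of every fold and scored on the matching `validation` part, with the held-out `test_idx` reserved for the final assessment.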
Factors controlling flood inundation
The selection of flood conditioning factors has a direct impact on the accuracy of mathematical models. The variables selected for this study were: elevation, slope, profile curvature, plan curvature, aspect, stream power index (SPI), distance to the river, distance to drainage system, terrain ruggedness index (TRI), topographic position index (TPI), land use, % imperviousness, and curve number, all of which are shown in Figure 4. Because of the relatively low spatial variation in rainfall within the small study area (Wang et al. 2019), and the availability of only two rain gauges in Bahir Dar, the study did not include rainfall as a predictor variable. In the following sections, we describe in detail the nature of these 13 factors in relation to Bahir Dar City and how they control the occurrence of inundation.
Elevation (Elv.)
Elevation plays a crucial role in flood dynamics. Lower elevations generally result in flatter terrains, allowing streams and rivers to carry more water. Additionally, elevation can impact rainfall characteristics and vegetation cover, which in turn affect infiltration (Thompson et al. 2010). In our study, a digital elevation model (DEM) with 30-meter spatial resolution was utilized. The elevation in Bahir Dar City ranged from 1,648 to 2,247 m (Figure 4(a)). The city is situated at a relatively lower elevation compared with the surrounding hills, enabling surface flow inward from the outskirts.
Slope (S)
Slope plays a crucial role in determining flow direction and inundation patterns, with gentle or flat slopes being prone to floods and water-logging, while steep slopes generate higher velocities, facilitating faster runoff (Darabi et al. 2020). Slope data for this study were obtained from analysis in the ArcGIS environment using the DEM. The city, especially the downtown area, exhibits predominantly gentle slopes, with an average slope of less than 4% and even some locations below the locally recommended open channel slope of 0.5% (see Figure 4(c)).
Profile curvature (PC1)
Profile curvature measures how the slope changes along the direction of the steepest gradient (the opposite of the aspect) at each grid node (Yilmaz et al. 2012). Profile curvature grid files generate contour maps with isolines of equal slope change rate over the surface. Profile curvature is negative for convex flow profiles where the slope increases downhill and positive for concave flow profiles where the slope decreases downhill (see Figure 4(d)).
Plan curvature (PC2)
Plan curvature measures the contour curvature and reflects the rate of change of the terrain aspect angle in the horizontal plane, indicating either divergent or convergent water flow. It describes the local topographic morphometry and slope inclination changes (Wilson & Gallant 2000). Numerous studies have recognized curvature as a significant factor influencing flood dynamics. Curvature values can range from morphologically flat or zero curvature (−0.05 to 0.05) to convex or positive curvature (>0.05) and concave or negative curvature (<−0.05). These values influence the acceleration of runoff (Figure 4(b)), where convex slopes promote overland flow and may impact infiltration and soil saturation, while concave slopes decelerate overland flow and potentially enhance infiltration. Moreover, curvature influences the convergence or divergence of water during downslope flow, with concave curvature retaining more water and consequently increasing the risk of flooding.
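The three curvature classes quoted above can be expressed as a small helper. This is only a sketch: the function name is hypothetical, and the ±0.05 limit and the sign convention (positive is convex, negative is concave) are taken from the paragraph above.

```python
def classify_curvature(value, flat_limit=0.05):
    """Label a plan curvature value using the thresholds quoted in the text."""
    if value > flat_limit:
        return "convex"    # positive curvature
    if value < -flat_limit:
        return "concave"   # negative curvature
    return "flat"          # between -0.05 and 0.05

for sample in (-0.30, 0.00, 0.12):
    print(sample, classify_curvature(sample))
```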
Stream power index (SPI)
Topographic position index (TPI)
Terrain ruggedness index (TRI)
The TRI, as introduced by Riley et al. (1999), was employed in this research to quantify the variation in elevation between neighboring cells in a DEM. The TRI serves as an indicator of surface roughness, incorporating both natural features such as vegetation and man-made structures like buildings, which contribute to hydrodynamic friction. Regions with a flat topography are assigned a TRI value of zero, while mountainous areas with sharp ridges exhibit higher positive TRI values. The TRI values ranged from 0 to 4,463, as illustrated in Figure 4(g).
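As an illustration of the idea, the sketch below computes a TRI value for each interior cell of a small DEM as the square root of the summed squared elevation differences between the cell and its eight neighbors. This formulation is a common reading of Riley et al. (1999) and is an assumption here; the text does not state which implementation or software was used, and the toy DEM values are hypothetical.

```python
import math

def terrain_ruggedness_index(dem):
    """Return a TRI grid; border cells without eight neighbors are left as None."""
    rows, cols = len(dem), len(dem[0])
    tri = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = dem[r][c]
            squared_sum = sum(
                (dem[r + dr][c + dc] - center) ** 2
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            tri[r][c] = math.sqrt(squared_sum)
    return tri

# Hypothetical 3 x 3 elevation windows in metres.
flat_dem = [[1800, 1800, 1800], [1800, 1800, 1800], [1800, 1800, 1800]]
pit_dem = [[1801, 1801, 1801], [1801, 1800, 1801], [1801, 1801, 1801]]
print(terrain_ruggedness_index(flat_dem)[1][1])  # flat terrain gives zero
print(terrain_ruggedness_index(pit_dem)[1][1])
```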
Distance to river (DR)
In flood modeling, the distance to the river has emerged as a crucial factor due to the elevated vulnerability of riverbanks and flood plains to inundation, as highlighted by Predick & Turner (2008). In this study conducted in Bahir Dar, the Euclidean distances to several rivers/streams, namely Channel A, Zenjero Wonz, Chimbil, Yegind, Amora Wonz (referred to locally as Angodgud), and Ayer Tena, were calculated as continuous values ranging from 0 to 4,998 m across the study area (see Figure 4(h)). These stream systems act as the primary drainage channels for stormwater originating from urban catchments before eventually flowing into the Abay River or Lake Tana.
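A Euclidean distance raster of this kind can be sketched with a brute-force search over river cells. The function name, the toy river mask, and the 30 m cell size (borrowed from the DEM resolution mentioned earlier) are assumptions for illustration; a GIS tool would normally be used for a full-size grid.

```python
import math

def distance_to_river(river_mask, cell_size=30.0):
    """Distance in metres from every cell centre to the nearest river cell."""
    river_cells = [
        (r, c)
        for r, row in enumerate(river_mask)
        for c, is_river in enumerate(row)
        if is_river
    ]
    if not river_cells:
        raise ValueError("river_mask contains no river cells")
    return [
        [
            cell_size * min(math.hypot(r - rr, c - rc) for rr, rc in river_cells)
            for c in range(len(row))
        ]
        for r, row in enumerate(river_mask)
    ]

# Hypothetical mask: 1 marks a river cell, 0 marks any other cell.
mask = [
    [1, 0, 0],
    [0, 0, 0],
]
dist = distance_to_river(mask)
print(dist)
```

The brute-force search scales with the number of cells times the number of river cells, so it is only meant to show the definition of the variable.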
Distance to drainage system (DD)
The susceptibility of certain areas to flooding during intense rainfall events in urban settings is heightened when they are located far away from drainage systems, as emphasized by Tehrany et al. (2015). In this study, a map depicting the distances to urban drainage systems was generated, representing a continuous range of values from 0 to 5,633 meters (refer to Figure 4(i)).
Land use land cover (LULC)
The extent of runoff during precipitation events varies significantly depending on the land use and land cover patterns. Built-up areas generate more surface runoff than vegetated areas (Tehrany et al. 2015). Consequently, the land use composition plays a crucial role in determining the level of flood risk. In this study, eight distinct land use classes were used (Derseh et al. 2023). These classes included public, residential, industrial, commercial, agricultural, park/recreational, and streets/traffic, as illustrated in Figure 4(k).
Curve number (CNII)
Theoretically, CNII values range from 0 to 100, with lower values indicating a lower potential for runoff and higher values indicating a higher potential for runoff (Zhao et al. 2018). In this study, a CN map was generated for the study area using the ArcCN-runoff extension within the ArcGIS environment. This map was developed based on data layers related to antecedent soil moisture, hydrologic soil group, and land use. The resulting map, presented in Figure 4(j), provides a visual representation of the spatial distribution of the runoff potential across the study area.
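To show how a curve number translates into runoff potential, the sketch below applies the standard SCS curve number relation in metric units, with potential retention S = 25400/CN − 254 (mm) and an initial abstraction of 0.2 S. The relation is general hydrology background rather than something stated in this study, the 0.2 ratio is an assumed default, and the function name and rainfall depth are hypothetical.

```python
def runoff_depth_mm(rain_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from a rainfall depth using the SCS-CN relation."""
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in the interval (0, 100]")
    retention = 25400.0 / curve_number - 254.0   # potential maximum retention S, mm
    initial_abstraction = ia_ratio * retention   # losses before runoff starts
    if rain_mm <= initial_abstraction:
        return 0.0
    excess = rain_mm - initial_abstraction
    return excess ** 2 / (excess + retention)

# Hypothetical 50 mm storm over surfaces with different curve numbers.
for cn in (60, 80, 98):
    print(cn, round(runoff_depth_mm(50.0, cn), 1))
```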
Percent imperviousness (%Imp.)
The process of urbanization typically results in the expansion of impervious surfaces such as buildings, roads, and pavements, which hinder water infiltration and amplify runoff. However, a growing concern in Bahir Dar City is the recent trend of converting previously green backyards into asphalt paved surfaces to facilitate pedestrian movement. This transformation contributes to elevated flood peaks and increased volumes in streams and rivers, while simultaneously reducing groundwater recharge and base flow. Moreover, the extent and distribution of impervious surfaces within a catchment can vary, thereby influencing the hydrologic response and flood risk (Feng et al. 2021). The percent imperviousness of the study area is represented in Figure 4(l).
Aspect (A)
Aspect is the orientation of the slope, measured clockwise in degrees from 0 to 360, where 0 is facing north, 90 is facing east, 180 is facing south, and 270 is facing west. It affects hydrologic processes and soil moisture, and previous studies have shown that it can have an indirect influence on flooding (Haghizadeh et al. 2017). In this study, we divided slopes into five aspect classes of flat, north-facing, east-facing, south-facing, and west-facing (Figure 4(m)).
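A possible mapping from aspect angle to the five classes is sketched below. The class boundaries are assumptions, since the text does not give them: each direction is taken as a 90° sector centred on its cardinal angle, and a negative value is treated as flat, following the common GIS convention of coding flat cells as −1.

```python
def classify_aspect(aspect_deg):
    """Map an aspect angle in degrees to one of five classes (assumed sectors)."""
    if aspect_deg < 0:
        return "flat"              # assumed code for cells with no slope direction
    angle = aspect_deg % 360
    if angle >= 315 or angle < 45:
        return "north-facing"
    if angle < 135:
        return "east-facing"
    if angle < 225:
        return "south-facing"
    return "west-facing"

for angle in (-1, 0, 90, 180, 270, 350):
    print(angle, classify_aspect(angle))
```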
URBAN FLOOD HAZARD ML MODELS
Recent advancements in ML models have significantly contributed to the improvement of predictive flood hazard mapping. Numerous ML algorithms have demonstrated successful outcomes in predicting flood inundations (Mosavi et al. 2018). This study focuses on assessing the feasibility of urban flood inundation modeling. The binary occurrence of floods and non-floods was considered as the dependent variable, while 13 geo-environmental factors were used as explanatory variables. Four ML models were employed to examine the relationship between the explanatory factors and flood inundations, with the aim of generating accurate flood hazard maps. The selected models for this investigation are as follows: DT, RF, XGB, and kNN. The model setup of the four selected algorithms is summarized in Table 1.
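No. | Algorithm | Setting | Domain |
---|---|---|---|
1 | Decision Tree Classifier | Criterion | [‘gini’, ‘entropy’] |
1 | Decision Tree Classifier | Maximum depth | [5–13] |
1 | Decision Tree Classifier | Minimum sample split | [1–4] |
2 | k-nearest neighbor (kNN) | Number of NN | [1–50] |
2 | k-nearest neighbor (kNN) | Weights | [‘uniform’, ‘distance’] |
2 | k-nearest neighbor (kNN) | Metrics | [‘minkowski’, ‘euclidean’, ‘manhattan’] |
2 | k-nearest neighbor (kNN) | Algorithm | [‘auto’, ‘ball tree’, ‘kd tree’, ‘brute’] |
2 | k-nearest neighbor (kNN) | p | [1, 2] |
3 | Ensemble Algorithms (RF) | Number of estimators | [100–1,000] |
3 | Ensemble Algorithms (RF) | Criterion | [‘gini’, ‘entropy’] |
3 | Ensemble Algorithms (RF) | Minimum sample split | [10–110] |
3 | Ensemble Algorithms (RF) | Maximum features | [‘auto’, ‘sqrt’] |
4 | Ensemble Algorithms (XGB) | Number of estimators | [300–1,000] |
4 | Ensemble Algorithms (XGB) | Learning rate | [0.1–0.1] |
4 | Ensemble Algorithms (XGB) | Booster | [‘gbtree’, ‘gblinear’] |
4 | Ensemble Algorithms (XGB) | Gamma | [0–1] |
4 | Ensemble Algorithms (XGB) | Reg_alpha | [0–1] |
4 | Ensemble Algorithms (XGB) | Reg_lambda | [0.5–5] |
4 | Ensemble Algorithms (XGB) | Subsample | [0.1–1] |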
Decision tree (DT)
The DT algorithm is based on a tree structure, where decisions are made at the branches and the predicted responses are represented by the leaf nodes (Breiman et al. 2017). In the context of this study, which involves binary classification, the branching process starts from a parent node in one layer and leads to two child nodes in the subsequent lower layer, with different variable values determining the branching. The optimization of Gini's diversity index is employed to identify the optimal bifurcation points, with the process stopping under certain conditions: (i) when a node contains only a single class of data, (ii) when a child node to be generated would have fewer than five data points, or (iii) when the number of layers exceeds a predefined criterion. In the case of classification trees (CTs), the final variables in a DT consist of a discrete set of values, where the leaves represent class labels and the branches represent conjunctions of feature labels.
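The split search described above can be sketched for a single numeric variable as follows. It scores each candidate threshold by the weighted Gini index of the two child nodes and skips thresholds that would leave fewer than five points in a child, mirroring stopping condition (ii). The function names and the toy distance-to-river samples are hypothetical, and this is not the library implementation used in the study.

```python
def gini(labels):
    """Gini diversity index of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(values, labels, min_child_size=5):
    """Return (weighted Gini, threshold) of the best split, or None if none is allowed."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(min_child_size, n - min_child_size + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[0]:
            best = (score, threshold)
    return best

# Hypothetical samples: distance to river in metres, 1 = flooded, 0 = not flooded.
distances = [10, 20, 30, 40, 50, 600, 700, 800, 900, 1000]
flooded = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(best_split(distances, flooded))
```

A full tree repeats this search over every variable at every node until one of the stopping conditions is met.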
Their transparency and ability to offer a clear understanding of predictions make DTs one of the choices in this study (Tehrany et al. 2015). DTs handle diverse data types that are common in urban settings, including mixed numerical and categorical data (e.g., elevation, land cover, infrastructure density). Generally, DTs are known for their computational efficiency and have gained popularity in ensemble forms for real-time flood modeling and prediction applications. Furthermore, the interpretability of DTs aids stakeholders in comprehending the model's decision-making process, which is crucial for effective urban flood hazard mapping.
Random forest classifier (RF)
RF is a non-parametric and tree-based ensemble technique introduced by Breiman (2001). In contrast to standard statistical methods, RF utilizes multiple DT models, which are intuitive and easy to interpret, rather than parametric models. RF is a classification and regression system that employs a large number of weak classifiers to classify and predict data. These classifiers are developed through a process that involves selecting different training subsets and utilizing the bagging method to create random DTs (Houborg & McCabe 2018). The final classification result is obtained by aggregating the classification outcomes of these DTs through voting. Unlike the conventional DT algorithm that selects the best attribute for node partitioning at each step, the RF introduces random attribute selection. This approach is akin to the tournament selection method employed in genetic algorithms. The RF exhibits two key random characteristics: random selection of classification features and random generation of training subsets that are independently and identically distributed. These characteristics enable the rapid retrieval of different datasets without the need for repetitive processes, making it particularly advantageous for data categorization tasks.
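The two sources of randomness and the final vote can be illustrated with a deliberately simplified ensemble. In this sketch each "tree" is a one-split stump trained on a bootstrap sample with one randomly chosen feature, which is a stand-in for the full random DTs of an actual RF; all names, the seed, and the toy samples are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) samples with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(rows, feature):
    """Pick the cut and side labelling with the fewest training errors on one feature."""
    best = None
    for x, _ in rows:
        cut = x[feature]
        for low_label in (0, 1):
            errors = sum(
                (low_label if r[feature] <= cut else 1 - low_label) != y
                for r, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, feature, cut, low_label)
    return best[1:]

def predict_stump(stump, x):
    feature, cut, low_label = stump
    return low_label if x[feature] <= cut else 1 - low_label

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)   # random training subset
        feature = rng.randrange(n_features)    # random attribute selection
        forest.append(fit_stump(sample, feature))
    return forest

def predict_forest(forest, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical samples: ((distance to river m, elevation m), flooded flag).
rows = [
    ((20, 1790), 1), ((35, 1792), 1), ((50, 1788), 1), ((80, 1795), 1), ((120, 1791), 1),
    ((900, 1820), 0), ((1500, 1835), 0), ((2100, 1810), 0), ((2600, 1850), 0), ((3200, 1840), 0),
]
forest = fit_forest(rows)
print(predict_forest(forest, (5, 1780)), predict_forest(forest, (5000, 1900)))
```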
This study selected RF because of its ability to capture complex relationships in data, often outperforming single DTs in flood prediction, which makes it particularly convenient in urban settings with intricate environmental and infrastructure factors (Mahato et al. 2021). In addition, its lower susceptibility to overfitting and noise in data makes it more reliable for real-world applications with inherent data uncertainties. Lastly, it provides insights into influential features in flood susceptibility that aid understanding of flood mechanisms and inform risk mitigation strategies.
XGBoost classifier
Extreme gradient boosting (XGBoost) is a popular implementation of gradient boosting machines, a highly effective family of supervised learning algorithms. It is widely used for solving regression and classification problems. Data scientists favor XGBoost due to its rapid execution speeds, particularly in tasks beyond the core computation (Chen & Guestrin 2016). The selection of the XGBoost classifier in this study arises from a number of advantages the classifier has for flood hazard mapping in the urban context. In urban flood hazard mapping, where numerous factors contribute to susceptibility, the model's ability to incorporate and weigh multiple variables simultaneously is pivotal (Alqahtani et al. 2019). Chen & Guestrin (2016) attest to the classifier's success in capturing complex relationships, establishing its reliability for flood risk assessment in urban environments.
kNN classifier
kNN is a distance-based learning technique that predicts the response of a given point by examining the predominant class among its k-closest neighbors. It revolves around finding the k-nearest objects to a query object based on their similarity in features, and it is particularly robust for flood forecasting compared with other methods as it is easier to interpret and requires low computational power (Ma et al. 2020). Two crucial parameters dictate the category of the query object in the kNN algorithm: the value of k, which determines the number of neighboring points considered, and the distance function used to measure the similarity between the query object and its nearest k neighbors. However, it is important to note that kNN is highly sensitive to outliers (Ramaswamy et al. 2000), a concern that may arise frequently when predicting urban flooding based on rainfall intensities.
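A minimal kNN classifier makes the roles of k and the distance function concrete. The sketch uses Euclidean distance and a plain majority vote, which are only one of the combinations listed in Table 1; the function name and the toy points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority class among the k training points nearest to query."""
    neighbors = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples: 1 = flooded, 0 = not flooded.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = [1, 1, 1, 0, 0, 0]
print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

Because the prediction depends directly on distances, variables measured on very different scales (for example metres of elevation versus percent imperviousness) would normally be rescaled before use.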
kNN's simplicity, interpretability, and adaptability to various spatial conditions of urban areas make it one of the choices in flood hazard mapping (Al-Areeq et al. 2022). In the study of urban flood hazard mapping, where the impact of neighboring areas holds significance, the adaptability of kNN becomes apparent. This adaptability proves especially effective in capturing localized patterns and responding adeptly to variations in flood risk influenced by immediate surroundings.
Performance evaluation
RESULT
Train/test ratio sensitivity analysis
The results of the performance analysis of the models on different ratio-based training and testing datasets are presented in the Supplementary material. The analysis showed that the predictive capability of the ML models was greatly affected by the training/testing ratios, and that ratios from 70/30 to 80/20 generally gave the best performance for most of the ML models. Increasing the size of the training dataset improved the training performance and made the models more stable. Increasing the training dataset from 30 to 90% improved the testing performance of the RF, DT, and kNN algorithms. However, in the case of the XGB algorithm, the increase from 30 to 70% enhanced the testing performance, while a further increase to 80% decreased it. Thus, based on XGB, a 70/30 ratio was found to be the ratio that optimizes the testing performance. This ratio was used for all models for the estimation of flood hazard. The results presented herein demonstrate an effective way of selecting the appropriate dataset ratios and the best ML model to predict flood hazard accurately.
Hyperparameter sensitivity analysis
Models | Accuracy | Recall | Precision | F1-score | RMSE | AUC | Duration (s) |
---|---|---|---|---|---|---|---|
DT | 0.73 | 0.69 | 0.75 | 0.69 | 0.21 | 0.74 | 41 |
RF | 0.82 | 0.84 | 0.81 | 0.82 | 0.15 | 0.82 | 85 |
XGB | 0.79 | 0.80 | 0.80 | 0.80 | 0.18 | 0.80 | 112.2 |
kNN | 0.71 | 0.70 | 0.67 | 0.68 | 0.22 | 0.71 | 21 |
According to the color bars in Figure 5(a), 5(b), and 5(d), the white lines reflect high accuracies near 0.8, 0.82, and 0.8, while the blue lines reflect lower accuracies near 0.55, 0.795, and 0.65, respectively. For XGB in Figure 5(c), by contrast, the blue lines near 0.80 are the high accuracies and the white lines near 0.68 are the lower accuracies. The most sensitive parameters were max_depth, n_estimators, subsample, and n_neighbors for DT, RF, XGB, and kNN, respectively. DT, RF, and kNN performed consistently for max_depth, n_estimators, and n_neighbors values within the ranges of 1 to 30, 300 to 1,000, and 20 to 47, respectively. Such consistency suggests robust performance across a wide spectrum of hyperparameter values, underscoring the versatility and adaptability of these algorithms across diverse modeling scenarios.
On the other hand, XGB shows a unique pattern in the parallel coordinate plot, identified by blue-black lines. The result reveals a significant sensitivity to the subsample hyperparameter within a narrow range of 0.02–0.075. This observation highlights the importance of fine-tuning the subsample parameter in order to achieve optimal model performance, as it has a considerable impact on XGB's performance. When XGB's best-performing configurations are compared with those of the other algorithms, their narrow value ranges highlight how precise the hyperparameter settings must be in order to maximize performance. It is evident that the RF and XGB algorithms have lower sensitivity within the given hyperparameter spaces than DT and kNN.
Accuracy assessment
Furthermore, the RMSE values for the training data were 0.21 (DT), 0.15 (RF), 0.18 (XGB), and 0.22 (kNN), indicating good goodness-of-fit. In the validation analysis, the computation times in Table 2 were 41 s (DT), 85 s (RF), 112.2 s (XGB), and 21 s (kNN). The validation metrics underscore the models' ability to accurately capture and replicate observed flood locations in datasets other than the training dataset. Although XGBoost and RF showed higher performance metric values, they required longer computation times, potentially due to their larger hyperparameter search spaces, emphasizing a trade-off between enhanced model performance and increased computational demands.
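The metrics reported in Table 2 can be computed from binary labels and predicted flood probabilities as sketched below. Several details are assumptions, since the text does not specify them: the 0.5 classification threshold, RMSE taken between the 0/1 labels and the predicted probabilities, and AUC computed as the share of flooded/non-flooded pairs that are ranked correctly. The function name and the four toy samples are hypothetical.

```python
import math

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, recall, precision, F1, RMSE and AUC; assumes both classes are present."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))
    # AUC as the probability that a flooded sample scores above a non-flooded one.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "rmse": rmse,
        "auc": wins / (len(pos) * len(neg)),
    }

# Hypothetical labels and predicted flood probabilities.
metrics = classification_metrics([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
print(metrics)
```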
Variable importance
The relative importance of flood predictor variables is of practical relevance to flood management strategies, which must allocate and plan limited resources for flood hazard management. In this study, feature importance was derived from DT, RF, and XGB with a wrapper technique called recursive feature elimination with cross-validation (RFECV) to refine the model and enhance efficiency by focusing only on variables that significantly impact the output. According to the results from the DT and RF algorithms, the distance to the river was the most important factor in flood inundation in Bahir Dar City, with contribution values of 50 and 35%, respectively (Table 3). The second-most important variable was elevation, with contributions of 16 and 27%, respectively. Distance to the drainage system was the third most important variable, with contributions of 12.0 and 13.0% for DT and RF, respectively. However, the results for XGB, the best-performing algorithm, differed from those for DT and RF. In the XGB algorithm, the most important variables were distance to drainage, elevation, and distance to the river, with contributions of 13, 12.5, and 11.5%, respectively. The curve number and percentage of imperviousness, with contributions above 8%, were the fourth and fifth predictors of flood hazard in Bahir Dar City and are also important variables to include when predicting flood hazard in an urban context. To further elucidate the case, we analyzed the impact of the number of features on the accuracy of the models; the result is presented in the Supplementary material. The result further suggested that including CNII and percentage imperviousness (the fourth and fifth variables) kept the models, particularly the XGB algorithm, performing well, and that performance decreased when further variables were added.
Hence, based on the result, the models' performance was tested using only the identified important variables, including CNII and percentage of imperviousness, because a further increase in the number of variables decreases the performance of the models.
Model | DR | Elv. | DD | S | CNII | PC1 | PC2 | %Imp. | TPI | A | SPI | TRI | LULC |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
DT | 50.0 | 16.0 | 12.0 | 12.0 | 4.0 | 3.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
RF | 35.0 | 27.0 | 13.0 | 7.0 | 5.0 | 4.0 | 2.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 |
XGB | 11.5 | 12.5 | 13.0 | 6.0 | 9.0 | 6.5 | 7.0 | 8.5 | 6.0 | 5.0 | 5.0 | 4.0 | 4.0 |
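The rankings discussed above can be reproduced directly from the tabulated importance values. The sketch below is illustrative only: it sorts the percentages from the table and returns the top-ranked variables per model. It does not re-run RFECV, and the helper name `top_features` is hypothetical.

```python
# Column abbreviations and percentages copied from the importance table
FEATURES = ["DR", "Elv.", "DD", "S", "CNII", "PC1", "PC2",
            "%Imp.", "TPI", "A", "SPI", "TRI", "LULC"]

IMPORTANCE = {
    "DT":  [50.0, 16.0, 12.0, 12.0, 4.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "RF":  [35.0, 27.0, 13.0, 7.0, 5.0, 4.0, 2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0],
    "XGB": [11.5, 12.5, 13.0, 6.0, 9.0, 6.5, 7.0, 8.5, 6.0, 5.0, 5.0, 4.0, 4.0],
}

def top_features(model, k):
    """Return the k variables with the highest importance for one model.

    Python's sort is stable, so tied variables keep their table order.
    """
    pairs = zip(FEATURES, IMPORTANCE[model])
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]

for model in IMPORTANCE:
    print(model, top_features(model, 5))
```

For XGB, the first five entries are distance to drainage, elevation, distance to the river, CNII, and percentage of imperviousness, which matches the order described in the text.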
The feature selection results indicate that areas close to the river are more likely to be affected by floods, which is reasonable for the study area. This holds for normal river floods and also for flash floods during heavy rains, especially in the late rainy season, when the soil is saturated. The study area lies at a lower elevation than its surroundings and is characterized by flat slopes. Elevation and slope, the second and fourth most important factors, respectively, are also significant because they influence surface runoff and the volume and velocity of flow. In such areas, gentle topography results in more accumulation than outflow, so the flood water level rises rapidly during heavy rain. Distance to drainage systems was the third most important flood control variable in the study area, owing to various attributes of these systems. Within this area, drainage systems face challenges such as blockage by solid waste due to inadequate waste management, insufficient connectivity, and the absence of suitable drainage infrastructure in flat and gently sloping terrain. Consequently, proximity to drainage systems is often associated with recurrent flooding. Areas with a larger proportion of green cover experienced less flooding. In contrast, dense commercial and residential areas were among the most frequently flooded, particularly along rivers and drainage lines.
The limitations of this study include the difficulty of obtaining high-resolution and comprehensive data, particularly spatial observations from aerial sources, during and immediately after flooding events. The lack of such data may have affected the accuracy, precision, and reliability of the results. Nonetheless, the findings provide valuable insights into urban flood hazard modeling, visualization, and representation for efficient interpretation in urban areas. The utilization of big data and the successful prediction of potential hazardous zones can inform cost-effective land use planning, allowing for preliminary assessments of flood risks while staying within budget constraints.
Flood hazard mapping
This study aimed to map urban flash flood inundation areas in Bahir Dar using four ML algorithms: DT, RF, XGB, and kNN. Thirteen thematic layers of flood-related factors, including elevation, slope, curvature, aspect, SPI, distance to the river, distance to drainage, land use, percentage of imperviousness, and curve number, were overlaid with 70% of the flooded points in the training set.
Hazard class | DT area (ha) | DT (%) | RF area (ha) | RF (%) | XGB area (ha) | XGB (%) | kNN area (ha) | kNN (%) |
---|---|---|---|---|---|---|---|---|
Very low | 1,087.2 | 17.34 | 1,626.7 | 25.94 | 1,025.1 | 16.35 | 725.4 | 11.57 |
Low | 1,317.6 | 21.02 | 1,525.9 | 24.34 | 1,121.3 | 17.90 | 811.1 | 12.94 |
Moderate | 1,322.5 | 21.10 | 1,450.0 | 23.13 | 1,463.5 | 23.34 | 1,149.1 | 18.33 |
High | 1,431.2 | 22.83 | 996.4 | 15.90 | 1,476.1 | 23.50 | 1,400.5 | 22.34 |
Very high | 1,110.6 | 17.72 | 670.9 | 10.70 | 1,183.8 | 18.90 | 2,183.7 | 34.80 |
These results indicate significant differences in the distribution of flood hazard classes among the four ML algorithms used in this study. Notably, the kNN algorithm identified a substantial 34.80% of the study area as a very high flood hazard zone, whereas the RF algorithm classified only 10.70% of the area as such. For the very low hazard class, the RF algorithm classified 25.94% of the study area as a very low flood hazard zone, whereas the kNN algorithm assigned only 11.57% of the area to the same category. In contrast, the DT and XGB methods produced a relatively balanced distribution of area across the hazard zones. The spatial distribution of hazard classes in the study area did not show a specific directional pattern for any of the methods.
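The percentage shares quoted above follow from the class areas by simple arithmetic. The sketch below recomputes them from the tabulated areas in hectares. The function name `class_shares` is hypothetical, and because the per-model area totals differ slightly, recomputed shares may deviate from the published percentages by a few hundredths of a percentage point.

```python
CLASSES = ["Very low", "Low", "Moderate", "High", "Very high"]

# Areas in hectares, copied from the hazard class table
AREA_HA = {
    "DT":  [1087.2, 1317.6, 1322.5, 1431.2, 1110.6],
    "RF":  [1626.7, 1525.9, 1450.0, 996.4, 670.9],
    "XGB": [1025.1, 1121.3, 1463.5, 1476.1, 1183.8],
    "kNN": [725.4, 811.1, 1149.1, 1400.5, 2183.7],
}

def class_shares(model):
    """Return each hazard class's share (%) of the model's total mapped area."""
    areas = AREA_HA[model]
    total = sum(areas)
    return {name: 100.0 * area / total for name, area in zip(CLASSES, areas)}

for model in AREA_HA:
    shares = class_shares(model)
    print(model, {name: round(value, 2) for name, value in shares.items()})
```

Comparing the "Very high" and "Very low" entries across models makes the contrast between kNN and RF explicit, while DT and XGB remain closer to an even split.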
CONCLUSION
This study successfully explores ML techniques as a viable and precise solution for mapping flood hazards within urban areas. By using readily available geo-environmental data, ML offers a cost-effective and accessible tool for enhancing flood preventive initiatives.
Evaluation of the four ML algorithms (DT, RF, XGB, and kNN) using performance metrics such as AUC and F1-score showed that RF and XGBoost performed marginally better than the other classifiers. Despite variations in computational time, the algorithms demonstrated notable predictive skill, indicating a balance between computational resources and predictive accuracy. Moreover, the sensitivity analysis revealed that RF and XGB are more stable across a wide range of hyperparameter configurations in the study area. The findings provide valuable insights into urban flood hazard modeling, facilitating efficient interpretation in urban areas.
The four ML algorithms (DT, RF, XGBoost, and kNN) produced diverse flood hazard distributions. Notably, each algorithm identified different proportions of hazard zones, underscoring the importance of algorithm selection in flood hazard prediction. Despite these variations, all models achieved high predictive performance, with RF and XGBoost exhibiting slightly higher AUC values, suggesting that they better capture flood patterns in the urban context.
In identifying the critical factors governing flood hazard, the study highlighted the significance of variables such as distance to the river, elevation, slope, and distance to the drainage system, as indicated by most algorithms. Furthermore, as demonstrated by the XGB algorithm, curve number and percentage of imperviousness also had a considerable effect on flood hazard. These findings emphasize the need for targeted mitigation efforts in frequently flooded areas, focusing on the factors that influence inundation dynamics.
However, the study's reliance on a limited dataset poses a significant challenge, impeding algorithm training and validation. Future work should prioritize data expansion to improve model generalization and reliability, particularly by incorporating detailed inundation records with location, flood depth, and date. Such a dataset would facilitate a more robust evaluation of the rainfall–flood relationship, enriching our understanding of urban flood dynamics. Furthermore, advancements in remote sensing and geographic information system (GIS) technologies hold considerable potential for enhancing data collection and modeling accuracy. Examples include the collection of high-resolution data and the emergence of LIDAR technology, which would help in acquiring a high-resolution DEM and digital surface model (DSM). Such data would allow a more accurate representation of flow accumulation and drainage areas, particularly in urban areas, which generally have more complex and irregular topography with buildings, drainage networks, and other critical infrastructure. Despite their high cost, such data can substantially improve data collection strategies and model accuracy.
In conclusion, this study underscores the potential of ML algorithms for analyzing urban flash flood hazards based on various geo-environmental factors. Further research that integrates comprehensive flood data is essential to refine our understanding of flood dynamics and contributing factors and to improve the accuracy of flood hazard mapping. Nevertheless, the current study provides insight into the key factors controlling flooding. Ultimately, this enhanced understanding will inform effective flood management practices in urban contexts, mitigating risks and safeguarding communities from the devastating impacts of flash floods.
ACKNOWLEDGEMENTS
The authors would like to acknowledge the contribution of the project titled The Study & Detailed Designs for Improved Drainage Systems in Flood-Prone Areas of Bahir Dar, which provided some of the background information and the observations in the article, including the drainage inventory. The authors also declare that they independently selected the 464 data points and conducted regular flood occurrence observations to further advance this particular study.
AUTHOR CONTRIBUTIONS
E.S.L.: machine learning expert, contributed to the investigation, methodology, supervision, communication, resources, data curation, and project management. E.S.L., W.A., and M.A.M.: contributed to the original draft preparation, data collection, formal data analysis, investigation, critical review, and final revisions. E.S.L., W.A., F.A.Z., S.T.A., and M.A.M.: contributed to supervision, validation, revision, discussion, resources, improvement, and advice. All authors have read and agreed to the published version of the manuscript.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.