Flooding in remote regions is difficult to assess and mitigate because data are scarce. This research presents an integrated methodology for quantifying flood impacts in such contexts. Machine-learning algorithms combined Sentinel-1 synthetic aperture radar (SAR) imagery with digital elevation model data and river proximity metrics to predict and delineate flood extents. Geographic information systems overlay techniques were then employed for spatial analysis of the floods’ impacts on population and infrastructural assets. The methodology was applied in a case study in Ngabang District, Indonesia. Analysis using decision tree, random forest (RF), and gradient boosting machine models provided insights into flood prediction factors. The RF model, chosen as the best, successfully identified flood-prone regions, achieving an accuracy of 0.94 and a Kappa of 0.87 on the testing data. The flood map showed significant impacts, affecting 373.81 hectares, 10,706 people, 1,500 buildings, and 15 km of roads. This study highlights the importance of proximity, elevation, SAR imagery, and iterative model improvements in flood prediction, offering valuable insights for flood management and mitigation efforts in data-scarce regions.

  • It presents a novel method using machine learning algorithms, synthetic aperture radar imagery, digital elevation model data, and river proximity metrics to predict flood extents.

  • Advanced models (decision tree, random forest, and gradient boosting machine) are used.

  • Geographic information systems overlay techniques offer a detailed spatial analysis of flood impacts.

  • Application in remote areas to identify flood extent and its significant impacts.

  • This study provides valuable tools for regions that have limited data.

Flooding is a recurrent and devastating natural disaster that affects millions of people each year, causing significant economic losses and numerous fatalities (Olanrewaju et al. 2019). Mitigation efforts focus primarily on urban areas because of their dense populations and extensive infrastructure, which are highly vulnerable to flood damage. However, remote areas are not exempt from the impacts of flooding and often suffer greatly. In fact, they face challenges in both impact assessment and disaster mitigation due to the lack of comprehensive data. Hence, flood management and response are particularly difficult in these areas (Manyangadze et al. 2022; Iqbal & Nazir 2023).

The existing body of literature shows that synthetic aperture radar (SAR) imagery has become crucial in flood mapping thanks to its ability to capture high-resolution images regardless of cloud cover or time of day (Panahi et al. 2022; Riazi et al. 2023). This capability is vital for effective flood management and response strategies. However, relying solely on SAR data can result in artifacts, so additional data are needed for accurate flood extent delineation.

Meanwhile, machine learning (ML) can improve the accuracy and timeliness of flood mapping by analyzing complex hydrological data (Elkhrachy 2022; Soria-Ruiz et al. 2022), including satellite imagery and real-time sensor data. ML models can also enhance flood mapping precision by identifying high-risk areas with greater detail and accuracy (Sampurno et al. 2023). As such, stakeholders can make more informed decisions on flood risk management and mitigation.

Likewise, geographic information systems (GIS) are indispensable in disaster management as they integrate diverse spatial data for a holistic analysis of flood impacts (Tomaszewski 2020). GIS facilitates real-time monitoring and decision-making during disaster events, enhancing the efficiency and effectiveness of response efforts. Combined with SAR imagery and ML algorithms, such data can generate a precise flood hazard map, which offers detailed insights into the impacts on communities and infrastructure (Elkhrachy 2022; Soria-Ruiz et al. 2022; Riazi et al. 2023).

Integrating SAR, GIS, and ML can optimize flood mapping and risk management (Amiri et al. 2024) for three reasons. First, SAR can enhance flood mapping by capturing high-resolution images in all weather conditions with minimum revisit time, which allows for timely data acquisition during flood events (Tripathi et al. 2021; Islam & Meng 2022). Second, the combination of SAR data and GIS spatial analysis will allow for more accurate monitoring and prediction of future flood events, thereby reducing potential human and economic losses (Khosravi et al. 2019). Third, ML further enhances this capability by analyzing large datasets from SAR and GIS to uncover patterns and relationships that may not be evident through traditional methods. This data-driven flood mapping will result in better flood risk management and impact mitigation (Nachappa et al. 2020; Shahabi et al. 2020) as the robust framework offers a more comprehensive and accurate tool for decision-makers in flood-prone areas.

However, it should be noted that such advancements in flood mapping may not be as effective without sufficient data. In remote areas, the critical gap in accurately delineating flood extents is limited data availability. Therefore, this study addresses this gap by integrating SAR imagery, digital elevation models (DEMs), and proximity to rivers as predictors, as well as utilizing ML algorithms to map flood extents. The novelty of this work lies in utilizing these integrated predictors in a case study in the Ngabang District, Indonesia, a region that exemplifies the challenges faced by many remote communities. This approach aims to improve the accuracy of flood extent mapping while providing actionable insights that can enhance disaster management strategies and allow for more effective responses in vulnerable areas.

Study area

Ngabang District in Landak Regency, West Kalimantan, Indonesia, was selected as the case study (Figure 1). The district spans 1,148.09 km² and comprises 19 villages with a population of 79,292 people in 2022 (Prahara 2023). The area is predominantly lowland, with elevations ranging from 50 to 250 m above sea level (Pemerintah Kecamatan Ngabang 2024). The Landak River passes through the area, making it prone to flooding during rainy seasons. Therefore, mitigation measures are essential to reduce the impact of flooding on infrastructure.
Figure 1

The case study centers on a remote region of interest (ROI) in the western part of Kalimantan Island. The background map was retrieved from OpenStreetMap (2022).

Data acquisition

This study utilizes Sentinel-1 SAR GRD satellite imagery from the Google Earth Engine (GEE) platform, specifically the Sentinel-1 C-band interferometric wide swath mode GRD datasets (https://code.earthengine.google.com/). The SAR imagery for this case study was acquired on 8 January 2024, within the flood event timeframe of 6 to 15 January 2024. Preprocessing included updating orbit metadata, noise removal, radiometric calibration, orthorectification, and conversion of backscatter coefficients to decibels (Filipponi 2019). The imagery was then filtered to the flood period and cropped to the region of interest (ROI). Additional smoothing reduced SAR noise, as shown in Figure 2. Sentinel-1 SAR imagery with vertical–vertical (VV) and vertical–horizontal (VH) polarization bands is crucial for flood mapping, as it distinguishes water surfaces (low backscatter) from non-water surfaces such as urban areas and vegetation (high backscatter) (Sherpa & Shirzaei 2022). The in situ data were collected on 11 January 2024 (Figure 2).
Figure 2

The smoothed VH band of Sentinel-1 SAR satellite imagery, with validation points representing inundated (blue) and dry areas (red) during the flood event in January 2024.
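
The decibel conversion listed among the preprocessing steps is the standard 10·log10 transform of linear backscatter. The following is a minimal sketch of that step only; the function name and the sample values are illustrative assumptions, not taken from the study's processing chain.

```python
import math

def to_decibels(sigma0_linear):
    """Convert a linear backscatter coefficient to decibels: 10 * log10(x)."""
    if sigma0_linear <= 0:
        raise ValueError("backscatter must be positive to take a logarithm")
    return 10.0 * math.log10(sigma0_linear)

# Illustrative values only (not measurements from the study): smooth open
# water returns little energy to the sensor, so its backscatter in dB is
# lower than that of rougher land surfaces.
water_db = to_decibels(0.01)
land_db = to_decibels(0.1)
```

On the decibel scale the contrast between low-backscatter water and high-backscatter land becomes an additive difference, which is convenient for threshold-based and tree-based models.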

In addition to the SAR imagery, we incorporated predictors derived from DEMNAS' DEM data (Badan Informasi Geospasial 2018) and proximity metrics from the Landak River and its branches. The inclusion of DEM data provides essential topographic context, enhancing the accuracy of flood mapping by accounting for elevation-related variations in water flow and accumulation. Similarly, proximity to the Landak River and its branches helps identify areas at higher risk of flooding, further refining our predictive model. The combination of satellite imagery, DEM data, and proximity metrics offers a comprehensive approach to flood risk assessment in the region. The details of the data utilized in this study are presented in Table 1.

Table 1

Data types and sources used for flood mapping

No | Data type | Source
1 | DEM | DEMNAS (Badan Informasi Geospasial 2018)
2 | Proximity from river | Calculated from waterway map (OpenStreetMap 2022)
3 | Sentinel-1 SAR GRD | Accessed and processed using the GEE platform (https://code.earthengine.google.com/) (ESA 2024)
4 | Validation points | In situ observations during the January 2024 flood event
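
The proximity predictor can be illustrated with a small raster example. The paper does not state how the distance was computed, so the sketch below is an assumption: a brute-force Euclidean distance from every grid cell to the nearest river cell. The function name, the toy mask, and the 10 m cell size are all hypothetical; a GIS proximity tool would normally do this on the real waterway raster.

```python
import math

def river_proximity(river_mask, cell_size_m):
    """Return a grid of Euclidean distances (m) from each cell to the nearest river cell."""
    river_cells = [(r, c)
                   for r, row in enumerate(river_mask)
                   for c, is_river in enumerate(row) if is_river]
    if not river_cells:
        raise ValueError("river_mask contains no river cells")
    distances = []
    for r, row in enumerate(river_mask):
        distances.append([
            cell_size_m * min(math.hypot(r - rr, c - rc) for rr, rc in river_cells)
            for c in range(len(row))
        ])
    return distances

# Toy 3 x 4 grid with a river running down the first column (hypothetical data).
mask = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
proximity = river_proximity(mask, cell_size_m=10.0)
```

The brute-force search is quadratic in the number of cells, which is acceptable for a sketch but not for a district-scale raster.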

ML model

To transform the SAR backscatter characteristics into precise flood extent maps, a comparative approach using ML algorithms was adopted. The SAR backscatter values, comprising both the VV and VH bands, served as input features for the models. These specific bands allow the models to capture detailed surface properties and variations in water presence. DEM and the river's proximity data were also included to account for topographical influences and hydrological connectivity. These additional predictors were crucial for enhancing the model's ability to delineate flood-prone areas accurately.

The study employed three key ML algorithms: decision tree (DT), random forest (RF), and gradient boosting machine (GBM) (Felix & Sasipraba 2019; Panahi et al. 2022; Sampurno et al. 2023). DT is known for its simplicity and interpretability, making it easy to understand the decision-making process. However, DT models are prone to overfitting, particularly when dealing with complex datasets. To mitigate this issue, RF was utilized, which enhances model accuracy and reduces overfitting by aggregating the results of multiple decision trees. RF's ensemble approach creates a more stable and reliable model. GBM, on the other hand, offers robust performance by iteratively correcting errors from previous trees, thus enhancing overall model accuracy (Hastie et al. 2009).
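
The way RF aggregates many trees can be sketched in a few lines. This is a simplified illustration rather than the study's implementation: each "tree" is reduced to a single-split stump on one feature (distance to river), the training points are invented, and the per-split feature subsampling of a full random forest is omitted.

```python
import random
from collections import Counter

def fit_stump(samples):
    """Fit a one-split tree on (distance_m, label) pairs: flooded if distance < threshold."""
    best_threshold, best_errors = None, None
    for threshold in sorted({d for d, _ in samples}):
        errors = sum((d < threshold) != bool(label) for d, label in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def fit_forest(samples, n_trees, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(samples, k=len(samples))) for _ in range(n_trees)]

def predict_forest(thresholds, distance_m):
    """Aggregate the stumps by majority vote (1 = flooded, 0 = dry)."""
    votes = Counter(int(distance_m < t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical training points: (distance to river in m, 1 = flooded / 0 = dry).
samples = [(20, 1), (50, 1), (90, 1), (120, 1),
           (200, 0), (260, 0), (310, 0), (400, 0)]
forest = fit_forest(samples, n_trees=25)
near_river = predict_forest(forest, 30)
far_from_river = predict_forest(forest, 350)
```

Because every stump sees a different resample, individual thresholds vary, and the vote smooths out that variation. This is the stabilizing effect attributed to the ensemble above.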

The target variable for these algorithms was the flood extent, which was critical for assessing each model's effectiveness in identifying flooded areas. The models were trained using data from a single flood event in January 2024. This event provided SAR backscatter data, including VV and VH bands, DEM, and proximity data, correlated with verified flood extents derived from in situ real-time observations. The data were split into 80% for training and 20% for testing to ensure a robust model performance evaluation. This training process involved extensive data pre-processing and validation, ensuring the input features' reliability and the models' generalizability. By leveraging this carefully prepared data, the models could learn from the observed flooding patterns, significantly enhancing their predictive capabilities for future flood events.
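
An 80/20 split of this kind can be sketched as follows. The paper does not report the number of validation points, the random seed, or whether the split was stratified, so the 100 stand-in samples, the seed, and the simple shuffle below are assumptions for illustration.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-ins for 100 labelled validation points (hypothetical count).
points = list(range(100))
train, test = train_test_split(points)
```

Fixing the seed makes the split reproducible, which matters when several models are compared on the same held-out points.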

The performance of each model was subsequently evaluated based on the accuracy and the Kappa coefficient, among other metrics, to determine the most effective algorithm for accurate flood mapping using SAR data. Accuracy is a straightforward metric calculated as the ratio of correctly predicted instances to the total instances in the dataset (Liu et al. 2014). However, accuracy can be misleading in the context of imbalanced datasets. Therefore, the Kappa coefficient is used as a complementary measure. This statistic compares the observed accuracy against the accuracy expected by chance (Liu et al. 2014). In flood mapping scenarios, the differentiation between water and non-water classes is paramount, so the two metrics combined offer a more holistic view of the ML model's robustness and reliability. High values of accuracy and Kappa give more confidence in the model's ability to delineate flood extents with precision (Congalton & Green 2008).
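
Both metrics can be computed from a two-class confusion matrix. The sketch below uses the standard definition of Cohen's Kappa; the confusion-matrix counts are invented for illustration and are not the study's results.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's Kappa) for a flooded/dry confusion matrix."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Agreement expected by chance: product of predicted and actual class shares.
    p_flood = ((tp + fp) / total) * ((tp + fn) / total)
    p_dry = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_flood + p_dry
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return observed, (observed - expected) / (1.0 - expected)

# Balanced toy case: 90% accuracy and a Kappa of about 0.8.
balanced = accuracy_and_kappa(tp=45, fp=5, fn=5, tn=45)

# Imbalanced toy case: predicting "dry" everywhere still scores 90% accuracy,
# but Kappa is 0 because chance alone explains the agreement.
always_dry = accuracy_and_kappa(tp=0, fp=0, fn=10, tn=90)
```

The second case shows why accuracy alone can mislead on imbalanced data: a model that never predicts flooding looks accurate yet has no skill beyond chance.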

Flood impact assessment

After flood extents were mapped in the previous stage, the next stage assessed their impact on existing infrastructure, specifically buildings and road networks. This process provided a visual and quantitative analysis of the potential damages and disruptions. The assessment also incorporated demographic data to estimate the impact on the population. The impact was calculated using the overlay technique, which layers the flood extent map over the infrastructure and population datasets to identify areas of overlap and thus determine which buildings, roads, and population clusters fall within the flood-affected zones. The number of affected buildings, road lengths, and populations within the flood-affected zones was then calculated by summing the respective values from the intersection map. This approach facilitated the identification of high-risk zones and hence the formulation of targeted evacuation plans and resource allocation. The assessment used the InaSAFE plugin within the Quantum GIS (QGIS) environment (InaSAFE 2024). The plugin integrates hazard data with socio-economic datasets to evaluate the potential impact of flood events on critical infrastructure and human populations. Infrastructure data were sourced from OpenStreetMap (OpenStreetMap 2022), and population data were derived from the GHSL Data Package 2022 (Schiavina et al. 2022).
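
The overlay-and-sum logic can be illustrated on a toy grid. This is not the InaSAFE API; it is a plain-Python sketch under the assumption that the flood extent and the population are rasters on the same grid and that buildings are points indexed by grid cell. All values are hypothetical.

```python
def overlay_summary(flood_mask, population, buildings):
    """Sum population in flooded cells and count buildings located in flooded cells."""
    affected_population = sum(
        people
        for mask_row, pop_row in zip(flood_mask, population)
        for flooded, people in zip(mask_row, pop_row)
        if flooded
    )
    affected_buildings = sum(1 for row, col in buildings if flood_mask[row][col])
    return affected_population, affected_buildings

# Hypothetical 3 x 3 rasters: 1 marks a flooded cell; population is people per cell.
flood_mask = [[1, 1, 0],
              [0, 1, 0],
              [0, 0, 0]]
population = [[12, 30, 5],
              [8, 20, 0],
              [3, 0, 9]]
# Hypothetical building locations as (row, col) cell indices.
buildings = [(0, 0), (0, 1), (1, 1), (2, 2), (1, 0)]

affected_population, affected_buildings = overlay_summary(flood_mask, population, buildings)
```

Road lengths would be handled the same way, by clipping each road segment to the flooded cells and summing the clipped lengths.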

ML model performance

The DT model (Figure 3) provided an interpretable structure for understanding the factors influencing flood extent. The root node of the model splits the data based on proximity, with a threshold of 170 m that was determined during model training. If the proximity was greater than or equal to 170 m, the model predicted a no-flood event with 66% confidence. For proximity values less than 170 m, the model split based on the DEM values. Specifically, if the DEM value was greater than or equal to 20 m, the model predicted a no-flood event with 51% confidence. In contrast, if the DEM value was less than 20 m and the proximity was between 170 and 248 m, the model's prediction varied, with flooding predicted at different probabilities. Overall, the DT model highlighted the critical role of proximity and elevation in flood prediction.
Figure 3

The DT final model.

The RF model (Figure 4) provided valuable insights into the importance of different features in predicting flood extent. Proximity emerged as the most significant feature, indicating that distance to rivers, streams, or other water bodies is the most critical predictor of flood-prone areas. The closer an area is to a water body, the more likely it is to be affected by overflow and subsequent flooding (Al-Omari et al. 2024). This is particularly evident in lowland areas near drainage networks, where the topography facilitates water accumulation and movement, leading to a very high flood hazard (Patrikaki et al. 2018). The DEM was the next most important feature, suggesting that elevation significantly influences flood extent, with lower areas being more susceptible to flooding. The VV band captured variations in surface properties, which could indicate water presence, contributing to the model's accuracy. The VH band also helped refine the model's predictions by providing additional information on surface scattering properties, which helped distinguish between flooded and non-flooded areas. Nonetheless, these contributions were less impactful than proximity and DEM. Furthermore, the mean decrease in the Gini index highlighted that each feature played a distinct role in the model, with proximity and DEM being particularly influential. The RF model's ensemble approach, which considered multiple decision trees, helped reduce overfitting and improved the accuracy of predictions by integrating these diverse data sources.
Figure 4

Feature importance from RF model.
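
The mean decrease in Gini reported for the RF model is built from the impurity reduction of individual splits, accumulated per feature and averaged over the trees. The sketch below shows that building block for a single binary split; the labels are invented, and the exact weighting used by a given RF library may differ.

```python
def gini(labels):
    """Gini impurity of a node holding binary labels (1 = flooded, 0 = dry)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def gini_decrease(parent, left, right):
    """Impurity reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children

# Hypothetical node with four flooded and four dry samples.
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# An informative split separates the classes completely.
informative = gini_decrease(parent, [1, 1, 1, 1], [0, 0, 0, 0])

# An uninformative split leaves both children as mixed as the parent.
uninformative = gini_decrease(parent, [1, 1, 0, 0], [1, 1, 0, 0])
```

A feature whose splits repeatedly produce large decreases, as proximity and DEM did here, accumulates a high importance score.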

The GBM model (Figure 5) demonstrated its strength in improving predictive accuracy through iterative boosting. The cross-validation accuracy fluctuated with the number of boosting iterations and varied based on the maximum tree depth. The highest accuracy, around 92%, was achieved with moderate boosting iterations and tree depths. However, the accuracy did not show a consistent trend with increasing iterations, indicating that the model had not yet attained an optimal value. The trend underscores the importance of selecting an optimal number of iterations and tree depth to avoid overfitting or underfitting. The observation suggests that further tuning of these hyperparameters is necessary to stabilize accuracy and fully capitalize on the GBM model's potential for flood extent prediction. This iterative correction of errors in GBM contributes to its robustness, but careful tuning is crucial for achieving the best predictive performance.
Figure 5

The GBM tuning result.
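
The role of the number of boosting iterations can be seen in a small example. This sketch is a simplification and not the study's model: it boosts depth-1 regression stumps on squared error against 0/1 flood labels, whereas classification GBMs typically use a log-loss, and the data, learning rate, and function names are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single threshold whose two leaf means best fit the residuals."""
    best = None
    for threshold in sorted(set(xs))[1:]:
        left = [r for x, r in zip(xs, residuals) if x < threshold]
        right = [r for x, r in zip(xs, residuals) if x >= threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def boost(xs, ys, n_rounds, learning_rate=0.5):
    """Return the training mean squared error after each boosting round."""
    predictions = [sum(ys) / len(ys)] * len(ys)
    history = []
    for _ in range(n_rounds):
        # Each new stump is fitted to the errors left by the current ensemble.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        predictions = [
            p + learning_rate * (left_mean if x < threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
        history.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))
    return history

# Hypothetical points: distance to river (m) and flood label.
distances = [20, 50, 90, 120, 200, 260, 310, 400]
flooded = [1, 1, 1, 1, 0, 0, 0, 0]
training_error = boost(distances, flooded, n_rounds=3)
```

Training error keeps falling with more rounds, which is why cross-validated accuracy, rather than training error, is used to choose the number of iterations and the tree depth.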

The performance metrics for the three models were evaluated on both training and testing datasets (Figure 6). The DT model achieved an accuracy of 0.94 and a Kappa of 0.87 on the training data, but its performance decreased to an accuracy of 0.88 and a Kappa of 0.76 on the testing data, indicating some overfitting. The RF model showed excellent performance with accuracy and Kappa of 1.00 on the training data and slightly reduced but still high values of 0.94 and 0.87, respectively, on the testing data, demonstrating its robustness. The GBM model showed training accuracy and Kappa of 0.95 and 0.90, respectively, with testing values maintaining strong performance at 0.94 for both accuracy and Kappa, showing its consistency and reliability.
Figure 6

Performance metrics of the ML models.

Flood extent and its impact

The flood map as the output of the best model is illustrated in Figure 7, which shows a stark distinction between flooded and non-flooded areas. The flood event in the ROI covered an area of approximately 373.81 hectares, leaving approximately 2,156.4 hectares of land unaffected and dry. It is important to note that this flood event is particularly salient due to its proximity to the riverbanks, where population density is high. In this case, the impact of the flood would be severe as it directly affects a substantial portion of the community. Although a significant portion of the region remains unaffected by floodwaters, most of the population lives in flood-prone areas.
Figure 7

The map of flood extent and its impact.

Furthermore, we investigated the impact of the flood event on the ROI (Table 2). The event directly affected 10,706 persons out of a total population of 79,292. As for the built environment, 1,500 out of 11,100 buildings in this region were affected by the event, signifying considerable disruption. The road infrastructure within the ROI, which plays a crucial role in transportation and connectivity, was also severely impacted. The analysis shows that 15 km of roads out of 141 km were disrupted. The results of this analysis provide an understanding of the scale and scope of the event on the population and the built environment within this specific region.

Table 2

Estimation of the flood impact

No | Exposure | Type | Affected | Not affected | Total
1 | Road (m) | Motorway | 491 | 7,130 | 7,621
 | | Local | 1,454 | 16,694 | 18,148
 | | Path | 1,305 | 2,822 | 4,127
 | | Secondary | | 3,044 | 3,044
 | | Other | 11,882 | 96,188 | 108,070
2 | Building | Place of worship | | |
 | | Education | | |
 | | Residential | 1,486 | 9,567 | 11,053
3 | Population | | 10,706 | 68,586 | 79,292
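
The headline figures in the text can be cross-checked against Table 2 with simple arithmetic. The sketch below assumes the road values are in metres, which is inferred from the 15 km and 141 km totals quoted in the text rather than stated in the table, and it treats the missing affected value for secondary roads as contributing nothing to the sum.

```python
# Road lengths from Table 2, assumed to be in metres.
road_affected_m = {"Motorway": 491, "Local": 1454, "Path": 1305, "Other": 11882}
road_total_m = {"Motorway": 7621, "Local": 18148, "Path": 4127,
                "Secondary": 3044, "Other": 108070}

affected_road_m = sum(road_affected_m.values())
total_road_m = sum(road_total_m.values())

# Shares of each exposure type affected by the flood, using values from the text.
road_share = affected_road_m / total_road_m
population_share = 10706 / 79292
building_share = 1500 / 11100
flooded_area_share = 373.81 / (373.81 + 2156.4)
```

Under these assumptions the sums come to 15,132 m of 141,010 m, matching the 15 km of 141 km reported, and each exposure type shows roughly 11 to 15% affected.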

The comparative analysis of the DT, RF, and GBM models underscores the strengths and limitations of each algorithm in flood extent prediction. The DT model's simplicity and interpretability make it a valuable tool for understanding the influence of individual predictors (Ludwig et al. 2017), such as proximity and elevation. However, its tendency to overfit, especially with complex datasets, limits its predictive power, as evidenced by decreased performance from training to testing datasets (Stiglic et al. 2012).

The RF model mitigates overfitting through its ensemble approach, resulting in higher accuracy and stability (Zhang & Wang 2021; Sun et al. 2024). The feature importance analysis revealed that proximity and DEM are crucial predictors of flood extent, aligning with hydrological knowledge. Albeit less critical, the VV and VH bands contributed additional detail to the model. The RF model's performance metrics, with minimal drop from training to testing datasets, highlight its robustness and reliability in flood prediction tasks. This consistently high performance makes RF the best choice among the evaluated models (Chen et al. 2018; Rodriguez-Galiano et al. 2018; Song et al. 2021).

The GBM model excelled in accuracy by iteratively correcting errors, but the fluctuation in accuracy with varying iterations and tree depths highlights the need for careful parameter tuning. As the number of iterations increases and trees grow deeper, the model becomes more complex, which can lead to overfitting or underfitting depending on the dataset and the specific configuration applied (Kiatkarun & Phunchongharn 2020; Xia et al. 2021; Mwita et al. 2023). However, its robust performance in both training and testing datasets, consistent accuracy, and Kappa values indicate its potential for flood extent prediction when properly utilized. The GBM's ability to maintain high-performance metrics across datasets suggests it can effectively make generalizations from training data to unseen data, making it a strong candidate for practical flood prediction applications (Felix & Sasipraba 2019).

In conclusion, while the DT model offers interpretability and the GBM model provides robust performance through iterative improvement, the RF model emerged as the most effective algorithm for flood extent prediction in this study area. Its ability to handle overfitting, combined with high accuracy and stability across datasets, makes it the most reliable choice for effective flood management and mitigation strategies (Kumar et al. 2021; Sun et al. 2023). RF's high accuracy and stability have been consistently demonstrated in numerous studies, reinforcing its reliability for applications requiring dependable predictions, such as flood management and mitigation (Bharathidason & Jothi Venkataeswaran 2014; Dheenadayalan et al. 2016).

However, while RF is generally reliable, its performance can be affected by the presence of noisy trees or correlated decision trees within the ensemble, which can impact classification accuracy (Li et al. 2010). Consequently, caution is advised when applying this model to other remote areas. Considering the unique environmental and data-specific factors that may influence its performance is crucial.

This study demonstrates that ML approaches using Sentinel-1 SAR, DEM, and river proximity data can effectively map floods in Ngabang District, Indonesia. Analyses using DT, RF, and GBM models provided critical insights into flood prediction factors. The RF model, chosen as the best, successfully identified flood-prone regions. While the DT model experienced some overfitting, the RF and GBM models maintained high accuracy and reliability. The map showed that the flood significantly impacted 373.81 hectares of land, 10,706 people, 1,500 buildings, and 15 kilometers of roads. This analysis highlights the importance of proximity, elevation, SAR imagery, and iterative model improvements in flood prediction, offering valuable insights for flood management and mitigation efforts.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Al-Omari A. A., Abdalla K. M., Shatnawi N. N., Lagaros N. D., Shbeeb N. I. & Istrati D. (2024) Utilizing remote sensing and GIS techniques for flood hazard mapping and risk assessment, Civil Engineering Journal, 10 (5), 1423–1436.
Badan Informasi Geospasial (2018). DEMNAS: Seamless Digital Elevation Model (DEM) dan Batimetri Nasional. Available at: https://tanahair.indonesia.go.id/demnas/#/ (Accessed: 12 July 2024).
Bharathidason S. & Jothi Venkataeswaran C. (2014) Improving classification accuracy based on random forest model with uncorrelated high performing trees, International Journal of Computer Applications, 101 (13), 26–30.
Congalton R. G. & Green K. (2008) Assessing the Accuracy of Remotely Sensed Data: Principles and Practices, 2nd ed. Boca Raton: CRC Press.
Dheenadayalan K., Srinivasaraghavan G. & Muralidhara V. N. (2016) 'Pruning a random forest by learning a learning algorithm'. Springer, pp. 516–529. https://doi.org/10.1007/978-3-319-41920-6_41.
European Space Agency (ESA) (2024). Sentinel-1 SAR GRD [Data set]. Copernicus Open Access Hub, Paris.
Felix A. Y. & Sasipraba T. (2019) 'Flood detection using gradient boost machine learning approach', IEEE 2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), pp. 779–783.
Filipponi F. (2019) Sentinel-1 GRD pre-processing workflow, Proceedings, 18 (1), 11.
Hastie T., Tibshirani R. & Friedman J. H. (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. New York: Springer.
InaSAFE (2024). Available at: http://inasafe.org/ (Accessed: 16 January 2024).
Islam M. T. & Meng Q. (2022) An exploratory study of Sentinel-1 SAR for rapid urban flood mapping on Google Earth Engine, International Journal of Applied Earth Observation and Geoinformation, 113, 103002.
Khosravi K., Shahabi H., Pham B. T., Adamowski J., Shirzadi A., Pradhan B., Dou J., Ly H., Grof G., Ho H. L., Hong H., Chapi K. & Prakash I. (2019) A comparative assessment of flood susceptibility modeling using multi-criteria decision-making analysis and machine learning methods, Journal of Hydrology, 573, 311–323.
Kiatkarun K. & Phunchongharn P. (2020) 'Automatic hyper-parameter tuning for gradient boosting machine', 2020 1st International Conference on Big Data Analytics and Practices (IBDAP). Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ibdap50342.2020.9245609.
Kumar K. V., Kumari P., Chatterjee A. & Mohapatra D. P. (2021) Software Fault Prediction Using Random Forests. In: Mishra D., Buyya R., Mohapatra P. & Patnaik S. (eds) Intelligent and Cloud Computing. Smart Innovation, Systems and Technologies, vol. 194. Springer, Singapore. https://doi.org/10.1007/978-981-15-5971-6_10.
Li H. B., Ding H. W., Wang W. & Dong J. (2010) 'Trees weighting random forest method for classifying high-dimensional noisy data', Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/icebe.2010.99.
Liu Y., Zhou Y., Wen S. & Tang C. (2014) A strategy on selecting performance metrics for classifier evaluation, International Journal of Mobile Computing and Multimedia Communications (IJMCMC), 6 (4), 20–35.
Ludwig S. A., Picek S. & Jakobovic D. (2017) Classification of Cancer Data: Analyzing Gene Expression Data Using A Fuzzy Decision Tree Algorithm. Cham, Switzerland: Springer, pp. 327–347. https://doi.org/10.1007/978-3-319-65455-3_13.
Manyangadze T., Mavhura E., Mudavanhu C. & Pedzisai E. (2022) Flood inundation mapping in data-scarce areas: A case of Mbire District, Zimbabwe, Geo: Geography and Environment, 9, e105.
Mwita M., Elikana Sam A., Mbelwa J. & Agbinya J. (2023) The effect of hyperparameter optimization on the estimation of performance metrics in network traffic prediction using the gradient boosting machine model, Engineering, Technology & Applied Science Research, 13 (3), 10714–10720.
Nachappa T. G., Piralilou S. T., Gholamnia K., Ghorbanzadeh O., Rahmati O. & Blaschke T. (2020) Flood susceptibility mapping with machine learning, multi-criteria decision analysis and ensemble using Dempster Shafer theory, Journal of Hydrology, 590, 125275.
Olanrewaju C. C., Chitakira M., Olanrewaju O. A. & Louw E. (2019) Impacts of flood disasters in Nigeria: A critical evaluation of health implications and management, Jàmbá: Journal of Disaster Risk Studies, 11 (1), 1–9.
OpenStreetMap contributors (2022). 'Planet dump'. Available at: https://planet.osm.org: https://www.openstreetmap.org (Accessed: 18 January 2024).
Panahi M., Rahmati O., Kalantari Z., Darabi H., Rezaie F., Moghaddam D. D., Ferreira C. S. S., Foody G., Aliramaee R., Bateni S. M., Lee C. & Lee S. (2022) Large-scale dynamic flood monitoring in an arid-zone floodplain using SAR data and hybrid machine-learning models, Journal of Hydrology, 611, 128001.
Patrikaki O., Kougias I., Theodossiou N., Voudouris K., Kazakis N. & Patsialis T. (2018) Assessing flood hazard at river basin scale with an index-based approach: The case of Mouriki, Greece, Geosciences, 8 (2), 50.
Pemerintah Kecamatan Ngabang (2024). Geografis Kecamatan Ngabang. Ngabang, Indonesia: Pemerintah Kecamatan Ngabang. Available at: https://kecamatanngabang.landakkab.go.id/geografis/ (Accessed: 2 July 2024).
Prahara G. (2023) Kecamatan Ngabang Dalam Angka 2023. Ngabang: Badan Pusat Statistik Kabupaten Landak.
Riazi M., Khosravi K., Shahedi K., Ahmad S., Jun C., Bateni S. M. & Kazakis N. (2023) Enhancing flood susceptibility modeling using multi-temporal SAR images, CHIRPS data, and hybrid machine learning algorithms, Science of The Total Environment, 871, 162066.
Rodriguez-Galiano V. F., Luque-Espinar J. A., Chica-Olmo M. & Mendes M. P. (2018) Feature selection approaches for predictive modelling of groundwater nitrate pollution: An evaluation of filters, embedded and wrapper methods, Science of The Total Environment, 624, 661–672.
Schiavina M., Melchiorri M., Pesaresi M., Politis P., Freire S., Maffenini L., Florio P., Ehrlich D., Goch K., Tommasi P. & Kemper T. (2022) GHSL Data Package. Luxembourg: Publications Office of the European Union.
Shahabi H., Shirzadi A., Ghaderi K., Omidvar E., Al-Ansari N., Clague J. J., Geertsema M., Khosravi K., Amini A., Bahrami S., Rahmati O., Habibi K., Mohammadi A., Nguyen H., Melesse A. M., Bin Ahmad B. & Ahmad A. (2020) Flood detection and susceptibility mapping using Sentinel-1 remote sensing data and a machine learning approach: Hybrid intelligence of bagging ensemble based on k-nearest neighbor classifier, Remote Sensing, 12 (2), 266.
Song J., Gao Y., Yin P., Li Y., Li Y., Zhang J., Fu X. & Pi H. (2021) The random forest model has the best accuracy among the four pressure ulcer prediction models using machine learning algorithms, Risk Management and Healthcare Policy, 14, 1175–1187.
Soria-Ruiz J., Fernandez-Ordoñez Y. M., Ambrosio-Ambrosio J. P., Escalona-Maurice M. J., Medina-García G., Sotelo-Ruiz E. D. & Ramirez-Guzman M. E. (2022) Flooded extent and depth analysis using optical and SAR remote sensing with machine learning algorithms, Atmosphere, 13 (11), 1852.
Stiglic G., Kocbek S., Pernek I. & Kokol P. (2012) Comprehensive decision tree models in bioinformatics, PLoS One, 7 (3), e33812.
Sun Z., Wang G., Li P., Wang H., Zhang M. & Liang X. (2023) An improved random forest based on the classification accuracy and correlation measurement of decision trees, Expert Systems with Applications, 237, 121549.
Sun Y., Zhang Y. & Zhang J. (2024) Adaboost algorithm combined multiple random forest models (Adaboost-RF) is employed for fluid prediction using well logging data, Physics of Fluids, 36 (1), 016602.
Tomaszewski B. (2020) Geographic Information Systems (GIS) for Disaster Management, 2nd ed. New York, USA: Routledge.
Xia Y., Cheng K., Cheng Z., Rao Y. & Pu J. (2021) GBMVis: Visual Analytics for Interpreting Gradient Boosting Machine. In: Luo Y. (ed.) Cooperative Design, Visualization, and Engineering. CDVE 2021. Lecture Notes in Computer Science, vol. 12983. Springer, Cham. https://doi.org/10.1007/978-3-030-88207-5_7.
Zhang X. & Wang M. (2021) Weighted random forest algorithm based on Bayesian algorithm, Journal of Physics: Conference Series, 1924 (1), 012006.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).