Abstract
In the context of global climate change and continuing urban development, rainfall-inundation modeling provides critical support for urban waterlogging protection and early warning. The present study develops a data-driven model for hourly urban rainfall-inundation depth prediction. The model is based on a gated recurrent unit (GRU) neural network and uses the simulated annealing (SA) algorithm to optimize the GRU hyperparameters; it is referred to as the SA-GRU model. To verify the performance of the proposed model, backpropagation (BP), long short-term memory (LSTM), and bidirectional LSTM (BiLSTM) neural networks are set as benchmarks. Results show that the SA-GRU has high accuracy in short-term inundation prediction, with the Nash–Sutcliffe efficiency (NSE) decreasing from 0.999 to 0.596 for the 1-h-ahead to 8-h-ahead predictions. Further analysis reveals that the SA-GRU combines the significant optimization effect of SA, with an average 20% reduction in the root mean square error (RMSE) within the first eight prediction periods, and the efficient training of the GRU, which requires 23.7% less training time than LSTM and 44.2% less than BiLSTM. In conclusion, the SA-GRU excels in urban inundation prediction, demonstrating its value in flood management and decision-making.
HIGHLIGHTS
A GRU-based neural network was used for urban waterlogging modeling to make inundation depth predictions for 1–12 h.
A simulated annealing (SA) algorithm was used as the hyperparameter optimization method to boost the performance of GRU.
The SA-GRU model performs better than SA-BP, SA-LSTM, and SA-BiLSTM within its prediction period threshold (8 h).
INTRODUCTION
Over the past few decades, increasing urban flooding events have caused more severe disasters with the explosion of urbanization and the influence of climate change around the world (De Silva & Kawasaki 2020; Kao et al. 2021; Valeh et al. 2021; Wang J. et al. 2021; Wang P. et al. 2021). Urban flooding usually results in the loss of economic assets and livelihoods, disrupts the human daily routine and urban business activities, and seriously plagues urban development (Galloway et al. 2018; Wang et al. 2020). As the fastest-growing urban agglomerations in the world, many Asian cities experience the rapid transformation of the underlying surface, leading to a significant decrease in the capacity of water seepage and greater urban flooding risks in most areas (Yi & Yang 2014; Luo et al. 2018; Kao et al. 2021). For instance, China has experienced a rising incidence of urban flooding since 2010 (Govt 2020).
To cope with urban flooding and reduce losses, urban rainfall-flood models are used for inundation simulation and prediction. These models can be categorized as physical-based and data-driven models (Yuan et al. 2018; Tamiru & Dinka 2021). The biggest difference between the two types of models is that physical-based models are composed of equations based on physical phenomena, while data-driven models depend on data and statistical methods (Li et al. 2021). One modeling strategy for flood simulation and prediction in urban catchments involves coupling meteorological and urban hydrological models (Nguyen & Bae 2020), which facilitates a comprehensive simulation of the entire process from rainfall initiation to urban flood evolution. Numerous physically based models, such as the Hydrologic Engineering Center's River Analysis System (HEC-RAS), have been developed using this methodology and are widely applied in urban flood inundation simulation and risk analysis (Rangari et al. 2019; Feng et al. 2021; Madhuri et al. 2021).
Despite the continued dominance of traditional physically based models, they have encountered several critical challenges that have hindered their progress over time. One key limitation is the high computational demands associated with the complex processes and phenomena they represent. This can create difficulties for large-scale simulations or real-time flood forecasting, potentially requiring significant computational resources and restricting their applicability in certain situations (Kao et al. 2021). Moreover, physically based models typically necessitate extensive input data for calibration and validation, such as topographic, land use, and rainfall data. Acquiring high-quality, consistent data can be challenging, particularly in developing countries or remote areas, which can impact the accuracy and effectiveness of physical models in these contexts. Furthermore, physically based models involve numerous parameters that require calibration for the accurate representation of site-specific conditions (Herrera et al. 2022). The calibration process can be time-consuming and computationally demanding, and the models can be sensitive to parameter values, potentially introducing uncertainties in the results if calibration is not performed accurately.
With the growing range of data acquisition approaches and advances in information technology, data-driven models have developed rapidly and are increasingly applied to the field of hydrology. Since the 1990s, artificial neural networks (ANNs) have been used for hydrological modeling (Hsu et al. 1995; Dawson & Wilby 1998; Abrahart et al. 2004; Asadi et al. 2013; Kan et al. 2015; Chang et al. 2018; Kao et al. 2021). To deal with time series tasks better, a class of ANNs named recurrent neural networks was designed (RNNs; Rumelhart et al. 1986; Li et al. 2021). Depending on the unit structure, RNNs are divided into different types, including long short-term memory (LSTM) and its variant, the gated recurrent unit (GRU). To date, LSTM has achieved significant results in various areas of hydrological forecasting, such as precipitation forecasting (Kumar et al. 2019), runoff and water-level forecasting (Kratzert et al. 2018; Widiasari et al. 2018; Yuan et al. 2018; Kao et al. 2020; Xiang et al. 2020; Zuo et al. 2020; Li et al. 2021), flood disaster early warning (de la Fuente et al. 2019), and groundwater forecasting (Zhang et al. 2018). However, the numerous trainable parameters of LSTM can lead to prolonged model training times (Wang J. et al. 2022). As a result, a simplified version of the LSTM architecture, the GRU, was introduced (Cho et al. 2014). The GRU has only two gates (update and reset) and lacks a separate memory cell, resulting in fewer parameters, faster training times, and lower computational requirements compared to LSTMs (Fu et al. 2016; Chen et al. 2022). Consequently, the potential of the GRU has recently attracted the attention of hydrologists and other researchers. Gao et al. (2020) compared LSTM and the GRU in short-term runoff predictions and concluded that the GRU might be the preferred method. Recent research on groundwater forecasting and water quality prediction also shows good application prospects for the GRU (Gharehbaghi et al. 2022; Mei et al. 2022).
Apart from the unit structure, the performance of a neural network largely depends on its hyperparameters, which define the model architecture (Kuhn & Johnson 2013; Roy et al. 2014), such as the number of layers and the number of neurons in each layer. Therefore, tuning hyperparameters is essential for building an effective neural network model (Gressling 2020). Traditional empirical hyperparameter optimization involves a manual trial-and-error process, where researchers experiment with different configurations of hyperparameters based on their experience, intuition, or previous research findings. This approach can be time-consuming and may not guarantee optimal results, as it relies on human judgment and expertise. In contrast, algorithm-driven search processes systematically explore the hyperparameter space to find the best configuration of hyperparameters, independent of human intervention. These algorithm-driven methods have the advantage of being more objective and efficient, reducing human effort while enhancing model performance and reproducibility (Yang & Shami 2020). Simulated annealing (SA) is a modern heuristic that has been extensively applied to solve optimization problems (Avello et al. 2005). Wang Z. et al. (2022) applied SA with a particle swarm algorithm to obtain the optimum allocation schemes of water resources at precipitation frequencies. Wang J. et al. (2021) used the driving force-pressure-state-impact-response-management framework model and the SA-projection pursuit model to calculate the water resource carrying capacity to determine the main factors affecting the lake ecosystem. Hosseini et al. (2020) used SA to eliminate redundant variables when building temporal flash-flood forecasting models, in order to enhance the accuracy of assessing hazardous areas. However, compared with other optimization algorithms (Yuan et al. 2018; Yang & Shami 2020; Xu et al. 2022), few studies have used SA for hyperparameter optimization in neural networks.
The purpose of this study is to develop the GRU neural network boosted by an SA-based hyperparameter optimization algorithm (SA-GRU) for urban flood inundation depth prediction. Benefiting from the advantages of the accuracy and speed of GRU in time series forecasting, especially with the improvement of SA-based hyperparameter optimization, the proposed model has great potential to break through the technical bottleneck that traditional physical-based models have. Meanwhile, the feasibility verification done in this study can also be preliminary work to make modern algorithms a solution worth considering for real-time urban flood-inundation prediction. The dataset employed for neural networks comprises rainfall time series and inundation depth time series. Most of the rainfall data was sourced from the local administration office in Hefei City, China, while a small portion was based on designed rainfall. Inundation data was generated by the HEC-RAS model, which has been validated as a feasible method for simulating urban flood events in real projects. To verify the performance of the proposed model in various prediction periods, we tested the SA-GRU for the next 1–12 h of inundation depth prediction separately. Furthermore, several other neural networks were set as benchmarks. On the one hand, the original GRU with the empirical hyperparameter configuration was compared with the SA-GRU to explore the improvement of SA on GRU. On the other hand, to further validate the superiority of SA-GRU for urban flood-inundation depth prediction, SA was also adopted to optimize the hyperparameters of backpropagation (BP) neural network, LSTM, and bidirectional LSTM (BiLSTM), respectively, and their performance was compared in each prediction period.
The rest of this study is organized as follows. Section 2 overviews the study area and dataset. Section 3 designs the framework of experiments and introduces the relevant methods. Section 4 presents the results and the discussions on the performance of the proposed SA-GRU model for urban inundation depth prediction. Finally, conclusions are drawn in Section 5.
MATERIAL AND METHODS
Study area and dataset
To predict future inundation depths based on historical rainfall-inundation data without involving other features, the dataset in this study consists of rainfall time series and inundation depth time series. The rainfall time series were composed of observed and designed rainfall events, and due to the lack of observed inundation data, the inundation time series were generated by the HEC-RAS model based on the rainfall time series and relevant topographic information. Among these, the observed rainfall data were acquired from http://112.30.184.124:18001/ExternalAdmin, spanning from 2010 to 2021. As the relationship between peak rainfall intensity (PRI) and maximum flood depth exhibits an exponential trend (Sahoo & Sreeja 2016), rainfall events were selected with a PRI greater than 10 mm/h to yield more pronounced inundation processes. Ultimately, the rainfall time series included 57 observed rainfall events and 8 designed rainfall events. After applying several data preprocessing methods (detailed in Section 3.4.3), the resulting dataset contained 65 sets of rainfall-inundation time series at 15-min intervals. The entire rainfall time series includes a total of 10,765 time instants, with the longest-lasting rainfall event consisting of 376 time instants (94 h, nearly 4 days). Since it becomes challenging to forecast inundation with excessively long prediction periods, the prediction period threshold was tentatively set to 12 h, resulting in every inundation time series being 12 h longer than the rainfall time series. Consequently, the dataset contains a total of 13,885 inundation time instants (10,765 + 12 × 4 × 65). A comprehensive overview of the dataset is provided in Table 1.
Item | Rainfall | Inundation depth |
---|---|---|
Total events | 65 | 65 |
Data source | Observed (57) and designed (8) | Generated |
Unit | Millimeter per hour (mm/h) | Meter (m) |
Data amount | 10,765 time instants | 13,885 time instants |
Longest single event | 376 time instants (94 h) | 424 time instants (106 h) |
Average value | 3.4 mm/h | 0.42 m |
Minimum value | 0 mm/h | 0 m |
Maximum value | 62.5 mm/h | 2.14 m |
Standard deviation | 6.6 mm/h | 0.36 m |
Experimental framework
1. Evaluation of the SA-GRU across various prediction periods: To verify the feasibility of the SA-GRU in urban inundation depth prediction and determine the predictable duration, this experiment presented the forecasting results of inundation depth based on the SA-GRU in different prediction periods. Preliminary tests indicated that 1–12 h is an appropriate range of prediction periods.
2. Comparison between the original GRU and the SA-GRU: To quantify and analyze the improvement of SA on the predictive capacity of the GRU, an original GRU was developed as the benchmark with the empirical hyperparameter configuration given by Gao et al. (2020). The two models were compared in all 12 prediction periods. In addition, since the hyperparameter optimization process requires many iterations, each of which trains a GRU network with a different hyperparameter configuration, a second analysis was added to verify the effect of SA on the GRU by comparing the best and the average performance of these GRU networks.
3. Exploration of the effect of SA on different neural networks: In this experiment, the BP neural network and two other RNNs (LSTM and BiLSTM) were all developed with the SA-based hyperparameter optimization algorithm, building SA-BP, SA-LSTM, and SA-BiLSTM, respectively. Based on the type of neural networks, two aspects were investigated with these benchmarks. On the one hand, SA-BP was compared with the SA-GRU to verify whether the GRU retains the advantage of RNNs in urban inundation depth prediction, as it does in other time series prediction tasks. On the other hand, to explore the differences between the three RNNs, SA-LSTM and SA-BiLSTM were compared to the SA-GRU.
LSTM and GRU neural networks
LSTM neural network
Besides the standard LSTM, the BiLSTM is also widely used. It consists of two LSTMs, one taking the input in a forward direction and the other in a backward direction. For some specific tasks, the BiLSTM effectively increases the amount of available information and can capture more context than a unidirectional LSTM. Therefore, the BiLSTM is considered another benchmark for the GRU in this study.
GRU neural network
Structurally, compared with the LSTM, the GRU does not have an output gate and combines the input and forget gates into a single update gate. Therefore, the GRU is less complex and executes faster than the LSTM, whereas the LSTM is theoretically more accurate on longer sequences. In this study, the dataset used for the experiments is relatively small. Moreover, because the hyperparameters of the neural networks are optimized by SA, the large number of iterations lengthens computation time and thus calls for faster training. The GRU is therefore considered the more appropriate neural network for the specific task of this study, especially if it is applied to real-time urban inundation depth prediction in the future.
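The difference in complexity can be made concrete by counting trainable parameters. The following is a minimal sketch, assuming the textbook formulation in which each gate or candidate block has one input weight matrix, one recurrent weight matrix, and one bias vector; the function names and the choice of two input features (rainfall and inundation depth) are illustrative assumptions rather than details taken from this study. Some library implementations add a second bias vector per block, which changes the counts slightly.

```python
def recurrent_params(n_in: int, n_hidden: int, n_blocks: int) -> int:
    """Trainable parameters of one recurrent layer.

    Each block holds an input matrix (n_hidden x n_in), a recurrent matrix
    (n_hidden x n_hidden), and a bias vector (n_hidden).
    """
    per_block = n_hidden * (n_in + n_hidden) + n_hidden
    return n_blocks * per_block


def gru_params(n_in: int, n_hidden: int) -> int:
    # Update gate, reset gate, and candidate state: 3 blocks.
    return recurrent_params(n_in, n_hidden, 3)


def lstm_params(n_in: int, n_hidden: int) -> int:
    # Input, forget, and output gates plus candidate cell state: 4 blocks.
    return recurrent_params(n_in, n_hidden, 4)


def bilstm_params(n_in: int, n_hidden: int) -> int:
    # Two independent LSTMs, one per direction.
    return 2 * lstm_params(n_in, n_hidden)


if __name__ == "__main__":
    n_in, n_hidden = 2, 64  # assumed: 2 input features, 64 hidden units
    for name, fn in [("GRU", gru_params), ("LSTM", lstm_params), ("BiLSTM", bilstm_params)]:
        print(name, fn(n_in, n_hidden))
```

Under these assumptions, a GRU layer always has three quarters of the parameters of an LSTM layer of the same width, which is one reason it tends to train faster.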
SA algorithm
SA is a stochastic optimization algorithm inspired by the annealing process in metallurgy, where metals are heated to high temperatures and then slowly cooled to reduce defects and improve their structural properties (Kirkpatrick et al. 1983). The algorithm is particularly useful for finding approximate solutions to complex optimization problems, especially those with a large search space or numerous local optima (Jansen 2011). The key principles of SA can be summarized as follows.
1. Initialization: Define an initial solution, an objective function (energy function) to evaluate the quality of solutions, a temperature schedule, and an initial temperature value. The temperature represents a control parameter that determines the degree of randomness allowed in the search process.
2. Perturbation: Generate a new solution by applying a small, random modification to the current solution. This step is analogous to the movement of atoms in a heated metal.
3. Evaluation: Calculate the change in the objective function between the current solution and the new solution.
4. Acceptance: Accept or reject the new solution based on the change in the objective function and the current temperature. If the new solution is better, it is always accepted. If it is worse, it may still be accepted with a probability that decreases as the temperature decreases.
5. Cooling: Decrease the temperature according to the temperature schedule. This gradual decrease in temperature is analogous to the slow cooling process in metallurgy, which allows the system to settle into a low-energy state.
6. Iteration: Repeat the perturbation, evaluation, and acceptance steps until a stopping criterion is met, such as reaching a minimum temperature or a maximum number of iterations.
This acceptance and retention procedure allows the algorithm to explore the search space more effectively, as it helps to escape local optima and eventually converges to a global optimum or near-optimal solution. The gradual reduction in temperature ensures that the algorithm becomes more selective over time, accepting only better solutions as it approaches convergence.
Next, the control parameters of SA are initialized. In this study, the initial temperature was set to 100 and the termination condition to 100 iterations. Under the Metropolis criterion, SA accepts a worse solution with a certain probability, which allows it to escape local minima; this probability decreases as the temperature decreases, so the search gradually converges toward the optimal solution. Finally, the optimal hyperparameter configuration is obtained when the termination condition is reached.
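The SA procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used in this study: the objective `toy_objective` is a hypothetical stand-in for the validation error of a trained network (chosen only so the example runs without training anything), the geometric cooling factor `alpha` is an assumption, and all hidden layers are simplified to share one width. Only the initial temperature (100), the iteration budget (100), and the value sets (Table 2) come from the text.

```python
import math
import random

# Value sets from Table 2 (one shared width for all hidden layers: a simplification).
SEARCH_SPACE = {
    "n_layers": [1, 2, 3],
    "n_neurons": [4, 8, 16, 32, 64, 128, 256],
    "batch_size": [32, 64, 128, 256],
    "epochs": [30, 40, 50, 60, 70, 80, 90, 100],
    "seq_len": [16, 20, 24, 28, 32, 36, 40, 44, 48],
}


def toy_objective(cfg):
    """Hypothetical stand-in for a validation error; always non-negative."""
    return (abs(math.log2(cfg["n_neurons"]) - 6) * 0.02
            + (cfg["n_layers"] - 1) * 0.03
            + abs(cfg["seq_len"] - 20) * 0.002
            + (100 - cfg["epochs"]) * 0.0005
            + math.log2(cfg["batch_size"] / 32) * 0.01)


def perturb(cfg, rng):
    """Move one randomly chosen hyperparameter to an adjacent value in its set.

    At the edge of a value set the move may leave the configuration unchanged.
    """
    new = dict(cfg)
    key = rng.choice(list(SEARCH_SPACE))
    values = SEARCH_SPACE[key]
    i = values.index(cfg[key])
    j = min(max(i + rng.choice([-1, 1]), 0), len(values) - 1)
    new[key] = values[j]
    return new


def simulated_annealing(objective, t0=100.0, n_iter=100, alpha=0.85, seed=0):
    rng = random.Random(seed)
    # Initialization: random starting configuration and initial temperature.
    current = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
    current_cost = objective(current)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(n_iter):
        candidate = perturb(current, rng)          # perturbation
        candidate_cost = objective(candidate)
        delta = candidate_cost - current_cost      # evaluation
        # Metropolis acceptance: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= alpha                                 # geometric cooling (assumed schedule)
    return best, best_cost


if __name__ == "__main__":
    best, cost = simulated_annealing(toy_objective)
    print(best, round(cost, 4))
```

In a real application the objective would train a network with the candidate configuration and return its validation error, which is why each SA iteration is expensive. Note also that the temperature scale must be commensurate with typical differences in the objective; otherwise the search behaves like a random walk for most of the budget.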
A hybrid SA-GRU model for urban inundation depth prediction
Building the hybrid SA-GRU model from the GRU and the SA-based hyperparameter optimization requires two main elements to be defined: the key hyperparameters (including their value sets) and the evaluation criteria. In addition, several data preprocessing methods are applied before neural network calibration.
Key hyperparameters
According to the features of neural networks and the purpose of this study, three sets of key hyperparameters and their value sets are defined, as shown in Table 2. The first set is related to the architecture of GRU neural networks, including the number of hidden layers and the number of neurons in each layer. The second set comprises the mini-batch size and the number of epochs, representing the number of samples processed before the model is updated and the number of complete passes through the entire calibration set, respectively (Soon et al. 2018). The mini-batch size depends on the resource requirements of the training process and the number of iterations. The number of epochs affects training results and time. If there are too few epochs, the model is prone to underfitting. Conversely, too many epochs lead to unnecessary extra execution time and even overfitting. To reduce optimization time, the maximum number of epochs is set to 100, which has been tested to be sufficient for convergence. Moreover, a suitable number of neurons in the input layer is also required, since an input sequence that is too short or too long results in poor model performance (Gao et al. 2020). Therefore, the length of the input sequence, i.e., the number of neurons in the input layer, is considered the third set of hyperparameters. It ranges from 4 to 12 h, which corresponds to 16 to 48 time steps at 15-min intervals.
Hyperparameters | Value sets |
---|---|
The number of hidden layers | {1, 2, 3} |
The number of neurons in each layer | {4, 8, 16, 32, 64, 128, 256} |
Mini-batch size | {32, 64, 128, 256} |
The number of epochs | {30, 40, 50, 60, 70, 80, 90, 100} |
The length of the input sequence | {16, 20, 24, 28, 32, 36, 40, 44, 48} |
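To illustrate what the input sequence length means in practice, the sketch below cuts one rainfall-inundation event into supervised samples. It is a sketch under stated assumptions: the function name `make_windows`, the use of both rainfall and depth as input features, and the exact alignment of the target are hypothetical, since the windowing procedure is not spelled out in the text. It assumes 15-min data (4 steps per hour) and an inundation series that extends beyond the rainfall series, as described for the dataset.

```python
STEPS_PER_HOUR = 4  # 15-min intervals


def make_windows(rain, depth, seq_len, horizon_h):
    """Build (input window, target depth) pairs for one event.

    rain      : rainfall intensities, one value per 15-min step
    depth     : inundation depths, at least as long as `rain`
    seq_len   : input sequence length in time steps (16 to 48 in Table 2)
    horizon_h : prediction period in hours (1 to 12)
    """
    horizon = horizon_h * STEPS_PER_HOUR
    samples = []
    for end in range(seq_len, len(rain) + 1):
        target_idx = end - 1 + horizon  # depth `horizon` steps after the window ends
        if target_idx >= len(depth):
            break
        window = [(rain[t], depth[t]) for t in range(end - seq_len, end)]
        samples.append((window, depth[target_idx]))
    return samples


if __name__ == "__main__":
    # Synthetic event: 30 rainfall steps, depth series 12 h (48 steps) longer.
    rain = [1.0] * 30
    depth = [0.01 * t for t in range(30 + 12 * STEPS_PER_HOUR)]
    samples = make_windows(rain, depth, seq_len=20, horizon_h=1)
    print(len(samples), len(samples[0][0]))
```

A longer input sequence gives the network more history per sample but yields fewer samples per event, which is one reason the sequence length is worth tuning rather than fixing.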
Evaluation criteria
The NSE is a dimensionless efficiency index first proposed by Nash and Sutcliffe in 1970 to evaluate hydrological simulations. The value of NSE ranges from −∞ to 1; the closer it is to 1, the better the model. Research shows that the NSE has a higher sensitivity to peak values (Waseem et al. 2008), which mainly helps to evaluate the accuracy of models for the maximum inundation depth prediction.
The root mean square error (RMSE) and mean absolute error (MAE) are dimensional criteria (in meters in this study) that measure the distance between the target and the predicted values. Both errors are non-negative, and smaller values indicate better performance. It is suggested that MAE is less sensitive to outliers and that it is an unambiguous measure of average error magnitude compared with RMSE (Willmott & Matsuura 2005).
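For reference, the three criteria can be computed as follows. This is a minimal sketch using the standard textbook definitions of NSE, RMSE, and MAE; the function and variable names are illustrative.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of observations."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def rmse(obs, sim):
    """Root mean square error, in the unit of the data."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error, in the unit of the data."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


if __name__ == "__main__":
    obs = [0.0, 0.5, 1.0, 0.5]          # target depths (m)
    flat = [0.5, 0.5, 0.5, 0.5]         # predicting the mean everywhere
    print(nse(obs, obs), nse(obs, flat), rmse(obs, flat), mae(obs, flat))
```

A model that always predicts the mean of the target values scores an NSE of 0, so negative NSE values indicate predictions worse than that trivial baseline.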
In most hydrological model evaluations, model performance can be categorized into four levels based on NSE values: (0.75, 1], (0.65, 0.75], (0.5, 0.65], and (−∞, 0.5]. To assess model performance more precisely, a six-level performance rating system, which is applicable to this study and based on NSE, RMSE, and MAE, is employed, as shown in Table 3.
NSE range | RMSE range (m) | MAE range (m) | Performance rating |
---|---|---|---|
(0.9, 1] | (0, 0.10] | (0, 0.08] | Excellent |
(0.75, 0.9] | (0.10, 0.20] | (0.08, 0.15] | Very good |
(0.65, 0.75] | (0.20, 0.25] | (0.15, 0.20] | Good |
(0.6, 0.65] | (0.25, 0.30] | (0.20, 0.25] | Satisfactory |
(0.5, 0.6] | (0.30, 0.35] | (0.25, 0.30] | Acceptable |
(−∞, 0.5] | (0.35, +∞) | (0.30, +∞) | Unacceptable |
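The rating system can be expressed as a small lookup. How the three criteria are combined when they fall into different levels is not stated explicitly; the sketch below assumes that the overall rating is the worst of the three individual levels, which is consistent with the ratings reported for the SA-GRU in the results but remains an assumption. All names are illustrative.

```python
RATINGS = ["Excellent", "Very good", "Good", "Satisfactory", "Acceptable", "Unacceptable"]

# Thresholds from Table 3; intervals are open on the left and closed on the right.
NSE_LOWER = [0.9, 0.75, 0.65, 0.6, 0.5]        # level i applies when NSE > NSE_LOWER[i]
RMSE_UPPER = [0.10, 0.20, 0.25, 0.30, 0.35]    # level i applies when RMSE <= RMSE_UPPER[i]
MAE_UPPER = [0.08, 0.15, 0.20, 0.25, 0.30]     # level i applies when MAE <= MAE_UPPER[i]


def level_from_nse(value):
    for i, lower in enumerate(NSE_LOWER):
        if value > lower:
            return i
    return len(NSE_LOWER)


def level_from_error(value, uppers):
    for i, upper in enumerate(uppers):
        if value <= upper:
            return i
    return len(uppers)


def rate(nse_value, rmse_value, mae_value):
    """Overall rating, assumed to be the worst of the three individual levels."""
    worst = max(level_from_nse(nse_value),
                level_from_error(rmse_value, RMSE_UPPER),
                level_from_error(mae_value, MAE_UPPER))
    return RATINGS[worst]


if __name__ == "__main__":
    print(rate(0.943, 0.10, 0.07))
    print(rate(0.596, 0.26, 0.17))
```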
Model development
As indicated in Table 4, the dataset was split into calibration, validation, and test datasets. Utilizing the model development methods, experimental framework, and evaluation criteria described earlier, the SA-GRU model and benchmark models were developed and evaluated.
Datasets | Data ratio (%) |
---|---|
Calibration set | 72 |
Validation set | 18 |
Test set | 10 |
RESULTS AND DISCUSSION
All experiments were conducted on a machine with an Intel(R) Core(TM) i5-1135G7 CPU @ 2.40 GHz, 16.00 GB of RAM, and an NVIDIA GeForce MX450 GPU. The programming language for data processing and model development is Python 3.9.7, with related packages including NumPy 1.19.5, Pandas 1.4.1, Matplotlib 3.5.1, TensorFlow 2.6.0, and Hyperopt 0.2.7.
Results of the SA-GRU in different prediction periods
Table 5 presents the optimal hyperparameter configuration in each prediction period. The most frequently selected input sequence length is 20 time steps (5 h), chosen in 5 of the 12 prediction periods, and no period favors a length above 32 time steps (8 h). This suggests that a longer input sequence does not necessarily improve the performance of the GRU. In terms of the number of hidden layers, the networks in 10 of the 12 prediction periods are optimized to a single hidden layer, in most cases with 64 neurons. A single-hidden-layer network is therefore likely to fit better than multilayer networks for rainfall-inundation prediction. As for the number of epochs, 80–100 appears to be a suitable range. The optimal mini-batch size is 32 in every prediction period.
Prediction period (h) | The length of the input sequence | The number of hidden layers | The number of neurons in each layer | The number of epochs | Mini-batch size |
---|---|---|---|---|---|
1 | 20 (5 h) | 1 | [64] | 90 | 32 |
2 | 20 (5 h) | 2 | [64, 32] | 90 | 32 |
3 | 16 (4 h) | 2 | [64, 32] | 90 | 32 |
4 | 24 (6 h) | 1 | [64] | 100 | 32 |
5 | 16 (4 h) | 1 | [64] | 90 | 32 |
6 | 24 (6 h) | 1 | [32] | 80 | 32 |
7 | 20 (5 h) | 1 | [64] | 90 | 32 |
8 | 16 (4 h) | 1 | [64] | 80 | 32 |
9 | 32 (8 h) | 1 | [64] | 100 | 32 |
10 | 20 (5 h) | 1 | [32] | 90 | 32 |
11 | 20 (5 h) | 1 | [64] | 90 | 32 |
12 | 24 (6 h) | 1 | [64] | 80 | 32 |
Table 6 demonstrates the performance of the SA-GRU model in predicting urban inundation depth across 1–12-h prediction periods. The model is assessed using both validation and test sets based on NSE, RMSE, MAE, and performance rating.
Prediction period (h) | Dataset | NSE | RMSE (m) | MAE (m) | Performance rating |
---|---|---|---|---|---|
1 | Validation set | 0.999 | 0.03 | 0.01 | Excellent |
1 | Test set | 0.999 | 0.03 | 0.01 | Excellent |
2 | Validation set | 0.995 | 0.03 | 0.02 | Excellent |
2 | Test set | 0.995 | 0.03 | 0.02 | Excellent |
3 | Validation set | 0.978 | 0.06 | 0.04 | Excellent |
3 | Test set | 0.976 | 0.06 | 0.04 | Excellent |
4 | Validation set | 0.943 | 0.10 | 0.07 | Excellent |
4 | Test set | 0.940 | 0.10 | 0.07 | Excellent |
5 | Validation set | 0.853 | 0.16 | 0.10 | Very good |
5 | Test set | 0.845 | 0.16 | 0.10 | Very good |
6 | Validation set | 0.768 | 0.20 | 0.13 | Very good |
6 | Test set | 0.751 | 0.20 | 0.13 | Very good |
7 | Validation set | 0.657 | 0.24 | 0.17 | Good |
7 | Test set | 0.643 | 0.24 | 0.17 | Satisfactory |
8 | Validation set | 0.596 | 0.26 | 0.17 | Acceptable |
8 | Test set | 0.582 | 0.26 | 0.17 | Acceptable |
9 | Validation set | 0.493 | 0.29 | 0.20 | Unacceptable |
9 | Test set | 0.489 | 0.29 | 0.19 | Unacceptable |
10 | Validation set | 0.418 | 0.30 | 0.21 | Unacceptable |
10 | Test set | 0.399 | 0.31 | 0.21 | Unacceptable |
11 | Validation set | 0.371 | 0.30 | 0.21 | Unacceptable |
11 | Test set | 0.335 | 0.31 | 0.22 | Unacceptable |
12 | Validation set | 0.212 | 0.33 | 0.23 | Unacceptable |
12 | Test set | 0.197 | 0.34 | 0.23 | Unacceptable |
In the initial prediction periods of 1–4 h, the SA-GRU model performs exceptionally well for both validation and test sets. At the 1-h prediction period, the NSE values are 0.999 for both sets, with low RMSE (0.03 m) and MAE (0.01 m) values, and the NSE remains at or above 0.940 up to the 4-h prediction period. As the prediction period extends to 5 and 6 h, the model maintains a very good performance level on both sets, with a slight decrease in the test set performance. At the 6-h prediction period, the NSE value for the validation set is 0.768, while it is 0.751 for the test set.
For the 7-h prediction period, the performance of the SA-GRU is rated good on the validation set and satisfactory on the test set; for the 8-h prediction period, it is rated acceptable on both sets. The NSE values experience a significant reduction, and the RMSE and MAE values increase. At the 8-h prediction period, the NSE value for the validation set is 0.596, while it is 0.582 for the test set. When the prediction period surpasses 8 h (9–12 h), the performance of the SA-GRU is considered unacceptable on both sets. For instance, at the 9-h and 12-h prediction periods, the NSE values for the test set are 0.489 and 0.197, respectively.
In conclusion, the SA-GRU model excels in providing highly accurate predictions for urban inundation depth in the short term, showcasing its strength in these crucial timeframes. Despite the decreasing precision with extended prediction periods, the SA-GRU model continues to offer effective solutions for short-term rainfall-inundation predictions.
Comparison between the original GRU and the SA-GRU
The experiment aims to evaluate the improvement of SA on the GRU. The original GRU is set as the benchmark, with the empirical hyperparameter configuration given by Gao et al. (2020).
Prediction period (h) | NSE | RMSE (m) | MAE (m) |
---|---|---|---|
1 | 0.999 | 0.03 | 0.01 |
2 | 0.991 | 0.05 | 0.03 |
3 | 0.952 | 0.09 | 0.06 |
4 | 0.860 | 0.16 | 0.11 |
5 | 0.795 | 0.19 | 0.12 |
6 | 0.685 | 0.23 | 0.15 |
7 | 0.608 | 0.26 | 0.18 |
8 | 0.495 | 0.29 | 0.19 |
9 | 0.277 | 0.34 | 0.24 |
10 | 0.212 | 0.35 | 0.26 |
11 | 0.096 | 0.37 | 0.26 |
12 | −0.089 | 0.40 | 0.28 |
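The improvement brought by SA can be quantified directly from the tabulated values. The short calculation below is a sketch that assumes the table above reports the original GRU on the same data as the SA-GRU results in Table 6; it uses the rounded RMSE values for the first eight prediction periods, so the result is approximate.

```python
# RMSE (m) for prediction periods 1-8 h, copied from the two results tables.
rmse_sa_gru = [0.03, 0.03, 0.06, 0.10, 0.16, 0.20, 0.24, 0.26]
rmse_gru = [0.03, 0.05, 0.09, 0.16, 0.19, 0.23, 0.26, 0.29]


def relative_reductions(baseline, improved):
    """Per-period relative RMSE reduction of `improved` with respect to `baseline`."""
    return [(b - i) / b for b, i in zip(baseline, improved)]


reductions = relative_reductions(rmse_gru, rmse_sa_gru)
mean_reduction = sum(reductions) / len(reductions)
print(f"{mean_reduction:.1%}")  # about 19.7% with these rounded table values
```

With these values the average reduction is close to the 20% figure quoted for the first eight prediction periods, and the largest gains occur at the 2–4-h prediction periods.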
In summary, the comparison between the original GRU and the SA-GRU, together with the iteration progress of SA itself, shows that SA-based hyperparameter optimization significantly improves the performance of GRU for urban inundation depth predictions. It not only increases precision but also extends the prediction period threshold by 1 h.
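As a rough illustration of how SA can search a hyperparameter space, the sketch below implements a generic simulated annealing loop with Metropolis acceptance and geometric cooling. It is not the configuration used in this study: the search space `SPACE`, the cooling parameters, and `toy_objective` are all hypothetical. In a real setting the objective would train a GRU with the candidate configuration and return a validation error such as RMSE.

```python
import math
import random

# Hypothetical discrete search space (NOT the one used in the study).
SPACE = {
    "units": [16, 32, 64, 128, 256],
    "layers": [1, 2, 3],
    "log10_lr": [-4, -3, -2, -1],
}

def neighbor(config, rng):
    """Move one randomly chosen hyperparameter to an adjacent value."""
    key = rng.choice(sorted(SPACE))
    values = SPACE[key]
    i = values.index(config[key])
    j = min(max(i + rng.choice([-1, 1]), 0), len(values) - 1)
    return {**config, key: values[j]}

def toy_objective(config):
    """Placeholder for 'train the model and return validation RMSE'."""
    return (0.03
            + 0.01 * (math.log2(config["units"]) - 6) ** 2
            + 0.02 * (config["layers"] - 2) ** 2
            + 0.03 * (config["log10_lr"] + 3) ** 2)

def simulated_annealing(objective, initial, t_start=1.0, t_end=1e-3,
                        cooling=0.95, seed=0):
    rng = random.Random(seed)
    current, current_cost = initial, objective(initial)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        cost = objective(candidate)
        delta = cost - current_cost
        # Metropolis criterion: always accept improvements; accept a worse
        # candidate with probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost

start = {"units": 256, "layers": 3, "log10_lr": -1}
best, best_cost = simulated_annealing(toy_objective, start)
print(best, round(best_cost, 4))
```

Accepting occasional worse configurations at high temperature lets the search escape local minima, which is the usual motivation for SA over a purely greedy search.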
Exploration of the effect of SA on different neural networks
For shorter prediction periods, such as the 1–3-h periods, all three RNN models perform markedly better than the BP neural network, with nearly identical NSE, RMSE, and MAE values among themselves. The substantial difference in performance during these early stages of prediction showcases the inherent advantage of RNNs for time series prediction. For the 4–6-h prediction periods, the SA-GRU model exhibits a slight edge over the other RNNs, as evidenced by its marginally higher NSE values and lower RMSE and MAE values. However, the distinctions among the RNNs remain subtle, suggesting that each of them is capable of delivering commendable performance in urban inundation depth prediction. For longer prediction periods, the prediction accuracy of all RNNs declines in a similar way.
Results show that the SA-GRU has the shortest average training time, at 139.2 seconds per trial, which is significantly shorter than those of SA-LSTM and SA-BiLSTM, at 182.5 and 249.5 seconds per trial, respectively. Correspondingly, the SA-GRU has the shortest average optimization time, at 3.9 h per optimization run, whereas SA-LSTM and SA-BiLSTM require 5.1 and 6.9 h, respectively. This difference in training speed means that the SA-GRU model can be trained and optimized more quickly (Siami-Namini et al. 2019; Ozdemir et al. 2022), allowing for more efficient use of resources and time.
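The relative speed-ups reported in the abstract (23.7% faster than LSTM and 44.2% faster than BiLSTM) follow directly from the per-trial training times above, as the short calculation below shows. The implied number of trials per optimization is only a rough estimate under the assumption that optimization time is dominated by model training; it is not a figure stated in the text.

```python
# Average training time per trial (s) and optimization time (h), from the text.
train_s = {"SA-GRU": 139.2, "SA-LSTM": 182.5, "SA-BiLSTM": 249.5}
opt_h = {"SA-GRU": 3.9, "SA-LSTM": 5.1, "SA-BiLSTM": 6.9}

def reduction_pct(fast, slow):
    """Relative reduction in time of `fast` with respect to `slow`, in percent."""
    return 100.0 * (slow - fast) / slow

for name in ("SA-LSTM", "SA-BiLSTM"):
    pct = reduction_pct(train_s["SA-GRU"], train_s[name])
    print(f"SA-GRU vs {name}: {pct:.1f}% less training time per trial")
# SA-GRU vs SA-LSTM: 23.7% less training time per trial
# SA-GRU vs SA-BiLSTM: 44.2% less training time per trial

# ASSUMPTION: optimization time is roughly (number of trials) x (time per trial).
implied_trials = {m: opt_h[m] * 3600.0 / train_s[m] for m in train_s}
for name, trials in implied_trials.items():
    print(f"{name}: about {trials:.0f} trials per optimization")
```

All three models come out at roughly 100 trials per optimization, which suggests that the differences in optimization time stem from per-trial training cost rather than from the number of SA iterations.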
In summary, in addition to its slight accuracy advantage over SA-LSTM and SA-BiLSTM, the SA-GRU model is also more time- and resource-efficient. Therefore, the SA-GRU could be the preferred choice for urban flood inundation prediction when model performance must be balanced against computational costs and time constraints.
CONCLUSIONS
In this study, we evaluate the efficacy of hyperparameter optimization based on the SA algorithm for predicting urban inundation depth using the GRU. The performance of the proposed SA-GRU model is compared with other benchmark models, including the original GRU, SA-LSTM, SA-BiLSTM, and SA-BP across various prediction periods. The main conclusions of this study can be summarized as follows.
1. Comparing the results across prediction periods of 1–12 h shows that the SA-GRU performs very well for short-term predictions (1–6 h). Since the prediction accuracy of SA-GRU decreases as the prediction period extends, the longest usable prediction period is limited to 8 h: for the 1-h-ahead prediction, the NSE, RMSE, and MAE of the proposed model are 0.999, 0.03, and 0.01, respectively; for the 8-h-ahead limit prediction, the NSE, RMSE, and MAE are 0.582, 0.26, and 0.17, respectively.
2. Comparing the SA-GRU with the original GRU shows that SA optimization considerably enhances the performance of neural networks. The SA-GRU model consistently outperforms the original GRU model across all prediction periods, which highlights the importance of hyperparameter optimization. This finding contributes to the existing body of research by demonstrating the value of SA optimization in refining neural network performance for urban flood prediction tasks.
3. RNN models, including the SA-GRU, outperform the BP neural network in predicting urban inundation depth, which underscores the inherent advantages of RNNs for time series prediction tasks. Among the three SA-RNN models (SA-GRU, SA-LSTM, and SA-BiLSTM), the SA-GRU is subtly superior in prediction accuracy and requires less time for optimization (on average, 3.9 h), making it more time- and resource-efficient.
4. However, two main drawbacks deserve attention. On the one hand, as the prediction period extends, uncertainty about the future decreases the prediction accuracy of all the models. Therefore, future work could explore alternative optimization techniques, additional features, and data sources to reduce uncertainty and address the challenges in long-term urban inundation depth prediction. On the other hand, the lack of physical explanation is a common problem with black-box models, which affects the adjustability and reliability of the models. Therefore, it is critical to explore and develop physics-informed data-driven models.
In summary, the SA-GRU shows its potential in urban inundation depth prediction with high accuracy for short- to medium-term predictions. However, the limitations for longer-term predictions indicate the need for further research and improvements.
ACKNOWLEDGEMENTS
This work was supported by the National Key Research and Development Program of China (Grant No. 2019YFC1510204), the National Natural Science Foundation of China (Grant Nos. 42175177, U2240216, 91847301, and 92047203), and the Special Basic Research Key Fund for Central Public Scientific Research Institutes (Grant No. Y521002). The authors would like to thank all experts for their contributions to urban flood protection and control.
AUTHOR CONTRIBUTIONS
Y.Y. performed methodology, formal analysis, software, programming, drafting of the article, and critical revision. W.Z. conducted methodology and formal analysis, wrote the review, edited the manuscript, and performed critical revision. Y.L. carried out methodology, formal analysis, software, data collection, and data curation. Z.L. did formal analysis, data analysis, and data curation.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.