Water security and urban flooding have become major sustainability issues. This paper presents a novel method that introduces rates of change as a state-of-the-art approach in artificial intelligence model development for the sustainability agenda. Multi-layer perceptron (MLP) and deep learning long short-term memory (LSTM) models were considered for flood forecasting. Historical rainfall data from 2008 to 2021 at 11 telemetry stations were obtained to predict flow at the confluence between the Klang River and Ampang River. The initial MLP results fell below normal expectations, with R = 0.4465, MAE = 3.7135, NSE = 0.1994 and RMSE = 8.8556. Meanwhile, the LSTM model generated a 45% improvement in its R-value, up to 0.9055. Detailed investigation found that redundant input data mapping to multiple target values had distorted model performance. To solve this issue, Qt was introduced into the input parameters, while Qt+0.5 was set as the target value. A significant improvement in the results was detected, with R = 0.9359, MAE = 0.7722, NSE = 0.8756 and RMSE = 3.4911. When the rates of change were employed, an impressive improvement was seen in the plot of actual vs. forecasted flow. Findings showed that the rates of change could reduce forecast errors and serve as an additional layer of early flood detection.

  • A highly accurate flood forecasting system based on deep learning has been developed.

  • Multi-lead ahead forecasting for streamflow has been investigated.

  • The novel model architecture has been successfully applied for controlling streamflow in the SMART tunnel.

  • The proposed new model architecture could be applied to forecast river streamflow in different hydrological areas.

In recent years, climate change has seriously affected the ecosystem (Han et al. 2022). Climate change in the form of unpredictable precipitation, temperature and evaporation patterns not only affects the surrounding ecosystem but can also disrupt the hydrological behavior of streamflow. Streamflow is a major element of the hydrological cycle, and its attributes are highly associated with climate and land-use conditions (Masrur Ahmed et al. 2021). Alterations made by human activities can accelerate instabilities in temperature and rainfall patterns, resulting in adverse conditions such as sea level rise and extreme weather (Adikari et al. 2021). Similarly, one of the critical issues when land cover is altered is the loss or reduction of areas available for infiltration. The change can result in more runoff into the river. Failure to mitigate or adapt to this change will cause overflow at the riverbanks and major floods.

The global threat of urban flooding to water security and the economy is too enormous to be ignored. Major flooding can cause severe damage to the infrastructure. The mudflow often ruins belongings. Thus, a resilient approach is necessary to mitigate this peril. A good and reliable multi-step ahead forecasting model can be a potential risk management solution for better flood management and disaster preparedness to allow sufficient evacuation and asset protection (Kao et al. 2021; Nanditha & Mishra 2021).

Process-driven model vs. data-driven model

There are two approaches to developing a multi-step ahead forecasting model: the process-driven model and the data-driven model (Huang et al. 2021).

The process-driven model, also known as a physically based model, is derived from the physics of mass conservation and momentum preservation. Physical data, such as precipitation, river alignment and hydraulic structures (culverts, weirs and dams), or other geological past evidence are required from physical sites through observation and numerous ground surveys, interviews, satellite and aerial photography. Various assumptions can bring uncertainties to these data (Teng et al. 2017). Specific parameters cannot be obtained directly and must be substituted with default parameters. The watershed in the model will be represented as lumped, semi-distributed or fully distributed (Cai & Yu 2022). Concurrently, it will be defined along a range of complexity from conceptual to physically specified (Bourdin et al. 2012). Forecasting will be developed based on the principle of flood formation, considering physical features, hydrology, hydrodynamics and other theoretical aspects (Chen et al. 2021).

The data-driven model has gained considerable interest in recent years and is regarded as an alternative to the process-based model. It does not require the simulation of physical processes; instead, it employs historical values or multivariate models to forecast the future (Zhao et al. 2021). The model can retrieve information from linear, nonlinear and hydrological systems to map a relationship between observed parameters and target variables (Alizadeh et al. 2021). The artificial neural network (ANN) is the favorite black box model. However, clarity, or interpretability, has become a concern for machine learning (Kaadoud et al. 2022).

The deep learning model refers to the data-driven model with multiple hidden layers, allowing more feature extraction with higher efficiency. The learning algorithm, backed by enormous computing power, delivers better extraction results with more significant amounts of complex data. Thus, more complex mapping has become possible in hydrological analysis. Long short-term memory (LSTM) is a popular model in hydrology and water resources. It can solve issues related to long training sequences that often end up with vanishing gradients. Compared with other neural networks, LSTM performs much better with time-varying time-series data. The model can deliver good results for a single-point short-term forecast; otherwise, the performance will gradually deteriorate for longer-term forecasting (Lv et al. 2020).

Related work

The LSTM model is one of the deep learning models that have been extensively proposed in research as a tool for knowledge extraction (Kaadoud et al. 2022), medical diagnosis (Balaji et al. 2021), electricity prediction (Li & Becker 2021), soil moisture projection (Li et al. 2022), runoff forecasting (Chen et al. 2020) and precipitation nowcasting (Lin et al. 2021).

Alizadeh et al. (2021) successfully concluded a post-processing streamflow simulation using LSTM. A combination of the first order difference (DIFF)-FFNN-LSTM model was considered to solve the limitation of machine learning on nonstationary data points (Lin et al. 2021). Kilinc & Haznedar (2022) developed the LSTM-genetic algorithm (GA) model for river streamflow modeling using daily flow data. Hayder et al. (2022) employed nonlinear autoregressive with exogenous inputs (NARX) & LSTM models using historical streamflow, weighted rainfall and average evaporation as the input parameters to simulate river streamflow. Watershed et al. (2022) utilized rainfall data in multi-layer perceptron (MLP) to predict river stage. Feng et al. (2022) developed the parallel cooperation search algorithm-extreme learning machine (PCSA-ELM) model from daily runoff to predict streamflow. Guo et al. (2021) used rainfall, river stage and tidal level data to initiate light gradient boosting machine learning regression (LGBMR) for river stage forecasts. Jougla & Leconte (2022) experimented with short-term streamflow forecasts using ANN models with input combinations of observed streamflow and surface and deep soil moisture. Mostaghimzadeh et al. (2022) investigated the MLP-GA model for inflow forecasts for a reservoir. Hai Nguyen et al. (2022) addressed urban flooding using the GA-Bayesian additive regression tree (BART) model for river flow forecasts. Liu et al. (2022) proposed a directed graph neural network for multi-step streamflow forecasts using streamflow and precipitation inputs. Danandeh Mehr et al. (2022) introduced a new genetic programming-seasonal autoregressive integrated moving average (GP-SARIMA) river flow model in streamflow simulation. Kilinc (2022) conducted a streamflow forecast based on the particle swarm optimization (PSO)-LSTM model in the Orontes Basin.
Upon detailed observation of the research performed in recent years, it could be concluded that there has been no attempt to use rates of change in AI model development. Past and current research has mainly concentrated on using raw data to forecast streamflow with multiple variations of models.

In this context, the study is motivated to introduce a suitable deep learning model to simulate and forecast streamflow in an early warning system. The outcome will lead to higher accuracy in forecast flow and minimize the potential of progressive accumulation of errors in time-series analysis.

The contributions of this paper can be simplified as follows:

  • (1)

    The MLP model, a feed-forward neural network (FFNN) identified as the ANN model in section 2.1.1 of the methodology, will be applied in the early analysis and set as the benchmark model of this study.

  • (2)

    This study will introduce a novel approach using rates of change as a state-of-the-art method in a machine learning model to minimize input errors and provide early indicators in flood warning systems.

  • (3)

    The performance indices of various models will be evaluated to establish LSTM as the superior deep learning model that generates highly accurate point results.

  • (4)

    Higher accuracy will be achieved for the forecast of peak flow values.

The remainder of the article is organized as follows: Section 2 illustrates the model development, the performance indices selected for evaluating the models, and the study area and data description. Section 3 concentrates on the results. Section 4 further discusses the findings. Finally, Section 5 presents the conclusion and recommendations for future study.

An overview of data management for artificial intelligence

Under this study, the data management considered, as shown in Figure 1, comprised the collection of raw historical data during the data acquisition stage, data cleaning and preprocessing at the initial stage, ANN and LSTM at the data analysis & mining stage and, lastly, improved data quality at the data fusion stage.
Figure 1

A general overview of data treatment using data-driven models.


Abbasi et al. (2021) described data-driven models as sensitive to input variables. These input variables are subject to three major uncertainties: data, variable and model uncertainty. Data uncertainty refers to the poor quality of predictors. Variable uncertainty is defined as a poor selection of hyperparameters. A common approach to this concern is to introduce a preprocessing solution where the techniques of dimensionality reduction and input variable selection are employed. Model uncertainty refers to the inability of the model to capture the actual physical processes. Due to these uncertainties, the selection of appropriate models is important in analyzing the different types of data. In this study, ANN and LSTM are chosen to process the time-series type of data.

Artificial neural network

Artificial intelligence (AI) is an umbrella term that is commonly confused with ANN. ANN is one of the most basic and standard networks in the domain of AI. It has gained substantial attention and increasing popularity for data processing, supported by the growing availability of multiple data variations due to improving technology. One of the ANN applications is to develop hydrological models. An ANN model can estimate, forecast, detect data patterns, optimize and establish relationships of complex variables (Afan et al. 2016). The origin of ANN can be traced back to biological neural networks that resemble how the human brain works.

ANN network architecture comprises at least three layers: the input, the hidden and the output. Input variables, such as the rainfall data, will be fed to the network at the input layer and appear as a number of neurons. The input layer is often manipulated to improve the deliverance of the modeling. One possible method is to select the parameters that have a direct impact on the outcome of the model. The nodes link to the next layer, with each link having a weight that represents the strength of the connection. Each node goes through a nonlinear transformation known as the activation function at the hidden layer before continuing to the output layer.
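The layered computation described above can be sketched as a minimal forward pass. This is an illustrative Python/NumPy sketch with arbitrary random weights, not the study's MATLAB implementation; the 11-input/10-neuron sizing simply mirrors the configuration discussed later in the paper.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with a tanh activation, then a linear output layer."""
    h = np.tanh(W1 @ x + b1)   # hidden layer: weighted sum + nonlinear activation
    return W2 @ h + b2         # output layer: forecast flow (linear)

# Example: 11 rainfall inputs -> 10 hidden neurons -> 1 flow output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(10, 11)), np.zeros(10)
W2, b2 = rng.normal(size=(1, 10)), np.zeros(1)
y = mlp_forward(rng.normal(size=11), W1, b1, W2, b2)
```

In practice the weights are learned by backpropagation rather than drawn at random; the sketch only shows how inputs propagate through the layers.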

The ability to learn sets ANN apart from conventional models. Figure 2 shows one of the most common ANN arrangements. A single-layer ANN is known as a perceptron; an ANN with multiple layers is hence known as the MLP. ANN's ability to extract accurate patterns between input and output variables has contributed meaningfully to hydrology and water resources (Kasiviswanathan et al. 2016). However, ANN has a drawback in fully retrieving information from time-series data. Data on the past states of the network cannot be retained during processing, and therefore information linked to the data sequence cannot be captured (Lin et al. 2021).
Figure 2

The arrangement of the feed-forward neural network.


A recurrent neural network (RNN) is a more advanced form of the ANN model with an additional feature to memorize information, allowing progressive feedback interaction that generates better simulations. Owing to its ability to handle consecutive inputs, RNN with a self-looped cycle helps store predecessor outputs of a sequence to solve complex hydrological time series. However, the dynamic structure of RNN is vulnerable to vanishing or exploding gradients during backpropagation, hampering the learning process. During training, errors can accumulate, leading to further complications (Fang & Yuan 2019). Another issue related to RNN is long sequence dependency (Kilinc & Haznedar 2022). To solve the issues inherited from RNN, the LSTM network has become the way out.

Long short-term memory

LSTM is a deep learning network suitable for time-series data (Kim et al. 2022) that consists of a memory cell that can store information over an interval. The ability to store information is made possible by several nonlinear gate units that regulate information movement into or out of the cell. The LSTM network has proven to be more effective than RNN with its ability to keep or eliminate information and thus produce better forecast outputs that outperform FFNN (Ni et al. 2020). This capability gives an advantage to the LSTM, especially in the learning process associated with sequential data. Many benefits of LSTM, including the encoder–decoder building block, can be garnered (Yokoo et al. 2022). A stacked multi-layer LSTM can be used for long-term forecasts by extracting sequential temporal information (Cai et al. 2020). To further improve input sequence selection and semantic encoding in long-term memory, an attention mechanism can be introduced to the LSTM model (Li et al. 2019).

LSTM has a series of gates to control the flow of information in a cell. Depending on the gates, the data will be stored in or eliminated from the network. Figure 3 shows the gates used to manage information movement within the LSTM cell. There are three gates in a typical LSTM architecture: the forget gate, the input gate and the output gate. The forget gate decides what information is to be kept or discarded, where the information is marked as ‘1’ to hold or ‘0’ to discard. The input gate decides what information will be stored in the cell state. The output gate manages the data from the cell state to feed into a new hidden state. The new hidden state is updated based on the feed from the output gate. Eliminating any one of these gates will notably damage the performance (Greff et al. 2017).
Figure 3

An overview of LSTM cell.

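The gate logic described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration with random weights, not a production framework implementation; the variable names follow the usual forget/input/candidate/output convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked weights of the
    forget (f), input (i), candidate (g) and output (o) transforms."""
    n = h_prev.size
    z = W @ x + U @ h_prev + b        # all four pre-activations at once
    f = sigmoid(z[0:n])               # forget gate: keep (-> 1) or discard (-> 0)
    i = sigmoid(z[n:2*n])             # input gate: what to store in the cell state
    g = np.tanh(z[2*n:3*n])           # candidate cell update
    o = sigmoid(z[3*n:4*n])           # output gate: what to expose
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

rng = np.random.default_rng(1)
nx, nh = 3, 4
W = rng.normal(size=(4*nh, nx))
U = rng.normal(size=(4*nh, nh))
b = np.zeros(4*nh)
h, c = lstm_step(rng.normal(size=nx), np.zeros(nh), np.zeros(nh), W, U, b)
```

Because the forget gate multiplies the previous cell state by a value in (0, 1) rather than repeatedly passing it through a squashing activation, gradients can flow over long intervals, which is the property that mitigates the vanishing-gradient issue noted above.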

Several simulation combinations are available for LSTM models (Ouyang et al. 2021). Kratzert et al. (2019) applied the N-to-1 model to attain streamflow estimates. Under this model, a large sample of hydrological data per multi-time step time series of catchment parameters was used as the model input. The output of the model was a one-step variable. The second combination is the N-to-M LSTM model, akin to the sequence-to-sequence model. Xiang et al. (2020) demonstrated the seq2seq LSTM learning model for rainfall-runoff to predict multi-time step streamflow. The third formulation is the N-to-N model, where Feng et al. (2020) applied hydrometeorological time series, catchment attributes and streamflow observations to enhance daily streamflow forecasting.
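The N-to-1 formulation above maps a window of N past observations to a single next-step target. A minimal sliding-window builder can make this concrete (the function name and toy series are illustrative, not from the study):

```python
import numpy as np

def make_n_to_1(series, window):
    """Split a 1-D series into (samples, window) inputs and one-step targets."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]               # the value immediately after each window
    return X, y

flow = np.arange(10.0)                # toy 30-min flow record
X, y = make_n_to_1(flow, window=4)    # e.g. [0,1,2,3] -> 4, [1,2,3,4] -> 5, ...
```

The N-to-M (seq2seq) variant differs only in that `y` would hold the next M values per window instead of one.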

The new preprocessing approach

As model uncertainty is inherent in both ANN and LSTM networks, this study will introduce a new preprocessing approach involving rates of change to enhance AI modeling.

Currently, it can be noticed that all the research focuses only on the flow or water level as the output variable of the forecasted value based on the predetermined input parameters. To the authors' best knowledge, no literature has mentioned the utilization of the rates of change (ΔQ/Δt) in AI time-series-based model development. The mathematical expression of a forecast flowrate is as follows:
Q_f = Q_t + ΔQ (1)
where Q_f is the forecast flowrate; Q_t is the initial flowrate at the time, t; and ΔQ is the change in flow.
Therefore, the rate of change is proposed to replace the conventional variables: the flow or water level. The rate of change can be derived as follows:
ΔQ/Δt = (Q_t − Q_{t−1})/(t − t_{−1}) (2)
where ΔQ/Δt is the rate of change; Q_t is the flowrate at the current time t; Q_{t−1} is the flowrate at the previous time interval; t is the current time; and t_{−1} is the previous time interval.

By applying the rate of change (ΔQ/Δt), fluctuation can be managed so that the model's accuracy can be improved further.
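Equations (1) and (2) amount to differencing the flow series and adding a predicted increment back onto the last observed flow. A minimal sketch, assuming a uniform 30-min step so that Δt is one interval (the values are invented toy data):

```python
import numpy as np

def to_rate_of_change(q, dt=1.0):
    """Rate of change (Q_t - Q_{t-1}) / dt, as in Equation (2)."""
    return np.diff(q) / dt

def forecast_flow(q_t, dq):
    """Forecast flowrate Q_f = Q_t + dQ, as in Equation (1)."""
    return q_t + dq

q = np.array([10.0, 12.0, 15.0, 14.0])   # toy flow record (m3/s)
roc = to_rate_of_change(q)               # [2.0, 3.0, -1.0]
# The original series is recoverable from the first value plus the increments,
# so no information is lost by training on rates of change.
rebuilt = q[0] + np.cumsum(roc)
```

Training on increments keeps the target centered near zero even when absolute flows vary widely, which is one plausible reason the transform reduces forecast errors.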

Model performance evaluation

Modeling conceptualizes water movement behavior within a catchment by representing empirical idealization and simplification processes. Under this concept, there is a concern about the possibility of errors due to a vague understanding of complex parameter relationships. An effort is required to quantify the errors and objectively assess a model to justify its limitations by comparing the observed with the simulated variables (Liu 2020). For a model to be considered scientifically sound, rigorous and secure, it must undergo sensitivity analysis, model calibration and validation. Sensitivity analysis measures the reaction of a model in response to input parameters, while calibration identifies the differences between the observed and simulated flow. Validation ensures that model accuracy is maintained for further extension of the application.

Model performance can be assessed in two ways, namely subjectively or objectively. Subjective assessment implies visual inspection of the goodness-of-fit between actual and simulated values for underestimation or overestimation, or for detecting periodic patterns. Under the objective assessment, a mathematical approach for goodness-of-fit is applied where efficiency increases as the error between actual historical and simulated data decreases. Although there is no universal agreement on performance evaluation standards, it is generally accepted to consider multi-objective indicators (Ritter & Muñoz-Carpena 2013). It is an excellent practice to review efficiency levels through dimensionless statistics, absolute error indices and graphical representation (Waseem et al. 2008). The idea of goodness-of-fit is to examine the model's capability to reproduce historical patterns, forecast future events and quantify the improvements with multiple modeling approaches (Althoff & Rodrigues 2021). Optimally, it is crucial to consider both absolute and relative errors in model performance assessment (Jackson et al. 2019).

Under this study, five performance indices will be assessed to determine the developed models' goodness of fit. These evaluation methods, which take into consideration the absolute and relative aspects of the errors, are the root-mean-square error (RMSE), mean absolute error (MAE), correlation coefficient (R), Nash–Sutcliffe efficiency (NSE) and peak flow criteria (PFC)/low-flow criteria (LFC).

Root-mean-square error

RMSE is handy because it retains the exact unit of measurement. However, caution is necessary as it is not sensitive to linear offsets between the observed and simulated values, which can result in low RMSE values even for significantly different variables (Jackson et al. 2019).
RMSE = sqrt((1/n) Σ_{i=1}^{n} (O_i − S_i)^2) (3)
where O_i are the observed values of the criterion; S_i are the simulated values of the criterion; n is the sample size.

Mean absolute error

MAE is a well-recognized absolute error goodness-of-fit indicator that measures the significance of average error in a model. It is unsuitable for comparing two catchments with different hydrological characteristics (Althoff & Rodrigues 2021).

The mathematical representation of MAE is as follows:
MAE = (1/n) Σ_{i=1}^{n} |O_i − S_i| (4)
where O_i are the observed values of the criterion; S_i are the simulated values of the criterion; n is the sample size.

Pearson correlation coefficient, R

R is a standard regression criterion that describes the fraction of total variance explained. The R-value lies between −1 and 1, where 1 is a perfect positive fit.

The mathematical expression of the Pearson correlation coefficient is as follows:
R = Σ_{i=1}^{n} (O_i − Ō)(S_i − S̄) / sqrt(Σ_{i=1}^{n} (O_i − Ō)^2 · Σ_{i=1}^{n} (S_i − S̄)^2) (5)
where O_i are the observed values of the criterion; S_i are the simulated values of the criterion; Ō is the mean of the observed values; S̄ is the mean of the simulated values; n is the sample size.

An R-value of one signifies better agreement, while a value of zero signifies no correlation. R is limited to measuring only the linear relationship between variables. Like RMSE, R is indifferent to proportional differences between the observed and simulated values (Daren Harmel & Smith 2007). Therefore, achieving a higher R-value is possible even though the gap between the observed and simulated values varies significantly. The correlation is also sensitive to outliers, which can result in bias during model assessment.

Nash–Sutcliffe efficiency

NSE is the most widely applied goodness-of-fit measurement. It measures the absolute difference between the observed and predicted, followed by normalization with a variance of observed values to remove bias. The range of NSE lies between 1 and −∞, where 1 is the perfect fit. The model accuracy can be translated into very good for 0.75 < NSE ≤ 1, good for 0.65 < NSE ≤ 0.75, satisfactory for 0.50 < NSE ≤ 0.65 or unsatisfactory for NSE ≤ 0.50 (Kim et al. 2021).

The mathematical representation of NSE is as follows:
NSE = 1 − Σ_{i=1}^{n} (Y_i − Ŷ_i)^2 / Σ_{i=1}^{n} (Y_i − Ȳ)^2 (6)
where Y_i is the measured value of the criterion; Ŷ_i is the predicted value of the criterion (dependent) variable Y; Ȳ is the mean of the measured values of Y; n is the sample size.

The squaring in the numerator makes minor errors smaller while amplifying more significant errors. This trait has become a limitation that can lead to overestimation or underestimation of the model under assessment. Another concern is that the use of the measured mean implies that a catchment with highly variable values will lead to an overestimation of efficiency.

Peak flow criteria and LFC

PFC and LFC measure the error of the hydrological model in forecasting peak flow values and low-flow values, respectively. A zero PFC or LFC signifies a perfect model. This condition exists when no peak flow exceeds one-third of the actual mean peak flow. A similar situation applies to LFC.

The mathematical representation of PFC is as follows:
PFC = [Σ_{i=1}^{n_p} (x_i − y_i)^2 x_i^2]^{1/4} / [Σ_{i=1}^{n_p} x_i^2]^{1/2} (7)
where n is the number of data points; x_i are the observed data points; y_i are the predicted data points; and n_p is the number of peak flows greater than one-third of the observed mean peak value.
The mathematical representation of LFC is as follows:
LFC = [Σ_{i=1}^{n_l} (x_i − y_i)^2 x_i^2]^{1/4} / [Σ_{i=1}^{n_l} x_i^2]^{1/2} (8)
where n is the number of data points; x_i are the observed data points; y_i are the predicted data points; n_l is the number of low flows lower than one-third of the observed mean low value.

Mean-squared error

Mean-squared error (MSE) measures the amount of error in a statistical model. It gauges the average squared difference between the observed and predicted values. When MSE equals zero, the model does not have an error. As errors increase, MSE values will increase too.

The mathematical representation of MSE is as follows:
MSE = (1/n) Σ_{i=1}^{n} (O_i − P_i)^2 (9)
where O_i is the observed value; P_i is the predicted value; n is the number of observations.
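Equations (3)–(6) and (9) are standard and translate directly into code. A minimal Python sketch (the study itself used MATLAB; the observed/simulated arrays here are invented toy values, and PFC/LFC are omitted since they require the peak/low-flow selection step first):

```python
import numpy as np

def rmse(obs, sim):
    return np.sqrt(np.mean((obs - sim) ** 2))        # Equation (3)

def mae(obs, sim):
    return np.mean(np.abs(obs - sim))                # Equation (4)

def pearson_r(obs, sim):
    o, s = obs - obs.mean(), sim - sim.mean()
    return np.sum(o * s) / np.sqrt(np.sum(o ** 2) * np.sum(s ** 2))  # Equation (5)

def nse(obs, sim):
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)  # Equation (6)

def mse(obs, sim):
    return np.mean((obs - sim) ** 2)                 # Equation (9)

obs = np.array([7.0, 15.0, 70.0, 150.0, 30.0])       # toy observed flows (m3/s)
sim = np.array([8.0, 14.0, 65.0, 140.0, 33.0])       # toy simulated flows (m3/s)
```

A perfect simulation gives RMSE = MAE = MSE = 0, R = 1 and NSE = 1, which is a convenient sanity check when implementing the indices.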

Study area and data description

Kuala Lumpur is the capital city of Malaysia, situated in Southeast Asia. The main river in the city is the Klang River, with a catchment area of 1,288 km2 and flowing over a distance of 120 km (Zabidi et al. 2011). Eleven major tributaries join the river, which flows across Selangor State and the Kuala Lumpur Federal Territory (Othman et al. 2020). The weather of Malaysia is affected mainly by two dominant monsoon seasons, i.e., the North-East Monsoon from November to February and the South-West Monsoon from May to August. The city receives an average annual rainfall of 2,600 mm. The Batu, Gombak and upper Klang Rivers are the tributaries of the Klang River in the upper catchment of Kuala Lumpur (Hanna et al. 2020). Amid massive development, coupled with frequent short and intense precipitation, Kuala Lumpur is vulnerable to flash floods. This is particularly true at the confluence of the Klang River and Gombak River, where the well-known historical tourist area of Masjid Jamek is located.

Through the Department of Irrigation and Drainage, the Government of Malaysia initiated a comprehensive study to develop a Kuala Lumpur Flood Mitigation Plan (KLFM) (Kim-soon et al. 2016). The project aims to solve urban flooding issues in the Kuala Lumpur city center through engineering structural measures. Under this plan, the Stormwater Management and Road Tunnel (SMART) project was constructed, with several packages commissioned early. Due to the significant contribution of the SMART project (refer to Figure 4) in mitigating urban floods in the city center, the confluence of the Klang River and Ampang River was chosen for this study, as it is where river flow diversion is initiated whenever there is a high rainfall incidence in the upper catchment.
Figure 4

Map of study location and rain gauge sites.


SMART operating procedure

The SMART project is exclusive as it is the only one in the world that combines wet and dry systems. Hydrologically, it manages excess stormwater from the upper catchment through flow diversion and a few storages. A total storage of 3 million m3 is available, considering the volume of the holding pond, tunnel and attenuation pond. The operation of the SMART system is based on the flow at the Klang River and Ampang River confluence. On a typical day with no major storm, the tunnel operates under mode 1, as shown in Figure 5. River flow at the confluence is approximately between 7 and 15 m3/s. When a storm increases the flow at the confluence to 70 m3/s, mode 2 will be executed, where the river flow toward the city center will be diverted into the Berembang holding pond. The tunnel at that moment is still available to facilitate motorway traffic. Excess water from the holding pond will flow through the tunnel at the lowest of the three compartment levels. When the storm becomes more intense and the flow at the confluence reaches 150 m3/s, mode 3 will be executed. The traffic in the upper two compartments of the tunnel will be evacuated within 1 h. If the intense storm continues, mode 4 will be performed, and all compartments of the tunnel will be available for stormwater diversion.
Figure 5

SMART standard operating procedures.

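The flow thresholds in the operating procedure above can be sketched as a simple rule. This is purely illustrative: the real procedure involves operator judgment, and mode 4 is triggered here by a sustained-storm flag rather than a flow threshold, which is an assumption made for the sketch.

```python
def smart_mode(confluence_flow, storm_continuing=False):
    """Map flow at the Klang-Ampang confluence (m3/s) to a SMART mode.

    Thresholds follow the text: ~7-15 m3/s is normal (mode 1), 70 m3/s
    triggers diversion into the holding pond (mode 2), 150 m3/s evacuates
    the motorway (mode 3), and a continuing intense storm escalates to
    full-tunnel diversion (mode 4, modelled as a flag here)."""
    if confluence_flow < 70:
        return 1
    if confluence_flow < 150:
        return 2
    return 4 if storm_continuing else 3

mode_now = smart_mode(12.0)   # typical dry-day flow -> mode 1
```

This threshold structure is also what motivates the paper's accuracy requirement: a forecast error of a few tens of m3/s near the 70 or 150 m3/s boundaries would select the wrong operating mode.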

For this reason, another consideration in selecting this study area is the potential risk, as the stormwater will be diverted into a tunnel with dual functionality, serving primarily as a traffic motorway during regular days. This circumstance warrants a highly accurate river flow forecast at the confluence for flood mitigation and traffic evacuation from the tunnel. The selected study location can demonstrate the significance of machine learning as an engineering nonstructural measure in flood mitigation and early warning systems.

Study data

The SMART catchment of 160 km2 is equipped with 28 hydrological stations. Each station has a combination of a rain gauge and a Doppler current meter (DCM). Some sites have a water level sensor in place of the DCM. Data on rainfall and flow are collected and constantly transmitted to the main server at the control center through telemetry. Of the 28 stations, only rainfall data from 11 telemetry stations are selected for this study. The rest of the surrounding stations are for observation and do not contribute to the flow at the Klang River and Ampang River confluence. Historical rainfall data at a 30-min interval are collected from January 2008 to August 2021. A data interval of more than 30 min is discouraged, as a major storm can change drastically within that period. Likewise, a data interval of fewer than 30 min is also avoided, as the substantial volume of data can cause learning difficulty in a model and delay the forecast.
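Purely as an illustration of the interval trade-off discussed above, rainfall depths recorded at a finer resolution can be aggregated to a 30-min interval by summing. A sketch assuming six 5-min readings per interval (the values and function name are invented, not from the study's telemetry system):

```python
import numpy as np

def aggregate_rainfall(values, per_interval=6):
    """Sum consecutive fine-grained rainfall depths into coarser totals
    (e.g. six 5-min depths -> one 30-min depth). A trailing partial
    interval is dropped rather than reported as an incomplete total."""
    n = (len(values) // per_interval) * per_interval
    return np.asarray(values[:n]).reshape(-1, per_interval).sum(axis=1)

rain_5min = [0.0, 0.2, 0.4, 0.4, 0.0, 0.0,   # first 30-min interval (mm)
             1.0, 1.5, 2.0, 0.5, 0.0, 0.0]   # second 30-min interval (mm)
rain_30min = aggregate_rainfall(rain_5min)
```

Summing (rather than averaging) is used because rainfall is a depth accumulated over the interval; flow readings, by contrast, would typically be averaged or sampled.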

The design of the data analysis procedure

An optimum model architecture needs to be introduced to attain a robust deep learning model for river streamflow forecasting by employing available data to establish a relationship between the predictors and the target. This study used MATLAB for scripting the AI models. Rainfall data from 11 telemetry stations along the upper catchment of the Klang River basin, with intervals of 30 min from 1 January 2008 to August 2021, were obtained and presented as the model input data. The model will forecast the flow at the Ampang–Klang confluence, which is the point of interest of this study. The FFNN was selected as the first model. Figure 6 shows an overview of the steps taken when running the proposed models.
Figure 6

The flowchart for the design of the data analysis procedure.


This section highlights the results attained during the training of various models for streamflow forecasting. The model training was conducted using the FFNN and LSTM models. At the initial stage, the analysis was done without any preprocessing. This procedure was later followed by introducing a suitable preprocessing step to improve the forecast. The assessment of the results is elaborated based on different input parameters and target variables, as mentioned in the following subsections.

Input parameters consist of 11 stations, and target variables are flows at the confluence

Precipitation data from 11 stations in the upper catchment of the SMART watershed, with intervals of 30 min, were used as training input for FFNN and LSTM to predict flows at the confluence of the Ampang River and Klang River. For the FFNN model, the analysis was carried out with one hidden layer and extended to two hidden layers. The number of neurons varied from 2 to 23 for each hidden layer. The trained models yielded different accuracy under different combinations of algorithms. The model's best accuracy was obtained with 1 hidden layer and 10 neurons, with an overall R of 0.4465, MAE of 3.7135, NSE of 0.1994, RMSE of 8.8556, PFC of 0.1843 and LFC of 1.1376, as shown in Table 1. However, this result is not satisfactory for flood warning purposes.

Table 1

Performance of MLP model having one hidden layer

| Neuron | R | MAE | NSE | RMSE | PFC | LFC |
|---|---|---|---|---|---|---|
| 2 | 0.3181 | 3.8922 | 0.0943 | 9.4185 | 0.1871 | 0.9528 |
| 3 | 0.3856 | 3.8002 | 0.1442 | 9.1555 | 0.1842 | 0.8962 |
| 4 | 0.4013 | 3.7773 | 0.1598 | 9.0719 | 0.1832 | 0.8683 |
| 5 | 0.1766 | 3.7758 | −0.7598 | 13.1288 | 0.1828 | 0.9993 |
| 6 | 0.4273 | 3.7615 | 0.1825 | 8.9483 | 0.1837 | 1.0582 |
| 7 | 0.4028 | 3.8242 | 0.1622 | 9.0585 | 0.1867 | 1.2367 |
| 8 | 0.4313 | 3.7432 | 0.1860 | 8.9294 | 0.1855 | 1.1430 |
| 9 | 0.4437 | 3.7516 | 0.1965 | 8.8713 | 0.1825 | 1.0417 |
| **10** | **0.4465** | **3.7135** | **0.1994** | **8.8556** | **0.1843** | **1.1376** |
| 11 | 0.4408 | 3.6478 | 0.1939 | 8.8855 | 0.1851 | 0.9001 |
| 12 | 0.4387 | 3.7405 | 0.1925 | 8.8936 | 0.1839 | 1.1993 |
| 13 | 0.4194 | 3.7628 | 0.1758 | 8.9848 | 0.1879 | 1.1710 |
| 14 | 0.3567 | 3.6880 | 0.0864 | 9.4596 | 0.1850 | 0.9835 |
| 15 | 0.4134 | 3.8092 | 0.1709 | 9.0116 | 0.1877 | 1.0525 |
| 16 | 0.4386 | 3.6843 | 0.1916 | 8.8983 | 0.1861 | 1.0109 |
| 17 | 0.4182 | 3.8120 | 0.1746 | 8.9912 | 0.1855 | 1.1959 |
| 18 | 0.3784 | 3.8921 | 0.1431 | 9.1616 | 0.1907 | 1.1369 |
| 19 | 0.4310 | 3.7793 | 0.1857 | 8.9307 | 0.9268 | 1.0838 |
| 20 | 0.3995 | 3.8754 | 0.1595 | 9.0733 | 0.1833 | 1.0266 |
| 21 | 0.4094 | 3.8598 | 0.1676 | 9.0297 | 0.1843 | 1.1560 |
| 22 | 0.3862 | 3.8615 | 0.1491 | 9.1292 | 0.1894 | 1.1081 |
| 23 | 0.3862 | 3.8256 | 0.1491 | 9.1292 | 0.1899 | 1.0898 |

The bold values signify the best performance forecast.

Further training and validation were then performed using the LSTM model in MATLAB with the same input parameters and target variables. The performance of the LSTM model was measured and is shown in Table 2. The LSTM result was satisfactory in comparison with the FFNN model.

Table 2

Performance of LSTM model for streamflow simulation

| Model | R.Train | R.Valid | MSE.Train | MSE.Valid | MAE.Train | MAE.Valid | NSE | RMSE |
|---|---|---|---|---|---|---|---|---|
| LSTM | 0.9055 | 0.8586 | 17.8532 | 28.8315 | 1.4365 | 2.4208 | 0.8190 | 5.3695 |

The training progress was plotted as shown in Figure 7. It could be observed that the RMSE and losses reached their minimum values within 250 iterations. As for the peak-to-peak values of the forecast and observed flows in Figure 8, most of the outputs were under-forecast.
Figure 7

Training progress for model simulation.

Figure 8

Plotting output of graph flow target vs. flow output for model simulation.


Input parameters consist of 11 stations and flow Qt, and target variables are flows Qt+0.5 at the confluence

In section 3.1, it was observed that the input data were repetitive in values while the target values were inconsistent at different times, which reduced the training efficiency. To overcome this problem, Qt was introduced as part of the input data, in addition to the data from the 11 existing rainfall stations. The target value was Qt+0.5. Hence, this model forecasts streamflow 30 min ahead at the confluence. From the results in Table 3, it could be observed that the best output model (R = 0.9359, MAE = 0.7722, NSE = 0.8756, RMSE = 3.4911, PFC = 0.1294 and LFC = 0.9057) corresponded to two hidden layers with 17 and 8 neurons in the first and second hidden layers, respectively. However, there was a trade-off of increased epoch time when moving from a single hidden layer to double layers. Nevertheless, the results improved significantly after introducing Qt into the input parameters.
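The revised data arrangement can be sketched as follows. This is not the authors' MATLAB code; it is a minimal Python illustration, assuming the rainfall series are stacked as a (T, 11) array at 30-min steps and the confluence flow as a length-T vector, of how each sample pairs the 11 rainfall values plus the current flow Qt with the next-step flow Qt+0.5 as target.

```python
import numpy as np

def build_samples(rain, flow):
    """rain: (T, 11) rainfall at 30-min steps; flow: (T,) flow at the confluence.
    Returns inputs X[t] = [rain[t], Q_t] and targets y[t] = Q_{t+0.5}."""
    rain, flow = np.asarray(rain, float), np.asarray(flow, float)
    X = np.column_stack([rain[:-1], flow[:-1]])  # 12 predictors per sample
    y = flow[1:]                                 # flow one 30-min step ahead
    return X, y

# Toy check: 6 time steps, zero rainfall, flow rising 0..5.
T = 6
rain = np.zeros((T, 11))
flow = np.arange(T, dtype=float)
X, y = build_samples(rain, flow)
```

Appending Qt as the twelfth predictor is what breaks the one-to-many mapping described above: two samples with identical rainfall now differ in their current flow, so they no longer demand different targets from identical inputs.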

Table 3

Performance of MLP model having two hidden layers

| Neuron first layer | Neuron second layer | R | MAE | NSE | RMSE | PFC | LFC |
|---|---|---|---|---|---|---|---|
| 17 | 2 | 0.9267 | 0.7101 | 0.8588 | 3.7189 | 0.1297 | 0.5762 |
| 17 | 3 | 0.9229 | 0.9113 | 0.8512 | 3.8182 | 0.1300 | 0.8914 |
| 17 | 4 | 0.9341 | 0.7148 | 0.8725 | 3.5345 | 0.1242 | 0.6249 |
| 17 | 5 | 0.9356 | 0.7263 | 0.8752 | 3.4957 | 0.1253 | 0.5812 |
| 17 | 6 | 0.9293 | 0.7514 | 0.8636 | 3.6553 | 0.1257 | 0.5558 |
| 17 | 7 | 0.9166 | 0.8410 | 0.8400 | 3.9593 | 0.1321 | 0.5651 |
| **17** | **8** | **0.9359** | **0.7722** | **0.8756** | **3.4911** | **0.1294** | **0.9057** |
| 17 | 9 | 0.9321 | 0.7276 | 0.8687 | 3.5863 | 0.1269 | 0.5715 |
| 18 | 2 | 0.9238 | 0.7273 | 0.8534 | 3.7897 | 0.1311 | 0.7246 |
| 18 | 3 | 0.9274 | 0.7152 | 0.8600 | 3.7028 | 0.1385 | 0.6902 |
| 18 | 4 | 0.9178 | 0.7543 | 0.8424 | 3.9290 | 0.1318 | 0.6976 |
| 18 | 5 | 0.9315 | 0.7712 | 0.8676 | 3.6012 | 0.1259 | 0.5893 |
| 18 | 6 | 0.9283 | 0.7854 | 0.8616 | 3.6822 | 0.1290 | 0.6935 |
| 18 | 7 | 0.9054 | 0.9199 | 0.8196 | 4.2035 | 0.1297 | 0.5684 |
| 18 | 8 | 0.9281 | 1.0280 | 0.8580 | 3.7293 | 0.1252 | 0.5649 |
| 18 | 9 | 0.9288 | 0.8382 | 0.8625 | 3.6697 | 0.1240 | 0.6729 |

The bold values signify the best performance forecast.

The training and validation processes were then performed using LSTM with the same input parameters and target variables. The result of the LSTM model is shown in Table 4. There was a significant improvement in the LSTM model performance, with an increase of 5% in training regression and 10% in validation regression. NSE also improved from 0.8190 to 0.8963.

Table 4

Performance of LSTM model for 30 min ahead streamflow

| Model | R.Train | R.Valid | MSE.Train | MSE.Valid | MAE.Train | MAE.Valid | NSE | RMSE |
|---|---|---|---|---|---|---|---|---|
| LSTM | 0.9470 | 0.9476 | 10.2326 | 10.0042 | 0.5640 | 0.6935 | 0.8963 | 3.1629 |

The training progress was plotted for the 30-min ahead forecast. As for the peak-to-peak values of the forecast and observed flows in Figure 9, most of the forecasts were accurate and better than those in Figure 8.
Figure 9

Plotting of flow target vs. flow output for model 30-min ahead of forecast.


Input parameters consist of 11 stations and flow Qt, and target variables are flows Qt+1 at the confluence

The training was then performed for a 1-h ahead forecast. This step tested whether higher accuracy could still be achieved as the forecast lead time increased. Likewise, the input parameters were maintained as the 11 stations of rainfall data at 30-min intervals plus the flow Qt, as in section 3.2; the target variable was replaced with Qt+1, the flow 1 h ahead at the confluence of the Ampang River and Klang River. From Table 5, the best result came from two hidden layers with 14 and 5 neurons, respectively: R = 0.8671, MAE = 1.1305, NSE = 0.7515, RMSE = 4.9333, PFC = 0.1375 and LFC = 0.8509. Compared with section 3.2, the overall performance started to deteriorate as the forecast lead time increased.
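Extending the lead time only changes how far the target series is shifted relative to the inputs. A hedged sketch (Python, not the study's MATLAB): with 30-min data, a 1-h ahead target Qt+1 corresponds to a shift of k = 2 steps.

```python
import numpy as np

def shift_target(flow, k):
    """Pair each time step t with the flow k half-hour steps ahead.
    k = 1 gives the 30-min ahead target Q_{t+0.5}; k = 2 gives Q_{t+1} (1 h)."""
    flow = np.asarray(flow, float)
    return flow[:-k], flow[k:]   # (current flows Q_t, targets Q_{t+k*0.5h})

# Toy series at 30-min steps; k = 2 pairs each flow with its value 1 h later.
q_now, q_ahead = shift_target([10.0, 12.0, 15.0, 14.0, 11.0], k=2)
```

The trade-off noted in the text follows directly: a larger k widens the gap between predictor and target, so the rainfall and flow information available at time t explains less of the target's variance.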

Table 5

Performance of MLP model having two hidden layers for 1-h streamflow forecast

| Neuron first layer | Neuron second layer | R | MAE | NSE | RMSE | PFC | LFC |
|---|---|---|---|---|---|---|---|
| 14 | 2 | 0.8472 | 1.1499 | 0.7177 | 5.2584 | 0.1566 | 1.2703 |
| 14 | 3 | 0.8536 | 1.2050 | 0.7286 | 5.1559 | 0.1510 | 1.3151 |
| 14 | 4 | 0.8542 | 1.1664 | 0.7294 | 5.1482 | 0.1432 | 1.4318 |
| **14** | **5** | **0.8671** | **1.1305** | **0.7515** | **4.9333** | **0.1375** | **0.8509** |
| 14 | 6 | 0.8457 | 1.6626 | 0.7127 | 5.3051 | 0.1536 | 0.9052 |
| 14 | 7 | 0.8436 | 1.4802 | 0.7117 | 5.3137 | 0.1532 | 0.9359 |
| 15 | 2 | 0.8495 | 1.3574 | 0.7216 | 5.2219 | 0.1537 | 0.8250 |
| 15 | 3 | 0.8441 | 1.2273 | 0.7116 | 5.3149 | 0.1583 | 1.3876 |
| 15 | 4 | 0.8525 | 1.2329 | 0.7265 | 5.1761 | 0.1497 | 0.8098 |
| 15 | 5 | 0.8524 | 1.1716 | 0.7262 | 5.1782 | 0.1493 | 1.3803 |
| 15 | 6 | 0.8464 | 1.3504 | 0.7140 | 5.2927 | 0.1450 | 0.7842 |
| 15 | 7 | 0.8493 | 1.1920 | 0.7213 | 5.2244 | 0.1600 | 0.9061 |
| 15 | 8 | 0.8507 | 1.3035 | 0.7237 | 5.2018 | 0.1485 | 0.9080 |
| 16 | 2 | 0.8448 | 1.4396 | 0.7136 | 5.2966 | 0.1541 | 0.7507 |
| 16 | 3 | 0.7987 | 1.4039 | 0.6379 | 5.9554 | 0.1690 | 0.8211 |
| 16 | 4 | 0.8563 | 1.2894 | 0.7328 | 5.1154 | 0.1434 | 0.7757 |

The bold values signify the best performance forecast.

Similarly, further training and validation were performed using the LSTM model with the same input parameters and target variables. The result of the LSTM model is shown in Table 6.

Table 6

Performance of LSTM model for streamflow 1-h ahead

| Model | R.Train | R.Valid | MSE.Train | MSE.Valid | MAE.Train | MAE.Valid | NSE | RMSE |
|---|---|---|---|---|---|---|---|---|
| LSTM | 0.8849 | 0.8677 | 21.4296 | 25.0978 | 0.9397 | 1.2829 | 0.7828 | 5.0098 |

The training progress was then plotted for the 1-h ahead forecast. As for the peak-to-peak values of the forecast and observed flows in Figure 10, most of the forecasts were accurate, although not as accurate as those in Figure 9.
Figure 10

Plotting output for graph flow target vs. flow output for model 1-h ahead forecast.


Input parameters consist of 11 stations and flow Qt and target variables are rates of change

The second part of the training involved the rate of change, to assess its potential for achieving higher forecast accuracy. The input parameters were maintained as the 11 stations of rainfall data at 30-min intervals plus the flow Qt. The target variable was changed to the rate of change, derived from Qt+0.5 − Qt divided by the time interval (refer to Equation (2)). The best results were yielded at two hidden layers with 23 and 10 neurons, respectively. The best performance, with R = 0.7307, MAE = 0.0263, NSE = 0.5334 and RMSE = 0.1297 as shown in Table 7, had a longer epoch time of 0:05:40 at 116 iterations compared with a single hidden layer of 15 neurons (R = 0.7129, MAE = 0.0279, NSE = 0.5073 and RMSE = 0.1333; epoch time 0:00:53 at 83 iterations). There was no significant increase in accuracy, although there was a trade-off in computing speed.
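The rate-of-change target and its inversion back to a flow forecast can be sketched as below. This is an illustrative Python version of the relationship in Equation (2) (the study itself used MATLAB): the target is (Qt+0.5 − Qt)/Δt, so a forecast rate is converted back to flow via Qt+0.5 = Qt + rate·Δt.

```python
import numpy as np

DT = 0.5  # time interval in hours (30-min data)

def rate_of_change(flow, dt=DT):
    """Target series of Equation (2): (Q_{t+0.5} - Q_t) / dt."""
    flow = np.asarray(flow, float)
    return (flow[1:] - flow[:-1]) / dt

def to_flow(q_t, rate, dt=DT):
    """Invert a forecast rate of change back into a flow forecast."""
    return q_t + rate * dt

# Toy flow series: rises by 2, then falls by 1, over successive 30-min steps.
q = [10.0, 12.0, 11.0]
r = rate_of_change(q)                     # rates per hour
recovered = to_flow(np.array(q[:-1]), r)  # reproduces the next-step flows
```

Because the target is a difference of consecutive flows, its magnitude (and hence MAE/RMSE) is naturally much smaller than the raw flow, which is consistent with the near-zero error values reported for this configuration.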

Table 7

Performance of MLP model having two hidden layers and rates of change as target

| Neuron first layer | Neuron second layer | R | MAE | NSE | RMSE | PFC | LFC |
|---|---|---|---|---|---|---|---|
| 22 | 7 | 0.6984 | 0.0276 | 0.4878 | 0.1359 | 0.1632 | 0.3525 |
| 22 | 8 | 0.7076 | 0.0269 | 0.5007 | 0.1342 | 0.1706 | 0.3435 |
| 22 | 9 | 0.7028 | 0.0277 | 0.4934 | 0.1352 | 0.1501 | 0.3409 |
| 22 | 10 | 0.7050 | 0.0267 | 0.4967 | 0.1347 | 0.1632 | 0.3909 |
| 22 | 11 | 0.6771 | 0.0277 | 0.4585 | 0.1397 | 0.1891 | 0.3944 |
| 23 | 2 | 0.7131 | 0.0274 | 0.5080 | 0.1332 | 0.1482 | 0.4243 |
| 23 | 3 | 0.7182 | 0.0274 | 0.5156 | 0.1322 | 0.1384 | 0.4088 |
| 23 | 4 | 0.6952 | 0.0288 | 0.4816 | 0.1367 | 0.1753 | 0.3819 |
| 23 | 5 | 0.7058 | 0.0278 | 0.4979 | 0.1346 | 0.1432 | 0.4096 |
| 23 | 6 | 0.6809 | 0.0292 | 0.4636 | 0.1391 | 0.1757 | 0.3726 |
| 23 | 7 | 0.6948 | 0.0282 | 0.4826 | 0.1366 | 0.1560 | 0.4055 |
| 23 | 8 | 0.7051 | 0.0278 | 0.4961 | 0.1348 | 0.1437 | 0.4162 |
| 23 | 9 | 0.6782 | 0.0286 | 0.4600 | 0.1396 | 0.1824 | 0.3970 |
| **23** | **10** | **0.7307** | **0.0263** | **0.5334** | **0.1297** | **0.1386** | **0.3936** |
| 23 | 11 | 0.6624 | 0.0306 | 0.4388 | 0.1423 | 0.1851 | 0.3883 |
| 23 | 12 | 0.7042 | 0.0268 | 0.4956 | 0.1349 | 0.1522 | 0.4068 |

The bold values signify the best performance forecast.

The LSTM model was then executed with the same input parameters and target variables to check the model's testing and validation accuracy. The result of the LSTM model is shown in Table 8.

Table 8

Performance of LSTM model for rates of change forecast

| Model | R.Train | R.Valid | MSE.Train | MSE.Valid | MAE.Train | MAE.Valid | NSE | RMSE |
|---|---|---|---|---|---|---|---|---|
| LSTM | 0.8573 | 0.6993 | 0.0010 | 0.0174 | 0.0181 | 0.0270 | 0.7349 | 0.1319 |

The training progress for the rates of change forecast was plotted in Figure 11. It could be observed that the RMSE and losses reached their minimum values within 350 iterations.
Figure 11

Training progress for the rates of change model.

The rates of change were then converted to predicted flow, and a graph of actual flow vs. the predicted flow at the confluence was plotted as in Figure 12.
Figure 12

Plotting of actual flow vs. predicted flow derived from the rates of change model.


This section further illustrates the results obtained earlier. The study was conducted in two parts. The first part evaluated prediction performance with the MLP networks. The MLP architecture varied from one hidden layer to two hidden layers. The number of neurons started from 2 and increased to a maximum of 23 for the single-hidden-layer network. In the case of two hidden layers, the neurons in the second layer started from 2 and increased to a maximum of half the number of neurons in the first layer. The results improved as the layers increased from one to two. However, adding further layers did not improve the results. This behavior is consistent with the approximation of the ideal function as indicated in the universal approximation theorem (Tang & Yang 2021). There was an optimal number of layers and nodes in the networks to avoid overfitting or underfitting. This procedure determined the best architecture to deliver the best results for the various inputs and target values mentioned in sections 3.1–3.3. The predicted and observed streamflow were then compared to measure the precision of the forecast results. The LSTM networks followed the same process for the same input parameters and target values. This first part of the study aimed to find the best-performing networks under different criteria and to compare the achievement of the MLP and LSTM networks. The second part of the study focused on implementing rates of change as a novel approach and understanding its relationship with flood mitigation.
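The architecture search just described amounts to a grid sweep over one- and two-layer configurations. The sketch below illustrates the enumeration in Python; the `score` callable is a stand-in for an actual train-and-evaluate routine (here a toy function that peaks at the (17, 8) configuration reported in Table 3), not the study's training code.

```python
def grid_search(score, max_neurons=23):
    """Enumerate candidate MLP architectures and return the one with the
    highest score. `score(arch)` maps a tuple of hidden-layer sizes to a
    goodness value such as validation NSE."""
    # one hidden layer: 2..max_neurons neurons
    candidates = [(n,) for n in range(2, max_neurons + 1)]
    # two hidden layers: second layer limited to roughly half the first,
    # as in the study's procedure
    candidates += [(n1, n2)
                   for n1 in range(2, max_neurons + 1)
                   for n2 in range(2, max(n1 // 2, 2) + 1)]
    return max(candidates, key=score)

# Toy score that is best exactly at the two-layer (17, 8) configuration.
best = grid_search(lambda arch: 1.0 if arch == (17, 8) else 0.0)
```

In the actual study, each candidate would be trained and scored on held-out data, so the sweep also exposes the epoch-time trade-off noted for the deeper networks.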

For the conditions stated in section 3.1, it could be concluded that the best MLP structure was 1 layer with 10 neurons. Table 1 showed that the results yielded R = 0.4465, MAE = 3.7135, NSE = 0.1994, RMSE = 8.8556, PFC = 0.1843 and LFC = 1.1376. Convergence was reached within 1 min 19 s at 190 epochs. As the regression and NSE values were far from 1, this indicated poor-quality input data and a poor-quality forecast of streamflow at the confluence of the Klang River and Ampang River. Poor-quality input data could be caused by weak communication signals, data distortion or maintenance testing. Similarly, MAE and RMSE were far from 0, which denoted that the forecasts were far from the line of best fit; the model would produce few accurate values when tested out of sample. Comparing the PFC value with the LFC value, the PFC reflected a better forecast for peak flow than for low flow. Peak flow is the extreme value of the streamflow series and is typically difficult to model.

Further training and validation were conducted using the LSTM model to simulate streamflow, and the deep learning produced significantly better results with R = 0.9055, MSE = 17.8532, MAE = 1.4365, NSE = 0.8190 and RMSE = 5.3695. A better result than the MLP model was expected, as the LSTM architecture can remember essential data, making it a superior model. Training through the MLP model involved a single forward pass, whereas the operation in the LSTM network involved three single forward passes. The memory cells in the LSTM model, as shown in Figure 3, helped to store the final output and use it as the input for the successive step.

For section 3.2, the training was improved by introducing the preprocessing step. The result of the MLP model was very encouraging, with R = 0.9359, MAE = 0.7722, NSE = 0.8756, RMSE = 3.4911, PFC = 0.1294 and LFC = 0.9057 for the two-hidden-layer MLP with 17 and 8 neurons, respectively. This improvement was made possible because the issue of input parameter redundancy with inconsistent target values was solved after introducing the flow Qt into the input parameters. The target value Qt+0.5 denoted the forecast of streamflow 30 min ahead. R and NSE values close to 1 indicated higher precision. The MAE and RMSE values were also reduced to almost half of the values acquired in section 3.1. These results implied better forecast accuracy for out-of-sample data. The PFC value was nearing 0, while the LFC value was almost 1, suggesting a better forecast for peak flow than for low flow, as shown in Figure 9.

The training then proceeded with the LSTM model. The results generated R = 0.9470, MSE = 10.2326, MAE = 0.5640, NSE = 0.8963 and RMSE = 3.1629, as shown in Table 4. Again, a better result than the MLP network was achieved with the LSTM model, consistent with previous studies. The overall result indicated that the LSTM model could perform better than the MLP and was, therefore, the better forecasting model.

Since this was a multi-step ahead study, the analysis continued with a 1-h ahead streamflow forecast, as mentioned in section 3.3. This continuation allowed the trend of the forecast to be tracked. The MLP yielded its best performance with R = 0.8671, MAE = 1.1305, NSE = 0.7515, RMSE = 4.9333, PFC = 0.1375 and LFC = 0.8509 for two hidden layers with 14 and 5 neurons, respectively, as shown in Table 5. There was, therefore, a drop in performance for the 1-h ahead MLP forecast, although the result was still within an acceptable range for an hour-ahead forecast. A comparable trend was noticed in the LSTM model, which attained R = 0.8849, MSE = 21.4296, MAE = 0.9397, NSE = 0.7828 and RMSE = 5.0098, as shown in Table 6. Figure 10 revealed an acceptable peak flow forecast, although less accurate than Figure 9. This trend was in line with the expectation that the shorter the lead time, the better the forecast.

The second part of this study evaluated the application of the rates of change as the target variables in the networks. For the MLP model, the best performance yielded R = 0.7307, MAE = 0.0263, NSE = 0.5334, RMSE = 0.1297, PFC = 0.1386 and LFC = 0.3936 for two hidden layers with 23 and 10 neurons, respectively, as shown in Table 7. The performance was poorer compared with the models in sections 3.2–3.3, reflected in lower R and NSE values. However, the MAE and RMSE values were nearer to 0. As the rate of change was the difference between two consecutive flow steps divided by the time interval, as mentioned in section 2.3, the potential errors between the steps were minimized, which significantly reduced the MAE and RMSE values. The lower PFC value compared with the LFC reflected a better peak flow forecast than low-flow forecast for this model, as plotted in Figure 12. Notably, this model had the lowest LFC value among all models, implying a better forecast of low flow.

On the other hand, the results from the LSTM model were superior, yielding R = 0.8573, MSE = 0.0010, MAE = 0.0181, NSE = 0.7349 and RMSE = 0.1319, as shown in Table 8. The performance was similar to or much better than the 1-h ahead streamflow training outcome in Table 6 of section 3.3. As with the MLP, the MSE and MAE values almost reached 0, suggesting a better forecast for out-of-sample data.

The forecasted rates of change were then translated to forecasted flow. A graph of actual flow at the confluence vs. forecasted flow derived from rates of change was plotted. Figure 12 showed very high forecasting accuracy and minimal under-forecast compared with the previous output.

Given the results obtained in this study, it could be established that the rate of change was suitable as an element to detect an early change in streamflow movement. A classification of streamflow variations, such as normal, alert and danger, could be deployed for this purpose. This discovery would give water managers additional time to take precautionary steps in advance to mitigate flood events. Nevertheless, this study was the first effort to utilize rates of change in machine learning for this purpose. Further improvement by applying other data preprocessing or data mining techniques could be considered in subsequent studies.
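The normal/alert/danger classification suggested above could be layered on top of the forecast as a simple thresholding step. The sketch below is illustrative only: the threshold values are hypothetical placeholders, not values derived from the study.

```python
def classify(rate, alert=2.0, danger=5.0):
    """Map a forecast rate of change of flow to a warning level.
    `alert` and `danger` are hypothetical thresholds (same units as `rate`);
    in practice they would be calibrated to the river's flood history."""
    if abs(rate) >= danger:
        return "danger"
    if abs(rate) >= alert:
        return "alert"
    return "normal"

# A slowly varying flow, a moderate rise, and a sharp rise.
levels = [classify(r) for r in (0.5, 3.0, 8.0)]
```

Using the absolute value means a rapid recession would also trigger a warning; whether rising and falling limbs should share thresholds is an operational choice for the water managers.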

A flood forecasting system requires high accuracy to support water managers in their daily operations, such as operating hydraulic gates to divert river streamflow and safeguard public safety. This regional case study developed the best deep learning model with suitable hyperparameters for the SMART control center in Kuala Lumpur, Malaysia. The study presented a novel technique by introducing rates of change into the machine learning models and delivered a multi-step ahead streamflow forecast.

Several ANN models were developed and trained for the flow forecast at the confluence of the Ampang River and Klang River. The first part of the simulation stage yielded a poor result, below expectations, caused by the redundancy of input parameters with different target values, i.e. the multi-finality issue. However, the LSTM model displayed a significantly better result than the ANN, with a 45% improvement in the regression value from 0.4465 to 0.9055. This improvement aligned with Ni et al.'s statement that the LSTM network could deliver better results than the ANN due to its memory cells.

Qt was introduced into the model as an input parameter to solve the redundancy issue, while Qt+0.5 was the target variable. This inclusion significantly improved the model performance, with values of R = 0.9359, MAE = 0.7722, NSE = 0.8756 and RMSE = 3.4911 for the ANN model. The model was equivalent to making a 30-min ahead forecast. The LSTM model still yielded better results than the ANN model.

The following experiment performed an hour-ahead forecast. Both the ANN and LSTM generated less accurate performance than the 30-min ahead forecast. This confirmed the statement of Lv et al. (2020) that model performance deteriorates as the forecast lead time increases, as shown in section 3.3.

As for the core of this study, the rates of change were introduced into the model as the target variables. The R and NSE were unsatisfactory for flood mitigation operations, but the MAE and RMSE values were reduced to near 0. Both positive and negative values could be detected in the output. The forecasted rates of change were then translated to forecasted flow. The result was impressive when a graph of observed flow vs. forecasted flow was plotted, as shown in Figure 12. The rates of change could be used for early detection to track a change in flow pattern and help water managers stay alert.

In summary, these findings confirmed previous theoretical understandings of model performance and led to the discovery of an additional layer of protection in flood mitigation operations. As a first effort to understand rate-of-change behavior in model development, this approach could be explored further. This study was limited to the use of ANN and LSTM models. It is suggested that future studies apply evolutionary computing methods, such as nature-inspired optimization models, to generate higher-quality data for better outcomes. Other features, such as land condition and evaporation rate, could be considered as additional input parameters to provide more information, as hydrological processes are highly complex.

All authors contributed to the study and design. W. Y. T. performed material preparation, data collection, investigation, data curation and analysis. S. H. L. contributed to conceptualization, methodology, software teaching and supervision; K. P. contributed to the software development; F. Y. T. contributed to supervision; A. E.-S. contributed to review, supervision and project administration. W. Y. T. wrote the first draft of the manuscript, and all authors commented on previous versions. All authors have read, approved the final manuscript and agreed to the published version of the manuscript.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abbasi
M.
,
Farokhnia
A.
,
Bahreinimotlagh
M.
&
Roozbahani
R.
2021
A hybrid of random forest and deep auto-encoder with support vector regression methods for accuracy improvement and uncertainty reduction of long-term streamflow prediction
.
Journal of Hydrology
597
(
March 2020
),
125717
.
https://doi.org/10.1016/j.jhydrol.2020.125717
.
Adikari
K. E.
,
Shrestha
S.
,
Ratnayake
D. T.
,
Budhathoki
A.
,
Mohanasundaram
S.
&
Dailey
M. N.
2021
Evaluation of artificial intelligence models for flood and drought forecasting in arid and tropical regions
.
Environmental Modelling and Software
144
,
105136
.
https://doi.org/10.1016/j.envsoft.2021.105136
.
Afan
H. A.
,
El-shafie
A.
,
Mohtar
W. H. M. W.
&
Yaseen
Z. M.
2016
Past, present and prospect of an artificial intelligence (AI) based model for sediment transport prediction
.
Journal of Hydrology
541
,
902
913
.
https://doi.org/10.1016/j.jhydrol.2016.07.048
.
Alizadeh
B.
,
Ghaderi Bafti
A.
,
Kamangir
H.
,
Zhang
Y.
,
Wright
D. B.
&
Franz
K. J.
2021
A novel attention-based LSTM cell post-processor coupled with Bayesian optimization for streamflow prediction
.
Journal of Hydrology
601
(
February
),
126526
.
https://doi.org/10.1016/j.jhydrol.2021.126526
.
Althoff
D.
&
Rodrigues
L. N.
2021
Goodness-of-fit criteria for hydrological models: model calibration and performance assessment
.
Journal of Hydrology
600
(
December 2020
),
126674
.
https://doi.org/10.1016/j.jhydrol.2021.126674
.
Balaji
E.
,
Brindha
D.
,
Elumalai
V. K.
&
Vikrama
R.
2021
Automatic and non-invasive Parkinson's disease diagnosis and severity rating using LSTM network
.
Applied Soft Computing
108
,
107463
.
https://doi.org/10.1016/j.asoc.2021.107463
.
Bourdin
D. R.
,
Fleming
S. W.
&
Stull
R. B.
2012
Streamflow modelling: a primer on applications, approaches and challenges
. In:
Atmosphere – Ocean
, Vol.
50
, Issue
4
, pp.
507
536
.
https://doi.org/10.1080/07055900.2012.734276.
Cai
B.
&
Yu
Y.
2022
Flood forecasting in urban reservoir using hybrid recurrent neural network
.
Urban Climate
42
(
January
),
101086
.
https://doi.org/10.1016/j.uclim.2022.101086
.
Cai
J.
,
Hu
J.
,
Tang
X.
,
Hung
T. Y.
&
Tan
Y. P.
2020
Deep historical long short-term memory network for action recognition
.
Neurocomputing
407
,
428
438
.
https://doi.org/10.1016/j.neucom.2020.03.111
.
Chen
X.
,
Huang
J.
,
Han
Z.
,
Gao
H.
,
Liu
M.
&
Li
Z.
2020
The importance of short lag-time in the runoff forecasting model based on long short-term memory
.
Journal of Hydrology
589
(
July
),
125359
.
https://doi.org/10.1016/j.jhydrol.2020.125359
.
Chen
C.
,
Hui
Q.
,
Xie
W.
,
Wan
S.
,
Zhou
Y.
&
Pei
Q.
2021
Convolutional neural networks for forecasting flood process in internet-of-things enabled smart city
.
Computer Networks
186
(
December 2020
),
107744
.
https://doi.org/10.1016/j.comnet.2020.107744
.
Danandeh Mehr
A.
,
Ghadimi
S.
,
Marttila
H.
&
Torabi Haghighi
A.
2022
A new evolutionary time series model for streamflow forecasting in boreal lake-river systems
.
Theoretical and Applied Climatology
0123456789
.
https://doi.org/10.1007/s00704-022-03939-3
.
Daren Harmel
R.
&
Smith
P. K.
2007
Consideration of measurement uncertainty in the evaluation of goodness-of-fit in hydrologic and water quality modeling
.
Journal of Hydrology
337
(
3–4
),
326
336
.
https://doi.org/10.1016/j.jhydrol.2007.01.043
.
Fang
X.
&
Yuan
Z.
2019
Performance enhancing techniques for deep learning models in time series forecasting
.
Engineering Applications of Artificial Intelligence
85
(
May
),
533
542
.
https://doi.org/10.1016/j.engappai.2019.07.011
.
Feng
D.
,
Fang
K.
&
Shen
C.
2020
Enhancing streamflow forecast and extracting insights using long-short term memory networks with data integration at continental scales
.
Water Resources Research
56
(
9
),
1
45
.
https://doi.org/10.1029/2019WR026793
.
Feng
Z. k.
,
Shi
P. f.
,
Yang
T.
,
Niu
W. j.
,
Zhou
J. z.
&
Cheng
C. t.
2022
Parallel cooperation search algorithm and artificial intelligence method for streamflow time series forecasting
.
Journal of Hydrology
606
(
August 2021
),
127434
.
https://doi.org/10.1016/j.jhydrol.2022.127434
.
Greff
K.
,
Srivastava
R. K.
,
Koutnik
J.
,
Steunebrink
B. R.
&
Schmidhuber
J.
2017
LSTM: a search space odyssey
.
IEEE Transactions on Neural Networks and Learning Systems
28
(
10
),
2222
2232
.
https://doi.org/10.1109/TNNLS.2016.2582924
.
Guo
W. D.
,
Chen
W. B.
,
Yeh
S. H.
,
Chang
C. H.
&
Chen
H.
2021
Prediction of river stage using multistep-ahead machine learning techniques for a tidal river of Taiwan
.
Water (Switzerland)
13
(
7
).
https://doi.org/10.3390/w13070920
Hai Nguyen D., Hien Le X., Tran Anh D., Kim S.-H. & Bae D.-H. 2022 Hourly streamflow forecasting using a Bayesian additive regression tree model hybridized with a genetic algorithm. Journal of Hydrology 606, 127445. https://doi.org/10.1016/j.jhydrol.2022.127445.
Han P. F., Wang X. S., Wan L. & Kuang X. 2022 Croplands decreased stability of streamflow with changing climate: an investigation of catchments in Illinois. Journal of Hydrology 606, 127461. https://doi.org/10.1016/j.jhydrol.2022.127461.
Hanna W., Wan M., Shazwani N., Abdullah J., Nizam K. & Maulud A. 2020 Urban flash flood index based on historical rainfall events. Sustainable Cities and Society 56, 102088. https://doi.org/10.1016/j.scs.2020.102088.
Hayder G., Iwan Solihin M. & Najwa M. R. N. 2022 Multi-step-ahead prediction of river flow using NARX neural networks and deep learning LSTM. H2Open Journal 5 (1), 42–59. https://doi.org/10.2166/h2oj.2022.134.
Huang X., Li Y., Tian Z., Ye Q., Ke Q., Fan D., Mao G., Chen A. & Liu J. 2021 Evaluation of short-term streamflow prediction methods in urban river basins. Physics and Chemistry of the Earth 123, 103027. https://doi.org/10.1016/j.pce.2021.103027.
Jackson E. K., Roberts W., Nelsen B., Williams G. P., Nelson E. J. & Ames D. P. 2019 Introductory overview: error metrics for hydrologic modelling – a review of common practices and an open source library to facilitate use and adoption. Environmental Modelling and Software 119, 32–48. https://doi.org/10.1016/j.envsoft.2019.05.001.
Jougla R. & Leconte R. 2022 Short-term hydrological forecast using artificial neural network models with different combinations and spatial representations of hydrometeorological inputs.
Kaadoud I. C., Rougier N. P. & Alexandre F. 2022 Knowledge extraction from the learning of sequences in a long short term memory (LSTM) architecture. Knowledge-Based Systems 235, 107657. https://doi.org/10.1016/j.knosys.2021.107657.
Kao I. F., Liou J. Y., Lee M. H. & Chang F. J. 2021 Fusing stacked autoencoder and long short-term memory for regional multistep-ahead flood inundation forecasts. Journal of Hydrology 598, 126371. https://doi.org/10.1016/j.jhydrol.2021.126371.
Kasiviswanathan K. S., He J., Sudheer K. P. & Tay J. 2016 Potential application of wavelet neural network ensemble to forecast streamflow for flood management. Journal of Hydrology 536, 161–173. https://doi.org/10.1016/j.jhydrol.2016.02.044.
Kilinc H. C. 2022 Daily streamflow forecasting based on the hybrid particle swarm optimization and long short-term memory model in the Orontes Basin.
Kilinc H. C. & Haznedar B. 2022 A hybrid model for streamflow forecasting in the basin of Euphrates. Water (Switzerland) 14 (1). https://doi.org/10.3390/w14010080.
Kim T., Yang T., Gao S., Zhang L., Ding Z., Wen X., Gourley J. J. & Hong Y. 2021 Can artificial intelligence and data-driven machine learning models match or even replace process-driven hydrologic models for streamflow simulation? A case study of four watersheds with different hydro-climatic regions across the CONUS. Journal of Hydrology 598, 126423. https://doi.org/10.1016/j.jhydrol.2021.126423.
Kim D., Lee J., Kim J., Lee M., Wang W. & Kim H. S. 2022 Comparative analysis of long short-term memory and storage function model for flood water level forecasting of Bokha stream in NamHan River, Korea. Journal of Hydrology 606, 127415. https://doi.org/10.1016/j.jhydrol.2021.127415.
Kim-soon N., Isah N., Ali M. & Ahmad A. R. 2016 Relationships between stormwater management and road tunnel maintenance works, flooding and traffic flow. https://doi.org/10.1166/asl.2016.7047.
Kratzert F., Klotz D., Shalev G., Klambauer G., Hochreiter S. & Nearing G. 2019 Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrology and Earth System Sciences 23 (12), 5089–5110. https://doi.org/10.5194/hess-23-5089-2019.
Li Y., Zhu Z., Kong D., Han H. & Zhao Y. 2019 EA-LSTM: evolutionary attention-based LSTM for time series prediction. Knowledge-Based Systems 181, 104785. https://doi.org/10.1016/j.knosys.2019.05.028.
Li Q., Zhu Y., Shangguan W., Wang X., Li L. & Yu F. 2022 An attention-aware LSTM model for soil moisture and soil temperature prediction. Geoderma 409, 1–17. https://doi.org/10.1016/j.geoderma.2021.115651.
Lin Y., Wang D., Wang G., Qiu J., Long K., Du Y., Xie H., Wei Z., Shangguan W. & Dai Y. 2021 A hybrid deep learning algorithm and its application to streamflow prediction. Journal of Hydrology 601, 1–10. https://doi.org/10.1016/j.jhydrol.2021.126636.
Liu D. 2020 A rational performance criterion for hydrological model. Journal of Hydrology 590, 125488. https://doi.org/10.1016/j.jhydrol.2020.125488.
Liu Y., Hou G., Huang F., Qin H., Wang B. & Yi L. 2022 Directed graph deep neural network for multi-step daily streamflow forecasting. Journal of Hydrology 607, 127515. https://doi.org/10.1016/j.jhydrol.2022.127515.
Lv N., Liang X., Chen C., Zhou Y., Li J., Wei H. & Wang H. 2020 A long short-term memory cyclic model with mutual information for hydrology forecasting: a case study in the Xixian basin. Advances in Water Resources 141. https://doi.org/10.1016/j.advwatres.2020.103622.
Masrur Ahmed A. A., Deo R. C., Feng Q., Ghahramani A., Raj N., Yin Z. & Yang L. 2021 Deep learning hybrid model with Boruta-random forest optimiser algorithm for streamflow forecasting with climate mode indices, rainfall, and periodicity. Journal of Hydrology 599, 126350. https://doi.org/10.1016/j.jhydrol.2021.126350.
Mostaghimzadeh E., Adib A., Ashrafi S. M. & Kisi O. 2022 Investigation of a composite two-phase hedging rule policy for a multi reservoir system using streamflow forecast. Agricultural Water Management 265, 107542. https://doi.org/10.1016/j.agwat.2022.107542.
Nanditha J. S. & Mishra V. 2021 On the need of ensemble flood forecast in India. Water Security 12, 100086. https://doi.org/10.1016/j.wasec.2021.100086.
Ni L., Wang D., Singh V. P., Wu J., Wang Y. & Tao Y. 2020 Streamflow and rainfall forecasting by two long short-term memory-based models. Journal of Hydrology 583, 124296. https://doi.org/10.1016/j.jhydrol.2019.124296.
Othman F., Alaaeldin M. E., Seyam M., Ahmed A. N., Teo F. Y., Ming Fai C., Afan H. A., Sherif M., Sefelnasr A. & El-Shafie A. 2020 Efficient river water quality index prediction considering minimal number of inputs variables. Engineering Applications of Computational Fluid Mechanics 14 (1), 751–763. https://doi.org/10.1080/19942060.2020.1760942.
Ouyang W., Lawson K., Feng D., Ye L., Zhang C. & Shen C. 2021 Continental-scale streamflow modeling of basins with reservoirs: towards a coherent deep-learning-based strategy. Journal of Hydrology 599, 126455. https://doi.org/10.1016/j.jhydrol.2021.126455.
Ritter A. & Muñoz-Carpena R. 2013 Performance evaluation of hydrological models: statistical significance for reducing subjectivity in goodness-of-fit assessments. Journal of Hydrology 480, 33–45. https://doi.org/10.1016/j.jhydrol.2012.12.004.
Tang S. & Yang Y. 2021 Why neural networks apply to scientific computing? Theoretical and Applied Mechanics Letters 11 (3), 100242. https://doi.org/10.1016/j.taml.2021.100242.
Teng J., Jakeman A. J., Vaze J., Croke B. F. W., Dutta D. & Kim S. 2017 Flood inundation modelling: a review of methods, recent advances and uncertainty analysis. Environmental Modelling and Software 90, 201–216. https://doi.org/10.1016/j.envsoft.2017.01.006.
Waseem M., Mani N., Andiego G. & Usman M. 2008 A review of criteria of fit for hydrological models. International Research Journal of Engineering and Technology. Available from: www.irjet.net.
Watershed S., Wakatsuki Y., Nakane H. & Hashino T. 2022 River stage modeling with a deep neural network using long-term rainfall time series as input data: application to the Shimanto-River watershed.
Xiang Z., Yan J. & Demir I. 2020 A rainfall-runoff model with LSTM-based sequence-to-sequence learning. Water Resources Research 56 (1). https://doi.org/10.1029/2019WR025326.
Yokoo K., Ishida K., Ercan A., Tu T., Nagasato T., Kiyama M. & Amagasaki M. 2022 Capabilities of deep learning models on learning physical relationships: case of rainfall-runoff modeling with LSTM. Science of the Total Environment 802, 149876. https://doi.org/10.1016/j.scitotenv.2021.149876.
Zabidi H., Henry M. & Freitas D. 2011 Re-evaluation of rock core logging for the prediction of preferred orientations of karst in the Kuala Lumpur Limestone Formation. Engineering Geology 117 (3–4), 159–169. https://doi.org/10.1016/j.enggeo.2010.10.006.
Zhao X., Lv H., Lv S., Sang Y., Wei Y. & Zhu X. 2021 Enhancing robustness of monthly streamflow forecasting model using gated recurrent unit based on improved grey wolf optimizer. Journal of Hydrology 601, 126607. https://doi.org/10.1016/j.jhydrol.2021.126607.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).