Accurate flood prediction is of utmost significance for mitigating catastrophes caused by flood events. Flooding leads to severe civic and financial damage, particularly in large river basins, and mainly affects the downstream regions of a river bed. Artificial Intelligence (AI) models have been effectively utilized as tools for modelling numerous nonlinear relationships and are suitable for modelling complex hydrological systems. Therefore, the main purpose of this research is to propose an effective hybrid system by integrating an Adaptive Neuro-Fuzzy Inference System (ANFIS) model with the meta-heuristic Grey Wolf Optimization (GWO) and Grasshopper Optimization Algorithm (GOA) for flood prediction in the River Mahanadi, India. The robustness of the proposed meta-heuristics is assessed by comparison with a conventional ANFIS model, focusing on various input combinations and considering 50 years of monthly historical flood discharge data. The potential of the AI models is evaluated and compared with observed data in both training and validation sets based on three statistical performance evaluation factors, namely root mean squared error (RMSE), mean squared error (MSE) and Willmott Index (WI). Results reveal that the robust ANFIS-GOA outperforms standalone AI techniques and can make superior flood forecasts for all input scenarios.
A novel insight on prediction of flood flow is developed by hybridizing ANFIS-GOA.
Different input combinations of flood causative factors are analysed.
A comprehensive assessment and comparative analysis have been carried out.
Integrated artificial intelligence with GOA outperforms the standard AI methods.
The ANFIS-GOA model proves more reliable and improves the predictive precision of flood events.
Forecasting various hydrological phenomena is of significant concern in the field of hydrology and is pivotal for appropriate water resources development and disaster management. Every year, substantial public and financial damages, as well as fatalities, are caused by dangerous storms worldwide, specifically in areas subjected to monsoon weather and regions with slow growth of water conservancy schemes (Jiang et al. 2013; Wang et al. 2015; Yu et al. 2015). Flood prediction and forecasting are essential practices for controlling flood events across the globe (Young 2002; Campolo et al. 2003; Moore et al. 2005; Jiang et al. 2016; Rath et al. 2017; Panigrahi et al. 2018; Samantaray & Sahoo 2020). Producing reliable forecasts is a complex problem that has occupied researchers over past decades. Upstream conditions strongly influence flood flows in downstream zones; therefore, a flood forecasting model needs to be developed that can accurately capture the underlying relationship between upstream and downstream conditions. Artificial neural networks (ANNs) are suitable models for this problem. Over past decades, ANN and Adaptive Neuro-Fuzzy Inference System (ANFIS) models have been comprehensively utilized in a variety of engineering applications involving hydrology, such as simulating the rainfall-runoff process (Wu & Chau 2011), modelling groundwater problems (Sahoo et al. 2005; Taormina et al. 2012), forecasting streamflow (El-Shafie et al. 2007; Shu & Ouarda 2008) and modelling water quality (Singh et al. 2009; Yan et al. 2010). In the last decade, numerous developments have been made to improve both the performance and consistency of ANN tools. In recent times, attention has shifted from the applicability of ANN tools to refining their estimation capability and clarifying their inner behaviour (Maier & Dandy 2000; Sudheer & Jain 2004; Araghinejad 2013).
Chen et al. (2006) proposed the construction of a flood forecast model using ANFIS in the Choshui River, Taiwan, and compared its performance with a back-propagation neural network (BPNN). The results demonstrated that ANFIS was effective and reliable for constructing a flood forecasting model with better accuracy. Bisht & Jangid (2011) used ANFIS to develop river stage-discharge models at the Dhawalaishwaram barrage site in Andhra Pradesh, India. Based on the comparison of observed and estimated data, the outcomes revealed that ANFIS performed better in predicting river flow discharge than customary models. Rezaeianzadeh et al. (2014) used ANFIS, ANN, multiple linear regression (MLR), and multiple nonlinear regression to forecast the daily peak flow of the Khosrow Shirin catchment in the Fars region, Iran. The predictive capabilities of the models were evaluated, and ANFIS was observed to perform best in predicting daily flow discharge at the site with spatially distributed rainfall as input. Ghorbani et al. (2016) investigated the usability of two ANNs, namely the multilayer perceptron (MLP) and the radial basis function network (RBFN), and compared them with a support vector machine (SVM) for predicting monthly streamflow in the Zarrinehrud River, Iran. The results indicated that the SVM model was more reliable and consistent than MLP and RBFN in river flow prediction. Zhou et al. (2019) proposed a recurrent ANFIS embedded with a genetic algorithm (GA) and a least square estimator, which helped optimize the model parameters, to make multi-step-ahead flood forecasts for the Three Gorges Reservoir, China. The results demonstrated that the proposed model significantly improved the accuracy of flood forecasts.
Selection of model input (for example, pre-processing of data utilizing data-mining methods), model parameter optimization and post-processing of model output (for example, real-time correction, ensemble forecasts) are significant constituents of multi-step-ahead hydrological forecasts. Several climatological parameters have a significant impact on model performance when dealing with multi-step-ahead forecasts in real-world problems. When making multi-step-ahead flood forecasts, models that suffer from problems such as instability or overfitting will fail to track the flow closely, particularly during peak flows, as the forecast horizon increases. Therefore, an effective algorithm is necessary to determine an optimum network parameter setting for improving the reliability and stability of forecasting models.
Optimization algorithms such as differential evolution (DE), particle swarm optimization (PSO), genetic algorithm (GA) and GWO have been developed and integrated with data-driven models for forecasting various water resources and environmental problems (Senapati et al. 2007; Guo et al. 2014; Zhang et al. 2014; Prasad et al. 2017; Yaseen et al. 2017; Ewees & Elaziz 2018; Dehghani et al. 2019a, 2019b; Tikhamarine et al. 2020). Mirjalili et al. (2014) introduced a GWO algorithm to optimize an MLP network, showing superior performance compared to GA, PSO, Evolution Strategy (ES) and Ant Colony Optimization (ACO). Tikhamarine et al. (2020) proposed efficient hybrid models combining GWO with ANN, SVM, and MLR to improve precision in forecasting monthly streamflow at Aswan High Dam on the Nile River. The results revealed that the integrated techniques outperformed standard ANN, SVM, and MLR techniques and made improved forecasts of monthly inflow throughout the training and testing periods. However, GWO has a few disadvantages, such as low solving accuracy, unsatisfactory local search ability, and a slow convergence rate. More recently, a new nature-inspired meta-heuristic optimization algorithm called GOA was introduced by Saremi et al. (2017). GOA is based on the swarming behaviour of grasshoppers. Here it is used to improve ANFIS model performance, a combination that has not previously been explored for flood prediction. The algorithm is classified as a multi-solution algorithm that offers high accuracy and avoids local optima. It has proved to be a powerful algorithm for challenging problems with unknown search spaces (Saremi et al. 2017; Mirjalili et al. 2018).
GOA integrated with AI models has been applied in research across various fields of science and engineering: selective harmonic elimination in low-frequency voltage source inverters (Steczek et al. 2020); approximating flyrock distance in mine blasting (Fattahi & Hasanipanah 2021); estimating the parameters of photovoltaic modules on the basis of single diode models (Montano et al. 2020); neural assessment of the heating load (HL) of residential buildings (Moayedi et al. 2019); optimal deployment of wireless sensor networks (Deghbouch & Debbat 2021); prediction of pipe burst in urban water distribution systems (Alizadeh et al. 2019) and many more. GOA has also been applied specifically to modelling various hydrological parameters, such as evaluation of rainfall temporal variability (Farrokhi et al. 2020); monthly prediction of groundwater level (Seifi et al. 2020); forecasting short-term hydrological drought (Nabipour et al. 2020) and optimization of the non-linear Muskingum flood routing model (Khalifeh et al. 2020).
The present research utilizes robust ANFIS-GWO and ANFIS-GOA models for flood prediction in the Mahanadi river basin, India, and the outcomes achieved are assessed against conventional ANFIS and ANN models. Based on a literature survey, no research has been carried out on predicting flood events using the robust ANFIS-GOA technique. The novelty of this research is therefore the application of ANFIS-GOA to flood prediction. This study also carries out a sensitivity analysis of three different artificial intelligence tools for forecasting monthly flood water levels. In the present research, special attention is paid to model parameter optimization. This research also places emphasis on various input combinations for different scenarios, which have a strong impact on the desired model output.
The Mahanadi River (Figure 1) flows in central India, rising in the hills of the southeastern state of Chhattisgarh and mainly flowing through Odisha state. The Mahanadi has a total course of 858 km (494 km in Odisha) and an estimated drainage area of 141,600 km² (65,580 km² in Odisha), which is about 42% of the area of Odisha state. The Mahanadi is located at 20.11°N, 81.91°E. It is a significant river for the state of Odisha and is known for its devastating floods, which have caused much damage to life and property, as recorded in history. It originates south of Sihawa town in the Dhamtari district of Chhattisgarh and finally discharges into the Bay of Bengal at False Point in Jagatsinghpur, Odisha. In the present research, the Jondhra and Kesinga gauge stations of the Mahanadi river basin are selected for predicting flood events.
Mirjalili et al. (2014) proposed the GWO algorithm, a meta-heuristic optimization algorithm that mimics the social behaviour and hierarchy of grey wolves. In general, the wolf pack is divided into four categories: Alpha (α), Beta (β), Delta (δ) and Omega (ω). The alpha wolf is the most dominant wolf and is the leader of the pack. The level of domination decreases from α to ω, as presented in Figure 3. The mechanism involved in GWO is carried out by splitting the solution set into four groups for a specified optimization problem. The α, β and δ wolves are the three best solutions, whereas the remaining solutions fall in the group of ω wolves. To implement this mechanism, the hierarchy is updated in each iteration on the basis of the three best solutions. A representation of the updated position is shown in Figure 4. The main steps involved in GWO are to search for, encircle, hunt, and finally attack the prey.
The encircling and attacking of prey repeatedly continues until an optimal solution is achieved or it reaches maximum iterations.
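The encircle-and-attack loop described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the commonly published GWO update rules (a control parameter `a` decreasing linearly from 2 to 0, and random coefficients `A` and `C`); the function name `gwo_minimize`, the default pack size and the sphere test function are illustrative choices, not the authors' MATLAB implementation.

```python
import random


def gwo_minimize(objective, dim, lower, upper, n_wolves=20, max_iter=200, seed=0):
    """Minimise `objective` with a basic Grey Wolf Optimizer (illustrative sketch)."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_wolves)]
    leaders = []  # best-so-far (fitness, position) pairs: alpha, beta, delta
    for t in range(max_iter):
        # Re-rank the hierarchy: the three best solutions found so far lead the pack.
        scored = leaders + [(objective(w), list(w)) for w in wolves]
        scored.sort(key=lambda pair: pair[0])
        leaders = scored[:3]
        a = 2.0 * (1.0 - t / max_iter)  # assumed linear decrease from 2 towards 0
        for wolf in wolves:
            for d in range(dim):
                guided = 0.0
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a  # |A| > 1 explores, |A| < 1 attacks
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])  # encircling distance
                    guided += leader[d] - A * distance
                # Omega wolves move to the average of the three guided positions.
                wolf[d] = min(max(guided / 3.0, lower), upper)
    scored = leaders + [(objective(w), list(w)) for w in wolves]
    best_fit, best_pos = min(scored, key=lambda pair: pair[0])
    return best_pos, best_fit


def sphere(x):
    """Toy objective with its minimum at the origin (stand-in for a model error)."""
    return sum(v * v for v in x)
```

In a hybrid setting, `objective` would be replaced by the prediction error of the model being trained, with each wolf holding one candidate parameter vector.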
Pseudo-code of GOA
Initialize the swarm Xi (i = 1, 2, 3, …, n)
Initialize cmax, cmin, and the maximum number of iterations
Calculate the fitness of each search agent
T = the best search agent
while (l < maximum number of iterations)
    Update c using Equation (18)
    for each search agent
        Normalize the distances between grasshoppers in [1, 4]
        Update the position of the current search agent by Equation (17)
        Bring the current search agent back if it goes outside the boundaries
    end for
    Update T if there is a better solution
    l = l + 1
end while
Return T
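The pseudo-code can be turned into a short runnable sketch. The version below assumes the commonly published GOA formulation (Saremi et al. 2017): a social force s(r) = f·exp(−r/l) − exp(−r) with f = 0.5 and l = 1.5, a coefficient c that decreases linearly from cmax to cmin, and distances remapped with `2 + dist mod 2`, which is one common way of keeping them inside the normalisation interval. All function names, default values and the toy objective are illustrative and are not taken from the authors' code.

```python
import math
import random


def s_func(r, f=0.5, l=1.5):
    """Social force between two grasshoppers: attraction minus repulsion."""
    return f * math.exp(-r / l) - math.exp(-r)


def goa_minimize(objective, dim, lower, upper, n_agents=30, max_iter=200,
                 c_max=1.0, c_min=1e-5, seed=0):
    """Minimise `objective` with a simplified Grasshopper Optimization Algorithm."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_agents)]
    fits = [objective(x) for x in swarm]
    best = min(range(n_agents), key=fits.__getitem__)
    target, target_fit = list(swarm[best]), fits[best]  # T = best search agent
    for it in range(1, max_iter + 1):
        # Comfort-zone coefficient shrinks linearly with the iteration count.
        c = c_max - it * (c_max - c_min) / max_iter
        new_swarm = []
        for i in range(n_agents):
            social = [0.0] * dim
            for j in range(n_agents):
                if i == j:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(swarm[i], swarm[j])))
                r = 2.0 + dist % 2.0  # assumed normalisation of the distance
                strength = c * (upper - lower) / 2.0 * s_func(r)
                for d in range(dim):
                    unit = (swarm[j][d] - swarm[i][d]) / (dist + 1e-12)
                    social[d] += strength * unit
            # Move relative to the target and clip back inside the boundaries.
            new_swarm.append([min(max(c * social[d] + target[d], lower), upper)
                              for d in range(dim)])
        swarm = new_swarm
        for x in swarm:  # update T if a better solution has been found
            fx = objective(x)
            if fx < target_fit:
                target, target_fit = list(x), fx
    return target, target_fit


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)
```

Because the target is only replaced by strictly better solutions, the returned fitness never worsens from one iteration to the next.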
Proposed hybrid methodology
In the present study, the GOA and GWO algorithms were applied to find optimum parameter values and train ANFIS. For creating ANFIS-GOA and ANFIS-GWO, two codes were developed in MATLAB. In ANFIS-GOA and ANFIS-GWO, GOA and GWO help the hybrid approaches capture a closer relationship between input and output, so that the resulting techniques can estimate more precise outcomes for nonlinear problems. The major task in a hybrid modelling approach is the appropriate selection of GOA and GWO parameters. To find the best values of the GOA and GWO parameters, a trial and error approach based on extensive parametric studies was employed and optimum values were obtained. The GOA parameters are given in Table 1. Also, Figures 4 and 6 show the training process of ANFIS by the GOA and GWO algorithms.
|Parameter|Values|
|Maximum iterations number|1,000|
|Number of search agents|50|
||1 × 10⁻⁷|
|Attraction longitude scale|1.5|
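To illustrate how a meta-heuristic can train ANFIS, the sketch below packs all premise and consequent parameters of a small first-order Sugeno system into one flat vector `theta` and exposes the training RMSE as the fitness to be minimised. It assumes generalized bell membership functions and one membership function per input per rule; the layout of `theta` and the names `gbell`, `anfis_predict` and `rmse_fitness` are hypothetical and do not reproduce the authors' MATLAB codes.

```python
import math


def gbell(x, a, b, c):
    """Generalized bell membership: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def anfis_predict(theta, x, n_rules):
    """First-order Sugeno output for one sample x, given a flat parameter vector."""
    n_in = len(x)
    per_rule = 3 * n_in + n_in + 1  # (a, b, c) per input + linear consequent
    num = den = 0.0
    for r in range(n_rules):
        p = theta[r * per_rule:(r + 1) * per_rule]
        weight = 1.0
        for i in range(n_in):
            a, b, c = p[3 * i:3 * i + 3]
            # abs() keeps the width and slope valid for any candidate vector.
            weight *= gbell(x[i], abs(a) + 1e-6, abs(b), c)
        coef = p[3 * n_in:]
        rule_out = coef[-1] + sum(k * xi for k, xi in zip(coef, x))
        num += weight * rule_out
        den += weight
    return num / (den + 1e-12)  # firing-strength weighted average


def rmse_fitness(theta, X, y, n_rules):
    """Fitness handed to the optimiser: training RMSE of the candidate vector."""
    sq_err = [(anfis_predict(theta, xi, n_rules) - yi) ** 2 for xi, yi in zip(X, y)]
    return math.sqrt(sum(sq_err) / len(sq_err))
```

A search agent in GOA or GWO would then be one `theta` vector, and the optimiser would call something like `lambda th: rmse_fitness(th, X_train, y_train, n_rules)` as its objective. Bounds on `theta` should be kept moderate, since very large slope values can overflow the power term.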
Preparation of data set
Precipitation (P) data were obtained from CWC, Bhubaneswar, whereas temperature (T), solar radiation (Sr), humidity (H), evapotranspiration loss (El), absorption loss (Al) and percolation loss (Pl) data were collected from IMD, Pune, for the period 1970–2019. Data from 1970 to 2004 were used for training, and data from 2005 to 2019 for testing. The monthly data required for training and testing the models were derived from daily data. The following input combinations were applied:
Scenario 1: P, T, Sr, H
Scenario 2: P, T, Sr, H, El
Scenario 3: P, T, Sr, H, El, Al
Scenario 4: P, T, Sr, H, El, Al, Pl
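The data preparation described above can be sketched as follows. The example assumes that daily values are aggregated to monthly means (the aggregation rule is not stated in the text, and a monthly total may be more appropriate for precipitation) and that each monthly record is a dictionary keyed by the variable symbols; the function names are hypothetical.

```python
from collections import defaultdict

# Input combinations for the four scenarios, using the symbols defined in the text.
SCENARIOS = {
    1: ["P", "T", "Sr", "H"],
    2: ["P", "T", "Sr", "H", "El"],
    3: ["P", "T", "Sr", "H", "El", "Al"],
    4: ["P", "T", "Sr", "H", "El", "Al", "Pl"],
}


def daily_to_monthly(daily):
    """Aggregate (year, month, day, value) records into monthly means (assumed rule)."""
    buckets = defaultdict(list)
    for year, month, _day, value in daily:
        buckets[(year, month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}


def split_by_year(monthly_rows, last_train_year=2004):
    """Chronological split: 1970-2004 for training, 2005-2019 for testing."""
    train = [row for row in monthly_rows if row["year"] <= last_train_year]
    test = [row for row in monthly_rows if row["year"] > last_train_year]
    return train, test


def select_inputs(row, scenario):
    """Return the input vector of one monthly record for the given scenario."""
    return [row[name] for name in SCENARIOS[scenario]]
```

With 50 years of monthly records, this split yields 420 training months and 180 testing months.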
RESULTS AND DISCUSSION
The comparative performance of the four scenarios with ANFIS is assessed at the Jondhra and Kesinga stations (Supplementary Material, Tables S1 and S2). Five different membership functions are used for ANFIS to find the best model that can proficiently predict floods in the study area. The results show that all membership functions produced reasonable outcomes for flood prediction. For a fair comparison, various models and parameters affecting flood prediction are considered at both stations. The Gbell function gives the best value of the Willmott index for all four scenarios. Among the four scenarios, Scenario 4 attained the best WI values: 0.94453 and 0.94856 in the testing phase, and 0.96097 and 0.96474 in the training phase, at the Jondhra and Kesinga gauge stations, respectively.
The ANFIS-GWO results for the Jondhra and Kesinga stations are given in Supplementary Material, Tables S3 and S4. The Tri, Trap, Gbell, Gauss and Pi membership functions are considered when computing performance. The Gbell function gives the best model, with MSE of 0.00326 (training) and 0.00175 (testing), RMSE of 0.04997 (training) and 0.01012 (testing), and WI of 0.98139 (training) and 0.96001 (testing) in Scenario IV at Jondhra station. For Scenarios I, II, and III, the best WI values in the testing phase are 0.95288, 0.95589, and 0.95803, respectively. Similarly, at Kesinga gauge station the best WI values in the testing phase are 0.95406, 0.95678, 0.95889, and 0.96208 for Scenarios I, II, III and IV, respectively.
Correspondingly, a hybrid ANFIS-GOA algorithm is proposed for model development at the Jondhra and Kesinga stations (Supplementary Material, Tables S5 and S6). The results reveal that the best values of WI are 0.99158, 0.98993, 0.98806, 0.98638 during the training phase for Scenarios I, II, III, IV at Jondhra basin, respectively. Similarly, for Kesinga basin the best values of WI are 0.99263, 0.99001, 0.98814, 0.98654 for Scenarios I, II, III, IV as input criteria, respectively. For both stations, Scenario IV showed better performance than the other three scenarios.
The above results show that adding evapotranspiration loss to the standard climatological indices (Scenario II) yields higher accuracy than Scenario I. Likewise, including absorption loss (Scenario III) gives better results than Scenario II. Overall, Scenario IV, which differs from Scenario III only by the inclusion of percolation loss, performs best among the four scenarios. Therefore, losses are key inputs for flood forecasting. In terms of models, the new hybrid ANFIS-GOA model performed better than the ANFIS and ANFIS-GWO models and was the best among all models considered. It has a lower coefficient of variation compared to other models and optimization algorithms. The major contribution of this research is the assessment of the flood prediction potential of hybrid models based on parametric effects owing to the regulatory weights and parameters formed in the training phase of the models.
Assessment of outcome for different models
Appraisals of the ANFIS, ANFIS-GWO and ANFIS-GOA models for the training and testing periods at both gauge stations are presented in Figures 7 and 8. The highest WI values for the ANFIS, ANFIS-GWO and ANFIS-GOA models were 0.96097, 0.98139, and 0.99158, respectively, for Jondhra station. Similarly, for Kesinga station the highest WI values were 0.96474, 0.98216, and 0.99263 for the ANFIS, ANFIS-GWO and ANFIS-GOA models, respectively. A detailed graphical representation of actual versus predicted flood with respect to WI for the different gauge stations is given in Figure 7.
Assessment of actual flood versus simulated flood at Jondhra and Kesinga
Figures 9 and 10 show the variation of actual versus simulated flood. The projected peak floods are 5,241, 5,372 and 5,442 m³/s for ANFIS, ANFIS-GWO and ANFIS-GOA, respectively, against the observed peak of 5,568 m³/s at the Jondhra gauge site. For Kesinga, the projected peak floods are 5,036, 5,145 and 5,199 m³/s for ANFIS, ANFIS-GWO and ANFIS-GOA, respectively, against the observed peak of 5,308 m³/s, as presented in Figure 10. ANFIS-GOA therefore reproduces the observed peak most closely at both stations, which is beneficial for flash flood regions with a predictive flood index.
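The peak values quoted above can be converted into relative underestimation of the observed peak with a few lines of arithmetic. The snippet below uses only the figures reported in the text; the dictionary layout and function names are illustrative.

```python
# Observed and simulated peak floods (m3/s) as reported for the two gauge sites.
OBSERVED_PEAK = {"Jondhra": 5568.0, "Kesinga": 5308.0}
SIMULATED_PEAK = {
    "Jondhra": {"ANFIS": 5241.0, "ANFIS-GWO": 5372.0, "ANFIS-GOA": 5442.0},
    "Kesinga": {"ANFIS": 5036.0, "ANFIS-GWO": 5145.0, "ANFIS-GOA": 5199.0},
}


def peak_underestimation_pct(observed, simulated):
    """Percentage by which the simulated peak falls short of the observed peak."""
    return 100.0 * (observed - simulated) / observed


def peak_error_table():
    """Relative peak error per station and model, rounded to two decimals."""
    return {
        station: {
            model: round(peak_underestimation_pct(OBSERVED_PEAK[station], value), 2)
            for model, value in models.items()
        }
        for station, models in SIMULATED_PEAK.items()
    }
```

Under this measure, the peak is underestimated by about 5.87%, 3.52% and 2.26% at Jondhra and by about 5.12%, 3.07% and 2.05% at Kesinga for ANFIS, ANFIS-GWO and ANFIS-GOA, respectively.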
Figure 11 shows a box-plot of observed and simulated flood values from 1970 to 2019 at the Kesinga and Jondhra gauge stations. Comparison of the ANFIS, ANFIS-GWO and ANFIS-GOA techniques with observed values reveals that the ANFIS-GOA method can estimate high floods well. Moreover, the box area is smallest for ANFIS, which confirms the lower accuracy of ANFIS in comparison to the other models.
Histograms showing the ratio of simulated to observed flood values for the ANFIS, ANFIS-GWO and ANFIS-GOA models are presented to assess the frequency of data points in a number of selected error bins. Here, the total number of months binned on the x-axis is analysed to check the probability of occurrence for any given time series. A close comparison of simulated and observed floods for ANFIS-GOA and the related models is shown in Figures 12 and 13, which represent the probability distribution of the data. These plots are necessary for representing the probability of occurrence of a specified flood value within a particular interval. On this measure of model precision, it is apparent that the probability distribution of flood values predicted by the ANFIS-GOA model was very near to that of the observed flood values for most intervals.
Comparison of model performance
MSE, RMSE and WI indicators are used for evaluating the performance of the ANFIS, ANFIS-GWO and ANFIS-GOA models at the two gauge stations. The performance indicators are listed in Table 2, illustrating the efficiency of each model. Evaluating floods is very significant, and hence the methods applied in the present study are important for providing flood prediction information. Therefore, calculation of RMSE, WI, and MSE values is vital for assessing flood predictions. It is apparent that the ANFIS-GOA model performed well compared to ANFIS and ANFIS-GWO for all four scenarios.
|Gauge station|Techniques|MSE|
|Training|Testing|Training|Testing|Training|Testing|
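For reference, the three performance indicators can be computed as sketched below. The Willmott index is written in its standard index-of-agreement form, WI = 1 − Σ(O − P)² / Σ(|P − Ō| + |O − Ō|)², which is assumed here to be the definition used in this study; the function names are illustrative.

```python
import math


def mse(observed, simulated):
    """Mean squared error between observed and simulated series."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated)) / len(observed)


def rmse(observed, simulated):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(observed, simulated))


def willmott_index(observed, simulated):
    """Willmott index of agreement (1 indicates perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((abs(s - mean_obs) + abs(o - mean_obs)) ** 2
              for o, s in zip(observed, simulated))
    return 1.0 - num / den
```

Lower MSE and RMSE and a WI closer to 1 indicate a better match between simulated and observed floods. By definition, RMSE is the square root of MSE when both are computed on the same data.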
The obtained results clearly show the advantages of GOA in solving real-world problems with unknown search spaces. The success of GOA is due to several reasons. In the initial steps of optimization, the exploration capability of GOA is high because of the large repulsion rate between grasshoppers. This helps GOA explore the search space broadly and discover its promising areas. Then, in the final steps of optimization, exploitation is high because of stronger attraction forces between grasshoppers. This behaviour results in a local search and improves the accuracy of the solutions found in the exploration stage. GOA efficiently balances exploration and exploitation, focusing first on avoiding local optima and then on convergence. The adaptive comfort zone coefficient is the reason behind this behaviour. The steady decrease of this coefficient, in proportion to the number of iterations, brings grasshoppers closer to the target. Finally, the target chasing mechanism requires GOA to save the best solution attained so far as the target and to drive grasshoppers towards it in the hope of improving its accuracy or finding a superior one in the search space. In view of the simulations, results, discussion and analysis of this research, we believe that GOA is capable of solving several optimization problems efficiently. It treats a given optimization problem as a black box and thus does not require gradient information about the search space. Hence, it can be applied to any optimization problem in various fields, conditional on appropriate problem formulation.
The above results indicate that the performance of ANFIS-GOA in terms of RMSE and WI values is superior to that reported for hybrid ANN with KNN (Kan et al. 2020); ANFIS-GA, ANFIS-PSO, ANFIS-ACO (Azad et al. 2018); hybrid deep learning ConvLSTM (Moishin et al. 2021); FSF-ARIMA (Banihabib et al. 2020) and ANFIS (Rezaeianzadeh et al. 2014).
In this research, the sensitivity of flood prediction to precipitation (P), temperature (T), solar radiation (Sr), humidity (H), evapotranspiration loss (El), absorption loss (Al) and percolation loss (Pl) is discussed for different machine learning approaches. First, we developed a model using precipitation, temperature, solar radiation and humidity and evaluated its efficacy at the two stations (Scenario 1). Second, evapotranspiration loss was added to the previous model, which gave better results than Scenario 1. Similarly, including absorption loss (Al) with the Scenario 2 inputs gave better performance than Scenario 2 for all techniques. When percolation loss (Pl) was added as an input (Scenario 4) to the previous arrangement (Scenario 3), the models gave the best performance for all five machine learning approaches during both training and testing phases. Moreover, we found that all three losses (evapotranspiration loss, absorption loss, percolation loss) have a sensitive effect on flood prediction for all five proposed machine learning algorithms. It was also found that the ANFIS-GOA model responds more sensitively to these inputs than the other proposed machine learning approaches. As the study area lies within a flood-prone region, the development of such models will aid in assessing flood discharge. These results suggest the most appropriate methods for estimating floods at the stations of any flood-prone region. However, further combinations of techniques need to be examined to improve conjoint modelling in the future.
Limitations and future scope
A major disadvantage of applying hybrid machine learning algorithms is that the training time of the model increases after hybridization of machine learning and meta-heuristic algorithms, particularly when dealing with complex problems. Also, they are classifier-specific methods and depend on a combination of different feature selection methods. Taking into consideration the advantages and disadvantages of various algorithms, combining the search strategies of different algorithms to generate a novel algorithm is an active research topic. The prediction performance of data-driven models is subject to the quality and quantity of data. This study was conducted at a specific location (the Jondhra and Kesinga gauge stations of the Mahanadi river basin). The scope of the present study can be extended by applying ML models to other geographical locations. The selection of the best input combinations for a particular model may vary with changes in default model operators, so assessing the selection of best input combinations using different approaches could be an interesting subject for future research. Moreover, not all rules in the model architecture are vital; hence, it is essential to reduce the complexity of trained models by removing non-contributing rules, leading to a decrease in the computational cost of the network. To improve the proposed methods, more state-of-the-art AI methods, for example probabilistic and ensemble forecasting methods, could be combined with data-driven models to reduce uncertainties in multi-step-ahead flood forecasts in future research.
This study investigates the potential of new hybrid models combining the GOA and GWO algorithms with the ANFIS model (ANFIS-GOA and ANFIS-GWO) for prediction of flood events. To achieve this objective, the Jondhra and Kesinga stations located on the Mahanadi River were chosen as the case study. GOA and GWO were applied to optimize the parameters of ANFIS and were then compared with simple ANN and ANFIS models. The outcomes of the ANFIS, ANFIS-GWO and ANFIS-GOA models were compared and assessed on the basis of their training and testing performance indicators.
The results revealed that integration of AI models with optimization algorithms proved to be more effective and accurate compared to customary AI.
Comparing the results of the ANFIS, ANFIS-GWO and ANFIS-GOA models, it could be inferred that the MSE and RMSE values of the ANFIS-GOA model were lower than those of the ANFIS-GWO and standalone AI models, while its WI value was higher than those of the other applied models. Hence, it was observed that the ANFIS-GOA and ANFIS-GWO models improved the prediction accuracy of the conventional ANFIS model, with the ANFIS-GOA model performing slightly better than ANFIS-GWO for all input combinations.
It appears that applying a meta-heuristic optimization algorithm in the training period of the ANFIS technique can reduce the shortcomings of conventional optimization algorithms and lead to more consistent outcomes. Also, the results showed that climatic losses sensitively affect flood prediction.
Additionally, in certain case studies it is essential to develop a real-time forecasting model, and therefore the time required to execute the model should be short. Hence, the application of more recently developed search algorithms is necessary for achieving fast convergence as well.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.