## Abstract

In this study, rainfall–runoff (R–R) models were developed by coupling Particle Swarm Optimization (PSO) with the Feed Forward Neural Network (FFNN) and the Wavelet Neural Network (WNN). The performance of these hybrid models was compared with that of the standalone WNN and the conventional FFNN. The data from 1981 to 2005 were used for calibration and the data from 2006 to 2014 for validation of the models. Different combinations of rainfall and runoff were considered as inputs to the PSO–FFNN model, and the fitness value and computational time of each combination were computed. The input combination with the lowest fitness value and lowest computational time was selected. Four R–R models (FFNN, WNN, PSO–FFNN and PSO–WNN) were then developed with the best input combination. The performance of the models was evaluated using statistical parameters (Nash–Sutcliffe Efficiency (*NSE*), index of agreement (*D*) and root mean square error (*RMSE*)), which varied in the ranges of 0.86–0.90, 0.95–0.97 and 68.87–84.37, respectively. After comparing the performance parameters and computational time of all four models, it was found that the PSO–FFNN model gave better values of *NSE* (0.89), *D* (0.97) and *RMSE* (68.87) and less computational time (125.42 s) than the other models. Thus, the PSO–FFNN model outperformed the other three models (FFNN, WNN and PSO–WNN).

## HIGHLIGHTS

The goal of this research is the selection of the input combination based on fitness value and computational time.

Four R-R models (FFNN, WNN, PSO-FFNN and PSO-WNN) were developed with the best input combination, and their performance was evaluated using statistical parameters (NSE, D and RMSE).

## INTRODUCTION

The rainfall–runoff process is non-linear, spatially distributed and time-varying; hence, it cannot be easily described by simple models. Considerable research effort in hydrology during the past few decades has been devoted to the development of computer-based models of rainfall–runoff processes. A rainfall–runoff model is used to simulate the runoff of a catchment for a given rainfall input (Kumar *et al.* 2018; Kumar *et al.* 2020). The estimation of catchment runoff is required for assessing flood peaks and water availability for agricultural, industrial, municipal and irrigation purposes, wildlife protection and many more applications (Satheeshkumar *et al.* 2017). Several types of R–R model (black box models, conceptual models and distributed models) have been developed over the years. Black box models focus on the transfer function, which connects inputs and outputs without considering the physical relationship between them. Distributed models require large computer resources and a large amount of data for successful implementation as compared to lumped models (Brirhet & Benaabidate 2016). Conceptual models require heavy computation for calibrating the parameters involved.

Black box models have advantages such as simple mathematics, minimal computational requirements and satisfactory results. Neural network models can find relationships among input samples and can group samples in a manner similar to cluster analysis. Neural networks have been applied in many areas of water resources such as rainfall–runoff modeling, stream flow forecasting, groundwater modeling, etc. Neural network models provide better results when compared with the conceptual SAC-SMA (Sacramento Soil Moisture Accounting) model (Hsu *et al.* 1995), autoregressive models (Raman & Sunilkumar 1995), ARMAX models (Fernando & Jayawardena 1998), multiple regression models (Thirumalaiah & Deo 2000), linear and non-linear regressive models (Elshorbagy *et al.* 2000) and conceptual models (Salas 1993). Asadnia *et al.* (2014), Kalteh (2008), Kumar *et al.* (2008), Nourani *et al.* (2014), Nourani *et al.* (2011), Solaimani (2009) and Sudheer *et al.* (2002) used neural network models for rainfall–runoff studies.

However, the conventional neural network model has drawbacks: the performance of the network depends on the initial weights, and reaching the global optimum is not assured. To overcome these limitations, it is essential to develop an efficient method to optimize the neural network. Optimization techniques have been successfully employed for this purpose in recent investigations. Nourani *et al.* (2011) used a hybrid wavelet genetic programming (WGP) approach to optimize neural networks and found that the results of hybrid models are more satisfactory. Daneshmand *et al.* (2014), De Paola *et al.* (2016) and Heydari *et al.* (2016) also used optimization techniques with satisfactory results. In this work, the swarm intelligence-based Particle Swarm Optimization (PSO) technique is used to optimize the neural network, with the aim of developing a hybrid network that optimizes the weights in minimum computational time. PSO involves no crossover or mutation calculation; the search is driven by particle velocities, and during optimization the best-performing particle transfers information to the other particles. The speed of re-searching the optimum value is very fast in PSO. Thus, this study presents the implementation of PSO with FFNN and WNN for rainfall–runoff modeling.

A few research works have been carried out in water resources using PSO (Jha *et al.* 2009; Khajeh *et al.* 2013). However, the authors could not find the application of PSO in rainfall–runoff modeling. Through this paper, the authors attempted to develop hybrid models with PSO to find the applicability of PSO in rainfall–runoff modeling. Hence, the main objective of this paper is to develop and compare the performance of hybrid models (PSO–FFNN, PSO–WNN and WNN) with the conventional neural network (FFNN). Furthermore, the fitness value and computational time of the four models have been computed and compared, which shows the efficiency of the developed models.

## STUDY AREA AND DATA USED

Of the total area of the Bagmati river basin, 45% falls in the Bihar region of India and the rest lies in Nepal. The river flows nearly 195 km in Nepal and the remaining 394 km in Bihar. The average annual rainfall of the Bagmati river basin, including Adhwara, is 1,255 mm. The land in the study area is mainly used for horticultural and agricultural purposes; 20% of the area is under non-agricultural uses such as roads, railways, waterbodies and buildings. No forest cover is present in the study area. The climatic condition of the basin varies with its intrinsic topography: temperature generally decreases with altitude and is high in summer and low in winter (Shrestha & Sthapit 2016). The selected rain gauge stations (Benibad, Dheng Bridge and Kamtaul) of the Bagmati river basin in Bihar (India) are shown in Figure 1.

For this study, 34 years of monthly rainfall data (1981 to 2014) at three rain gauge stations in the Bagmati river basin have been used. These data were collected from the Indian Meteorological Department (IMD), Pune. Monthly runoff data at the Hayaghat gauging site from 1981 to 2014 were collected from the Central Water Commission (CWC), Patna. Average rainfall over the basin has been computed using the Thiessen polygon method from the data of the three rain gauge stations (Dheng Bridge, Kamtaul and Benibad).
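The Thiessen polygon average is simply an area-weighted mean of the gauge records. A minimal sketch follows; the area fractions used here are hypothetical placeholders, not the actual polygon weights for these gauges.

```python
import numpy as np

def thiessen_average(rainfall, weights):
    """Area-weighted basin rainfall from point gauges (Thiessen method).

    rainfall: array of shape (n_months, n_gauges)
    weights:  Thiessen polygon area fractions, one per gauge (sum to 1).
    """
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("Thiessen weights must sum to 1")
    return np.asarray(rainfall, dtype=float) @ weights

# Hypothetical area fractions for Dheng Bridge, Kamtaul and Benibad
weights = [0.30, 0.35, 0.35]
monthly = np.array([[120.0, 95.0, 110.0],   # two example months (mm)
                    [200.0, 180.0, 210.0]])
basin_rain = thiessen_average(monthly, weights)
```

In practice the weights come from the polygon areas drawn around each gauge, divided by the total basin area.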

## METHODOLOGY

### Selection of input combination

Four combinations are shown in Table 1. The selection of the input combination was based on the computational time and fitness value obtained with the PSO–FFNN model; the combination with the lowest computational time and fitness value was selected for the development of the models.

| Combination | Inputs |
|---|---|
| Combination 1 | Rainfall (t − 1), runoff (t − 1), rainfall (t) |
| Combination 2 | Rainfall (t − 1), rainfall (t) |
| Combination 3 | Rainfall (t − 2), runoff (t − 2), rainfall (t − 1), runoff (t − 1), rainfall (t) |
| Combination 4 | Rainfall (t − 2), runoff (t − 2), rainfall (t − 1), rainfall (t) |

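The lagged input matrices above are built by aligning shifted copies of the rainfall and runoff series. A sketch for combination 4 follows; the helper name and toy series are illustrative only.

```python
import numpy as np

def lagged_inputs(rainfall, runoff):
    """Build the combination-4 input matrix:
    [rainfall(t-2), runoff(t-2), rainfall(t-1), rainfall(t)] -> runoff(t).
    The first two months are dropped so every row has both lags."""
    r = np.asarray(rainfall, dtype=float)
    q = np.asarray(runoff, dtype=float)
    X = np.column_stack([r[:-2], q[:-2], r[1:-1], r[2:]])
    y = q[2:]  # current-month runoff target
    return X, y

# Toy monthly series, for illustration only
rain = np.array([10., 20., 30., 40., 50.])
flow = np.array([1., 2., 3., 4., 5.])
X, y = lagged_inputs(rain, flow)
```

The other combinations differ only in which shifted columns are stacked.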

### Development of models

The selected input combination is used to develop the four models (FFNN, WNN, PSO–FFNN and PSO–WNN). The monthly rainfall and runoff data have been divided into two parts: 70% of the data (i.e., from 1981 to 2005) have been used for calibration and the remaining 30% (i.e., from 2006 to 2014) for validation of the models.
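Because the data are a time series, the split is chronological rather than random. A minimal sketch (function name is illustrative):

```python
import numpy as np

def chronological_split(X, y, cal_fraction=0.70):
    """Split series data in time order: the first 70% for calibration
    (here 1981-2005) and the remainder for validation (2006-2014)."""
    n_cal = int(round(len(y) * cal_fraction))
    return (X[:n_cal], y[:n_cal]), (X[n_cal:], y[n_cal:])
```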

#### Development of a Feed Forward Neural Network

A Feed Forward Neural Network (FFNN) was used for the development of the FFNN model. The network was trained with the back-propagation algorithm based on the generalized delta rule proposed by Rumelhart *et al.* (1986). In this algorithm, the connection weight between nodes is adjusted according to the strength of the signal in the connection and the total measure of the error.

The model error was computed by subtracting the model output from the observed output. This error was then propagated backward through the network, and the connection weights were adjusted in this step. The output was recomputed from the FFNN model using the adjusted weights. Back propagation was continued for a given number of cycles or until a prescribed error tolerance was reached. As noted by Dawson & Wilby (1998), different transfer functions and internal parameters should be considered to make the network learning more generalized. The best-fit structure of the ANN model was determined in the training process.
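The training loop described above can be sketched as follows. This is a minimal illustration, assuming one hidden layer with sigmoid activations and batch updates; the paper does not specify the exact network internals, so the parameter names and defaults here are illustrative.

```python
import numpy as np

def train_ffnn(X, y, hidden=5, lr=0.1, cycles=1000, tol=0.01, seed=0):
    """One-hidden-layer FFNN trained with the generalized delta rule
    (backpropagation). X: (n, d) inputs; y: (n,) targets scaled to [0, 1]."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    t = y.reshape(-1, 1)
    for _ in range(cycles):
        h = sig(X @ W1)                  # forward pass: hidden activations
        out = sig(h @ W2)                # forward pass: network output
        err = out - t                    # model error (output minus target)
        if np.mean(err ** 2) < tol:      # prescribed error tolerance reached
            break
        d_out = err * out * (1 - out)            # output-layer delta
        d_hid = (d_out @ W2.T) * h * (1 - h)     # error redistributed backward
        W2 -= lr * h.T @ d_out                   # adjust connection weights
        W1 -= lr * X.T @ d_hid
    return W1, W2

def predict(X, W1, W2):
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    return sig(sig(X @ W1) @ W2).ravel()
```

For the denormalization step mentioned above, the predicted values in [0, 1] would be rescaled back to runoff units with the inverse of whatever normalization was applied to the targets.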

#### Development of a Wavelet Neural Network

The wavelet family is generated from a single mother wavelet by translation and dilation (Equation (4)):

*Ψ*_{t,λ}(*u*) = (1/√*λ*) *Ψ*((*u* − *t*)/*λ*), *λ* > 0 (4)

where *u* is the input wavelet signal, *Ψ*(.) represents the mother wavelet, *t* indicates the translation parameter (*t* is finite) and *λ* is the dilation parameter, with dilation *λ* > 0. The right side of Equation (4) is normalized so that the wavelet norm is the same for all *λ* and *t*.

The selection of the mother wavelet depends on the signal to be analyzed. The Morlet and Daubechies wavelet transforms are the most commonly used mother wavelets (Shoaib *et al.* 2014). The Daubechies wavelet shows a good balance between parsimony and data abundance; it responds similarly to comparable events over the observed time sequence and can capture patterns that most forecasting models cannot distinguish well.

Input signals are decomposed using the one-dimensional discrete Daubechies wavelet transform, up to the second level. The decomposition level is selected by [log_{10}(*N*)], where *N* is the total number of observations. Input variables have been decomposed into detailed signals and approximate signals up to the second level. The Minmax threshold is used to denoise the decomposed signals, and these denoised signals are then used for the WNN.
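The decompose-then-threshold step can be sketched as below. This is a simplified stand-in: it uses the Haar wavelet (Daubechies db1) rather than a higher-order Daubechies filter, and soft thresholding in place of the Minmax rule; both substitutions are made only to keep the example self-contained.

```python
import numpy as np

def haar_step(x):
    """One level of the Haar (Daubechies db1) DWT: returns the
    (approximation, detail) coefficients at half resolution."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # pad odd-length signals
        x = np.append(x, x[-1])
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def decompose_denoise(signal, levels=2, thresh=0.5):
    """Two-level decomposition with soft thresholding of the detail
    coefficients (a stand-in for the paper's Minmax denoising)."""
    a, details = np.asarray(signal, dtype=float), []
    for _ in range(levels):
        a, d = haar_step(a)
        d = np.sign(d) * np.maximum(np.abs(d) - thresh, 0.0)  # soft threshold
        details.append(d)
    return a, details
```

With *N* ≈ 408 monthly observations, [log_{10}(*N*)] ≈ 2, consistent with the two-level decomposition used here.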

#### Particle Swarm Optimization

PSO is a relatively new population-based optimization technique. Each particle in PSO has its own velocity vector and position vector (Liu *et al.* 2008), which together represent a candidate solution to the problem. In this application, a particle's position encodes a set of network connection weights assigned during calibration, and its velocity the step by which that position is updated in each iteration. Each particle stores its own best position and the best global position obtained through interaction with its neighboring particles. The PSO algorithm guides its search by adjusting particle positions and velocities, and the movement of the particles is determined by the objective function: particles near the optimal solution move with lower velocity, while particles far from it move with higher velocity.

Many researchers worked on optimization techniques such as Gravitational Search Algorithm, Cuckoo Search Algorithm, Krill Herd and PSO with Genetic Algorithm to solve problems such as aircraft landing, image processing, etc. (Cui *et al.* 2008; Liu *et al.* 2008; Agrawal & Kaur 2014; Tayarani *et al.* 2015; Chang *et al.* 2016; Girish 2016; Kakkar & Jain 2016; Wang *et al.* 2016).

The fitness of each particle is evaluated as the mean squared error over the calibration patterns (Equation (5)):

fitness = (1/(*q* × *p*)) Σ_{j=1}^{q} Σ_{i=1}^{p} (*y*_{i} − *d*_{i})^{2} (5)

where *y*_{i} is the output obtained from the models, *d*_{i} is the target value, *q* is the number of patterns used for calibration and *p* is the number of nodes in the output layer.

Both models are calibrated by minimizing the fitness value over the search space. PSO generates candidate solutions and measures their quality by forward propagation through the model network, which yields the fitness value. The following steps are involved in the PSO-based training algorithm:

1. Decide the structure of the network and the parameters of PSO.
2. Initialize the positions and velocities of the population. The position of each particle consists of the network connection weights.
3. Using a group of calibration data, calculate the fitness value of each particle with Equation (5). Initialize the individual best positions and the global best position.
4. Update the position and velocity of every particle in the swarm.
5. If the stopping condition of the algorithm is not satisfied, return to step 3; otherwise, terminate the iteration and take the best-optimized weights from the global best solution.
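The steps above can be sketched as a generic PSO minimizer. This is a minimal illustration with standard (assumed, not paper-specified) inertia and acceleration coefficients; in the hybrid models the `fitness` callback would run forward propagation through the network with the candidate weight vector.

```python
import numpy as np

def pso_minimize(fitness, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO: each particle's position is a candidate weight vector;
    velocities are pulled toward the personal best (pbest) and the swarm's
    global best (gbest)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_particles, dim))   # step 2: init positions
    vel = np.zeros((n_particles, dim))             #         and velocities
    pbest = pos.copy()
    pbest_f = np.array([fitness(p) for p in pos])  # step 3: evaluate fitness
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):                         # step 5: loop to step 3
        r1, r2 = rng.random((2, n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel                            # step 4: update the swarm
        f = np.array([fitness(p) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, float(pbest_f.min())

# Toy check: minimize the sphere function, whose optimum is the origin
best, best_f = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=4)
```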

#### PSO–FFNN and PSO–WNN networks

### Performance evaluation of models

The root mean square error (*RMSE*) measures the differences between predicted and observed values. The correlation coefficient lies between −1 and +1 and quantifies the strength and direction of the statistical relationship between two or more variables. The Nash–Sutcliffe Efficiency (*NSE*) quantifies how well a model simulation can predict the outcome variable and indicates the predictive power of a model. *NSE* varies from −∞ to 1: *NSE* = 0 indicates that the model predictions are as accurate as the mean of the observed data, −∞ < *NSE* < 0 indicates that the observed mean is a better predictor than the model, and *NSE* = 1 corresponds to a perfect match of the model to the observed data.
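The metrics above are straightforward to compute. A sketch follows; note the paper does not define *D* explicitly, so Willmott's index of agreement is assumed here.

```python
import numpy as np

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the
    observed mean, negative values do worse than the mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1 - np.sum((obs - sim) ** 2)
                 / np.sum((obs - obs.mean()) ** 2))

def index_of_agreement(obs, sim):
    """Willmott's index of agreement D (assumed definition of the
    paper's D), bounded on (0, 1]."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    denom = np.sum((np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return float(1 - np.sum((obs - sim) ** 2) / denom)
```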

## RESULTS AND DISCUSSION

### Selection of input combination

Computational time is one of the most important parameters in developing soft computing models. The computational time of combination 4, 135.5 s, was lower than that of the other three combinations. As a result, combination 4 was the most suitable for the development of the models.

### Calibration and validation of models

The selected combination has been used for the development of the four models (FFNN, WNN, PSO–FFNN and PSO–WNN) in the Bagmati river basin. The monthly rainfall and runoff of the basin from 1981 to 2014 have been used for calibration and validation: all four models were calibrated with data from 1981 to 2005 and validated with data from 2006 to 2014.

The selected combination was initially trained with the error tolerance, number of learning cycles, learning parameter and number of hidden-layer neurons set to 0.01, 1,000, 0.1 and 1, respectively. The output values from the network were denormalized and compared with the observed target values, and the performance criteria *NSE*, *RMSE* and *D* were used to examine the performance of the model. In training this initial configuration, the number of cycles was increased in steps up to 2,000 and convergence was found not to be stable; the *RMSE* between targeted and expected values was very high and the coefficient of correlation was low. The network was then retrained with increased or decreased values of the error tolerance and varied values of the learning parameter, which were fixed at the values giving low *RMSE* and a high coefficient of correlation. The number of neurons in the hidden layer was increased from the minimum until the coefficient of correlation reached its highest value, and the number of neurons giving the highest *NSE* value was selected for this combination. Convergence was achieved with the error tolerance, learning parameter, number of cycles and number of hidden-layer neurons as 0.1, 0.9, 1,000 and 5, respectively. The *RMSE* was 68.87 and the *NSE* was 0.90 for the PSO–FFNN model. To obtain the optimized structure of the hybrid model network, input combination 4 was used for the calibration and validation of the models. The best convergence was achieved for this combination by optimizing the weights of the network and its internal parameters (error tolerance, learning parameter, number of hidden layers, etc.).

### Performance analysis of the models developed

For the PSO–FFNN model, *R* and *RMSE* were 0.95 and 68.87 during calibration and 0.94 and 69.15 during validation, respectively. The computed statistical parameters of all the models are presented in Table 2.

| Statistical parameter | FFNN (cal.) | WNN (cal.) | PSO–FFNN (cal.) | PSO–WNN (cal.) | FFNN (val.) | WNN (val.) | PSO–FFNN (val.) | PSO–WNN (val.) |
|---|---|---|---|---|---|---|---|---|
| RMSE | 80.22 | 75.92 | 68.87 | 73.71 | 84.35 | 77.21 | 69.15 | 74.55 |
| NSE | 0.87 | 0.88 | 0.90 | 0.89 | 0.86 | 0.87 | 0.89 | 0.88 |
| D | 0.96 | 0.96 | 0.97 | 0.97 | 0.95 | 0.95 | 0.97 | 0.96 |
| R | 0.93 | 0.94 | 0.95 | 0.94 | 0.92 | 0.93 | 0.94 | 0.93 |
| R^{2} | 0.87 | 0.89 | 0.90 | 0.89 | 0.86 | 0.88 | 0.89 | 0.88 |


The *RMSE*, *NSE*, *D*, *R* and *R*^{2} of PSO–FFNN are 68.87, 0.90, 0.97, 0.95 and 0.90, respectively, during calibration, and 69.15, 0.89, 0.97, 0.94 and 0.89, respectively, during validation. These performance parameters indicate that the PSO–FFNN model gives better results than the other three models (FFNN, WNN and PSO–WNN). The optimization of the error of all four models is presented in Figure 6; the PSO–FFNN model gives a lower error than the other three models. The computation times of the four models were also calculated and are presented in Figure 7: PSO–FFNN, PSO–WNN, WNN and FFNN took 125.42, 135.92, 165.24 and 181.28 s, respectively. Thus, the PSO–FFNN model requires less computation time than the other three models. Based on these results, it is evident that the PSO–FFNN model is better than the other models developed.

## DISCUSSION

Four models have been developed, i.e., FFNN, WNN, PSO–FFNN and PSO–WNN. The performance parameters *RMSE*, *NSE*, *D*, *R* and *R*^{2} of PSO–FFNN were 68.87, 0.90, 0.97, 0.95 and 0.90, respectively, during calibration and 69.15, 0.89, 0.97, 0.943 and 0.895, respectively, during validation. The PSO–FFNN model shows a 4–12% decrease in *RMSE* and a 1–2% increase in *NSE*, *D*, *R* and *R*^{2} relative to the other three developed models (FFNN, WNN and PSO–WNN), giving the best results among them. These results are in accordance with those of Mazandaranizadeh & Motahari (2017).

## CONCLUSIONS

In this paper, the application of PSO with neural networks for rainfall–runoff modeling of the Bagmati river basin was studied. Four models (FFNN, WNN, PSO–FFNN and PSO–WNN) were developed using the monthly rainfall and runoff of the basin from 1981 to 2014. Three rain gauge stations (Benibad, Dheng Bridge and Kamtaul) were selected and the average rainfall was calculated using the Thiessen polygon method. The data from 1981 to 2005 were used for calibration and the data from 2006 to 2014 for validation of the developed models. The input combination was selected using the PSO–FFNN model, with the fitness value and computational time computed for all combinations. Combination 4 (rainfall (*t* − 2), runoff (*t* − 2), rainfall (*t* − 1), rainfall (*t*)) showed the lowest fitness value (4,742) and the lowest computational time (135.53 s) of the four input combinations and was therefore selected for the development of the models. Performance evaluation of the developed models was carried out using the statistical performance parameters *NSE*, *D* and *RMSE*, which vary in the ranges of 0.86–0.90, 0.95–0.97 and 68.87–84.37, respectively. The *NSE* values of FFNN, WNN, PSO–FFNN and PSO–WNN are 0.87, 0.88, 0.90 and 0.89, respectively, during calibration and 0.86, 0.87, 0.89 and 0.88, respectively, during validation. The PSO–FFNN model shows better results than the other models (FFNN, WNN and PSO–WNN). The computation times of the developed models were also calculated and compared: PSO–FFNN, PSO–WNN, WNN and FFNN took 125.42, 135.92, 165.24 and 181.28 s, respectively, so the PSO–FFNN model has the least computation time (125.42 s). Finally, it was found that the PSO–FFNN model performs better than the other models (FFNN, WNN and PSO–WNN).

## ACKNOWLEDGEMENTS

The authors would like to acknowledge the IMD (Indian Meteorological Department) and CWC (Central Water Commission) for providing the data for analysis.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.