GRNN Model for prediction of groundwater fluctuation in the state of Uttarakhand of India using GRACE data under limited bore well data

Springs, the primary source of water in the Indian state of Uttarakhand, are disappearing day by day. A report published by United Nations Development Program in 2015 indicates that due to deforestation, and forest fire, the groundwater of the state has been reduced by 50% between 2007 and 2010. As such, for taking proper adaptation policies for the state, it is necessary to monitor the state's groundwater fluctuation. Unfortunately, the bore well data are very limited. Thus, we are proposing two general regression neural network (GRNN)-based models for fast estimation of groundwater fluctuation. The first model evaluates and predicts the groundwater fluctuation in the five known bore well data districts of the state, and the second model, which is based on the first model along with a correlation matrix, predicts the groundwater fluctuation in the districts where no bore well data are available. The assessment of the results shows that the proposed GRNN-based model is capable of estimating the groundwater fluctuation both in the areas where bore well data are available and the areas where bore well data are not available. The study shows that there is a sharp decline in the groundwater level in the hilly districts of the state.


INTRODUCTION
Groundwater has been extensively used in many parts of India (Khan et al. ; Iqbal et al. ) and is considered the backbone resource for the agricultural activities of the country. However, the groundwater table has been depleted rapidly in many parts of the country as a result of unplanned exploitation of groundwater (Tiwari et  ations of the state of Uttarakhand. It has been observed that the fluctuation of groundwater also depends on the rainfall and the temperature. Therefore, rainfall and temperature data are used along with the GRACE anomaly data to estimate the fluctuation of groundwater. The inputs to the GRNN model are, therefore, the GRACE anomaly data, the average rainfall, and the average temperature of the area, and the model output is the fluctuation of the groundwater level. Developing the GRNN model requires observed groundwater fluctuation data. However, for some parts of the state of Uttarakhand, no observed groundwater fluctuation data are available. As such, we propose a correlation-based GRNN model to estimate the groundwater fluctuation of those districts. The validation shows that the proposed GRNN model is capable of estimating the groundwater fluctuation and can be considered a reliable model.

Study area
Uttarakhand lies between 28°43′-31°27′ N latitude and 77°34′-81°02′ E longitude (Figure 1). The state spreads over an area of about 53,483 km² and has a diverse hydrogeological structure. The whole region of the state is divided into two distinguishing hydrogeological regimes, i.e., the Gangetic alluvial plain and the Himalayan mountain belt. The Gangetic alluvial plain is covered with a wide range of alluvium and unconsolidated sedimentary material of different size fractions (ranging from boulder to clay) and is a likely zone for groundwater development. The Himalayan mountain belt, being largely hilly, has less capacity for large-scale development of groundwater. Groundwater in the hilly region occurs mostly in fractures and rises as springs.
Of the 13 districts in Uttarakhand, only 5 have groundwater observation data. The locations of the groundwater observation wells and the districts where no bore well data are available are shown in Figure 2.
The data on water level fluctuation in these wells were downloaded from India-WRIS (Water Resource Information System). The groundwater data are available season-wise and have been classified into four seasons, as shown in Table 1.

General regression neural network (GRNN)
We used the GRNN model to estimate the groundwater fluctuation by using the GRACE anomaly data, rainfall, and temperature. GRNN, as explained by Specht (Chinnasamy et al. ), is a neural network-based function predicting algorithm (Meena ; Joshi ). It requires no iterative training, which makes GRNN preferable to other neural network models (Mohanty et al. ; Akhter & Ahmad ). GRNN can approximate any arbitrary continuous nonlinear function by using the training data. The model works on the nonlinear regression concept (Nyatuame et al. ; Yang et al. ). The regression of a dependent variable y on an independent variable x estimates the most probable value of y, given x and a training set. The training set consists of matched pairs of x and y.
The fundamental principle of a neural network is that it needs training data to train the network. During the training, the network learns the hidden relationship associated with the input and output data. The training data should contain input-output datasets (Kannan & Ghosh ; Andrew et al. ). The network is then tested using a different dataset.
Once the training and testing of the network are over, the model can be used to predict the output for a given input. In the case of GRNN, the new output is determined as a weighted average of the outputs of the training dataset. The weight of a particular training pattern is estimated from the Euclidean distance between the new input and that training pattern. If the distance is large, the weight is small, and if the distance is small, the pattern carries more weight.
As discussed above, the Euclidean distance of the inputs is the basis for calculating the weight to be assigned to a particular pattern. The new output for a new input, based on the training dataset, is given in Equation (1):
$$Y(X) = \frac{\sum_{i=1}^{N} Y_i \exp\left(-\dfrac{d_i^{2}}{2\sigma^{2}}\right)}{\sum_{i=1}^{N} \exp\left(-\dfrac{d_i^{2}}{2\sigma^{2}}\right)}, \qquad d_i^{2} = (X - X_i)^{T}(X - X_i) \qquad (1)$$
where $d_i$ is the Euclidean distance between the new input ($X$) and the training input ($X_i$), $Y_i$ is the output of training pattern $i$, $N$ is the number of training patterns, and $\sigma$ is the spread. The spread scales the distance factor: an increase in the value of $\sigma$ decreases the chance of the output lying near the extreme values of the inputs, and vice versa. Hence, $\sigma$ is called the spread of the network. If the distance $d_i$ is small, the weight term $\exp(-d_i^{2}/2\sigma^{2})$ returns a relatively large value, and vice versa.
If $d_i$ is zero, the weight term returns 1, its maximum value. This means the test input coincides with a training sample, and the output for the test input is drawn toward the output of that training sample.
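The weighted-average prediction and the role of the spread can be illustrated with a minimal sketch. It assumes the standard Gaussian weight term exp(-d²/(2σ²)), which is consistent with the weight being 1 at zero distance, and it assumes the inputs have already been scaled to comparable ranges. The function names `grnn_predict` and `select_spread`, the candidate spread grid, and all numbers are hypothetical and are not taken from the study.

```python
import math


def grnn_predict(x_new, train_x, train_y, sigma):
    """Return the kernel-weighted average of train_y for one input pattern."""
    # Squared Euclidean distance from the new input to every training input.
    d2 = [sum((a - b) ** 2 for a, b in zip(x_new, x_i)) for x_i in train_x]
    # Subtracting the smallest distance leaves the ratio unchanged but
    # avoids 0/0 when every weight would underflow for a very small spread.
    d2_min = min(d2)
    weights = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)


def mse(pred, obs):
    """Mean squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def select_spread(train_x, train_y, test_x, test_y, candidates):
    """Pick the candidate spread with the lowest MSE on the test sample."""
    best_sigma, best_err = None, float("inf")
    for sigma in candidates:
        pred = [grnn_predict(x, train_x, train_y, sigma) for x in test_x]
        err = mse(pred, test_y)
        if err < best_err:
            best_sigma, best_err = sigma, err
    return best_sigma, best_err


# Made-up, pre-scaled patterns: [GRACE anomaly, rainfall, temperature].
train_x = [[0.1, 0.2, 0.5], [0.4, 0.6, 0.7], [0.8, 0.9, 0.3]]
train_y = [1.0, 2.0, 3.0]  # illustrative groundwater fluctuation values
test_x = [[0.35, 0.55, 0.65]]
test_y = [1.9]

sigma, err = select_spread(train_x, train_y, test_x, test_y, [0.05, 0.1, 0.5, 1.0])
print("selected spread:", sigma, "test MSE:", err)
```

A simple grid search stands in here for the optimization algorithm; any one-dimensional minimizer of the test MSE would serve the same purpose. With a very small spread the prediction approaches the output of the nearest training pattern, and with a very large spread it approaches the plain mean of the training outputs.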
The GRNN model has only one parameter to be estimated, i.e., the spread value (σ). The training procedure consists of finding the optimum value of σ. The best practice is to use an optimization algorithm that minimizes the mean squared error (MSE). To obtain the optimal spread, the whole dataset is divided into two parts, the training sample and the test sample. GRNN is then applied to the test data based on the training data, and the optimal spread is the value that minimizes the MSE.
An input pattern (k) is represented by Equation (2), together with its corresponding output pattern. The GRNN model used in this study is shown in Figure 4. The input pattern $X_k^i$ represents any arbitrary pattern k for the district i where recorded groundwater data are available. The distance ($d_n^i$) between the input and each training pattern and the weight ($w_n^i$) of that pattern are calculated in the pattern layer. There are N training patterns. As such, there will be N neurons in the pattern layer.
After obtaining the correlation matrix between each district with known groundwater data and each district without, the average correlation is calculated as
$$\bar{R}_{i,j} = \frac{R^{G}_{i,j} + R^{T}_{i,j} + R^{P}_{i,j}}{3}$$
where i represents the known district and varies from 1 to 5, j represents the unknown district and varies from 1 to 8, $R^{G}_{i,j}$ is the correlation between the GRACE data of the i-th known district and the j-th unknown district, $R^{T}_{i,j}$ is the corresponding correlation between the temperature data, and $R^{P}_{i,j}$ is the corresponding correlation between the precipitation data. We give the three correlations equal weight to prevent one input from dominating another.
$X_n^j$ is an arbitrary pattern, as described in Equation (2), for the j-th district, where groundwater data have not been observed. This pattern is passed to all five GRNN models developed earlier, and the output from the GRNN model of known district i is $GW_n^{ij}$.
The output of each GRNN model is then multiplied by the average coefficient of correlation between i and j calculated earlier and passed to the numerator neuron. Similarly, the average coefficient of correlation between i and j is passed to the denominator neuron. The signals received at the numerator and denominator neurons are summed, and the output is calculated at the output neuron.
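The combination step can be sketched as follows. The sketch assumes that the output neuron divides the summed numerator signal by the summed denominator signal, i.e., that the estimate for an unknown district is a correlation-weighted average of the five GRNN outputs. This is one reading of the description, not a confirmed detail of the model. The function name `cgrnn_combine` and the numbers are hypothetical.

```python
def cgrnn_combine(grnn_outputs, avg_correlations):
    """Correlation-weighted average of the known-district GRNN outputs.

    grnn_outputs[i]     : output of the GRNN model of known district i
                          for one pattern of the unknown district
    avg_correlations[i] : average correlation between known district i
                          and the unknown district
    """
    if len(grnn_outputs) != len(avg_correlations):
        raise ValueError("one correlation is needed per GRNN output")
    # Numerator neuron: sum of correlation * GRNN output.
    numerator = sum(r * gw for r, gw in zip(avg_correlations, grnn_outputs))
    # Denominator neuron: sum of the correlations.
    denominator = sum(avg_correlations)
    if denominator == 0:
        raise ValueError("correlations sum to zero; weighted average undefined")
    # Output neuron (assumed): ratio of the two summed signals.
    return numerator / denominator


# Illustrative values only: five known districts, one unknown district.
outputs = [1.0, 2.0, 3.0, 4.0, 5.0]
correlations = [0.9, 0.8, 0.7, 0.6, 0.5]
print(cgrnn_combine(outputs, correlations))
```

Under this reading, a known district that correlates more strongly with the unknown district contributes more to the estimate, and equal correlations reduce the result to the simple mean of the five outputs.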

Different inputs used in the formation of the GRNN model
The underlying assumption of this study is that the GRACE TWS may work as a valuable predictor of water level changes in the absence of consistent field-based observations of water level. The critical point was therefore the selection of the input data and their correlation. In the present study, we used commonly available data, namely temperature and precipitation, and then established an association with GRACE.

GRACE data
The measurement of water storage in a comprehensive way for different storage compartments is a challenging one.
With the development of remote sensing techniques, the space-based observations of the Earth system have provided
The total water storage (TWS) data for the different districts of Uttarakhand were downloaded. Figure 8 shows the variation of TWS of the 13 different districts of Uttarakhand.
The peak monthly rainfall varies from 1,155 to 250 mm, and there is less rainfall in winter.
Since the number of wells is greater in Dehradun, Haridwar, and US Nagar (as shown in Figure 2), the prediction of the model is quite good in these districts.
there is a good correlation between the parameters.
After calculation of the correlation value for each input of the GRNN model for each district, the average value is calculated and given in Table 7.
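The averaging step can be illustrated with a short sketch. It assumes the Pearson correlation coefficient is used for each input and that the three correlations are averaged with equal weight. The function names, dictionary keys, and series are hypothetical and are not data from the study.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def average_correlation(known, unknown):
    """Equal-weight mean of the GRACE, temperature and precipitation correlations.

    known / unknown: dicts mapping an input name to that district's series.
    """
    inputs = ("grace", "temperature", "precipitation")
    return sum(pearson(known[k], unknown[k]) for k in inputs) / len(inputs)


# Illustrative series only: one known and one unknown district.
known_district = {
    "grace": [1.0, 2.0, 3.0, 4.0],
    "temperature": [10.0, 20.0, 25.0, 15.0],
    "precipitation": [5.0, 50.0, 300.0, 40.0],
}
unknown_district = {
    "grace": [1.5, 2.5, 3.0, 4.5],
    "temperature": [8.0, 18.0, 24.0, 12.0],
    "precipitation": [10.0, 60.0, 250.0, 30.0],
}
print(average_correlation(known_district, unknown_district))
```

Repeating this for every pair of a known and an unknown district gives the matrix of average correlations.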
Using the correlation matrix value given in Table 8
As observed, all R² values are more than 0.6, and hence it can be stated that the CGRNN model is capable of predicting groundwater fluctuation for the districts without groundwater records. To check the feasibility of the model, other statistical parameters, namely RMSE and NSE, were also used, and the results are shown in Table 9. Each term is explained below.

Root mean square error (RMSE)
The root mean square deviation (RMSD) or root mean

Coefficient of determination (R 2 )
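The coefficient of determination (R²) shows the strength of a linear relationship between two variables (Kazmi et al. ). The underlying correlation is +1 in the case of a perfect increasing linear relationship and −1 in the case of a perfect decreasing linear relationship:
$$R^{2} = \left[\frac{\sum_{i=1}^{n}(O_i - \bar{O})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n}(O_i - \bar{O})^{2}}\,\sqrt{\sum_{i=1}^{n}(P_i - \bar{P})^{2}}}\right]^{2}$$
where $O_i$ and $P_i$ are the observed and simulated values, $n$ is the total number of test data, and $\bar{O}$ and $\bar{P}$ are the corresponding mean values.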
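The fit measures used in this evaluation (RMSE, R², and the Nash-Sutcliffe efficiency described in the following subsection) can be sketched as below. The sketch assumes the standard textbook definitions, with R² taken as the squared Pearson correlation between observed and simulated values. The function names and numbers are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_o = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / spread


# Illustrative values only: simulated series offset from the observations.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, simulated), r_squared(observed, simulated),
      nash_sutcliffe(observed, simulated))
```

In this example the simulated series is the observed series shifted by a constant, so R² stays at 1 while RMSE and the Nash-Sutcliffe efficiency both register the bias. This is one reason to report the three measures together.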

Nash-Sutcliffe coefficient (E)
The Nash-Sutcliffe model efficiency coefficient (E) is commonly used to assess the predictive power of hydrological discharge models (Moriasi et al. ; Chen et al. ). It is defined as:
$$E = 1 - \frac{\sum_{i}\left(X_{obs,i} - X_{model,i}\right)^{2}}{\sum_{i}\left(X_{obs,i} - \bar{X}_{obs}\right)^{2}}$$
where $X_{obs}$ are observed values, $X_{model}$ are modeled values at time/place i, and $\bar{X}_{obs}$ is the mean of the observed values. Nash-Sutcliffe efficiencies can range from −∞ to 1. An efficiency of 1 (E = 1) corresponds to a perfect match between model and observations.
It can be observed from the plots that the groundwater level is decreasing continuously in hilly areas, especially in the border areas with Nepal and China. In plain areas like Dehradun and Haridwar, the depletion of groundwater is slower. This shows that there is sufficient recharge along with the withdrawal of water from the aquifer. As
streams, ponds, etc., have reportedly reduced by more than 50%. It was remarked that the situation is going to be worse in the future due to climate change. The result of the present study also matches the findings of the UNDP report. In the present study, we
A similar trend has also been observed during the post-monsoon (kharif and rabi) season, as shown in Figures 21 and 22, respectively.
The variation of groundwater, as shown in Figures 19-22

DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.