## Abstract

Reference evapotranspiration (ET_{o}) is a major component of the hydrological cycle linking the irrigation water requirement and planning and management of water resources. In this research, the potential of co-active neuro-fuzzy inference system (CANFIS) was investigated against the multilayer perceptron neural network (MLPNN), radial basis neural network (RBNN), self-organizing map neural network (SOMNN) and multiple linear regression (MLR) to estimate the monthly ET_{o} at Pantnagar and Ranichauri stations, located in the foothills of Indian central Himalayas of Uttarakhand State, India. The significant combination of input variables for implemented techniques was decided by the Gamma test (GT). The results obtained by CANFIS models were compared with MLPNN, RBNN, SOMNN and MLR models based on performance evaluation indicators and visual inspection using line, scatter and Taylor plots for both the stations. The results of comparison revealed that CANFIS-5/CANFIS-9 models (RMSE = 0.0978/0.1394, SI = 0.0261/0.0475, COE = 0.9963/0.9846, PCC = 0.9982/0.9942 and WI = 0.9991/0.9959) with three and five input variables provide superior results for estimating monthly ET_{o} at Pantnagar and Ranichauri stations, respectively. Also, the adopted modelling strategy can build a truthful expert intelligent system for estimating the monthly ET_{o} at the study stations.

## INTRODUCTION

Estimation of reference evapotranspiration (ET_{o}) is essential for hydrologic water balance, irrigation system design and management, crop yield simulation, and planning and management of water resources (Sentelhas *et al.* 2010). Normally, ET_{o} is measured directly using lysimeters or also by eddy covariance, while indirectly calculated using the meteorological data. However, measurement of ET_{o} by lysimeters is more time-consuming and needs precise and accurate planned experiments (Lopez-Urrea *et al.* 2006). The Food and Agriculture Organization (FAO-56) standardized Penman–Monteith (PM) equation was developed for estimating ET_{o}, by referring a hypothetical crop with an assumed height of 0.12 m, surface resistance of 70 s·m^{−1} and an albedo of 0.23 (Allen *et al.* 1998). FAO-56 PM equation requires numerous meteorological variables for effective application which may be unavailable or missing in some locations, especially in developing countries. Therefore, alternative approaches that require less meteorological inputs are needed (Kumar *et al.* 2002; Yassin *et al.* 2016; Kisi & Alizamir 2018).

In recent years, soft computing techniques have been successfully applied for modelling reference evapotranspiration using various available meteorological variables (Sudheer *et al.* 2003; Kisi 2007; Zanetti *et al.* 2007; Kim & Kim 2008; Kumar *et al.* 2008; Rahimikhoob 2010; Tabari & Talaee 2012; Shiri *et al.* 2013, 2019; Falamarzi *et al.* 2014; Patil & Deka 2015; Kisi & Demir 2016; Banda *et al.* 2017; Gavili *et al.* 2018; Karbasi 2018; Kisi & Alizamir 2018; Tao *et al.* 2018; Shiri 2018, 2019a, 2019b; Granata 2019). Kisi *et al.* (2015) examined the comparative potential of multilayer perceptron artificial neural networks (MLP-ANN), adaptive neuro-fuzzy inference system (ANFIS) with grid partition (ANFIS-GP), ANFIS with subtractive clustering (ANFIS-SC) and genetic expression programming (GEP) in predicting monthly reference evapotranspiration from 50 stations located in Iran. They concluded that the ANFIS-GP models performed better than the other models. Mattar & Alazba (2018) employed GEP and MLR techniques in modelling monthly reference evapotranspiration from 27 stations located in Egypt and found that the GEP models are more powerful models. Pour *et al.* (2018) used support vector machine (SVM), ANFIS and GEP approaches for modelling monthly potential evapotranspiration in Iran and found that the SVM models performed better than the ANFIS and GEP models. Sanikhani *et al.* (2018a) investigated the comparative potential of MLP, generalized regression neural network (GRNN), RBNN, ANFIS-GP, ANFIS-SC, GEP and Hargreaves–Samani (HS) techniques in modelling monthly reference evapotranspiration at Antalya and Isparta stations located in Turkey. The results of comparison showed that the GEP and GRNN models estimated successfully at Antalya station, while the RBNN and ANFIS-SC models performed better than the other models at Isparta station. Granata (2019) estimated daily ET_{o} in central Florida, USA, using M5P regression tree, bagging, random forest (RF) and support vector regression (SVR) models. He found that the M5P regression tree model with inputs net solar radiation, sensible heat flux, moisture content of the soil, wind speed, mean relative humidity and mean temperature performed superior to the other models for daily ET_{o} estimation. Shiri (2019a, 2019b) compared the performance of gene expression programming (GEP), against temperature-based (HS), radiation/humidity-based (Priestley–Taylor, Makkink and Turc), and mass-transfer based (Dalton, Trabert and Penman) equations in modelling monthly ET_{o} at five meteorological locations of Iran. The results of analysis revealed that the GEP models (mass transfer-based) performed better than the other models.

Furthermore, the number of relevant applications of soft computing and time series methods have been found in modelling various hydrological processes, such as river flow forecasting (Yaseen *et al.* 2015a, 2018a, 2019; Papacharalampous *et al.* 2019); forecasting air temperature (Sanikhani *et al.* 2018c); soil temperature estimation (Sanikhani *et al.* 2018b); stream-flow forecasting (Yaseen *et al.* 2015b, 2016a, 2016b, 2017; Granata *et al.* 2018); suspended sediment load estimation (Khosravi *et al.* 2018; Kisi & Yaseen 2019); rainfall pattern forecasting (Yaseen *et al.* 2018b); flood forecasting (Solomatine & Xue 2004; Singh *et al.* 2010); soil moisture estimation (Ahmad *et al.* 2010); rainfall–runoff modelling (Granata *et al.* 2016); and predicting monthly temperature and precipitation using stochastic models (Papacharalampous *et al.* 2018).

This research utilized the Gamma test (GT) for selecting the significant combination of input variables in modelling monthly ET_{o} at study stations. In the recent decade, on global scale, several studies have been conducted using GT in diverse fields such as modelling daily pan-evaporation (Moghaddamnia *et al.* 2008; Piri *et al.* 2009; Goyal *et al.* 2014; Malik *et al.* 2017b, 2018; Ashrafzadeh *et al.* 2018); rainfall–runoff and river flow modelling (Remesan *et al.* 2009; Malik & Kumar 2018; Singh *et al.* 2018); and suspended sediment load (SSL) modelling (Kakaei-Lafdani *et al.* 2013; Malik *et al.* 2017a). Moghaddamnia *et al.* (2008) decided on an optimal input variables combination for ANN and ANFIS approaches using GT for modelling daily evaporation in an arid and semi-arid region of Iran. They reported the high ability of the ANN and ANFIS models in daily evaporation estimation. Kakaei-Lafdani *et al.* (2013) utilized the GT to select the most influential inputs for ANN and SVM for SSL estimation in Doiraj River, Iran and found that the ANN and SVM models perform better with the selected inputs using GT. Malik *et al.* (2017a) applied GT for selecting the appropriate input variables combination for CANFIS, MLPNN, MLR and multiple non-linear regression (MNLR) techniques in simulating daily SSL in Pranhita river basin, India. They found that the CANFIS model performs better than the other models using its selected input combination by the GT. Malik *et al.* (2018) applied RBNN, SOMNN, MLR, Penman, Stephens-Stewart, Griffiths, Christiansen, Priestley–Taylor and Jensen–Burman–Allen approaches for estimation of daily evaporation at Pantnagar, India. The significant combination of input variables for RBNN, SOMNN and MLR was selected using the GT. They found the RBNN model performed better than the other models.

So far, no literature has been found that utilized the GT for identifying the optimal input combination for CANFIS, MLPNN, RBNN, SOMNN and MLR in monthly reference evapotranspiration estimation. In view of the above-mentioned literature, this study was conducted with the following specific objectives: (i) to decide the significant input variable combination for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches by utilizing the GT; (ii) to compare the accuracy of CANFIS, models with MLPNN, RBNN, SOMNN and MLR models based on various performance indicators and visual inspection for ET_{o} estimation at Pantnagar and Ranichauri stations.

## MATERIALS AND METHODS

### Study area and data acquisition

The study was conducted at Pantnagar (29° 0′ 0″ N latitude, 79° 38′ 0″ E longitude) with an altitude of 243.8 m above mean sea level (msl) and Ranichauri (30° 18′ 40″ N latitude, 78° 24′ 35″ E longitude) with an altitude of 2,000 m above msl (Figure 1). Both stations are located in the foothills of Indian central Himalayan region of Uttarakhand State, India. The average annual rainfall at Pantnagar station is approximately 1,400 mm, while at Ranichauri station it is approximately 1,176 mm. The monthly meteorological data of minimum and maximum air temperatures (T_{min} and T_{max}), wind speed (U_{s}), relative humidity (RH), and solar radiation (R_{s}), were collected from Crop Research Centre (CRC) of Pantnagar and Ranichauri stations located in Uttarakhand State, India. The available data of 32 years (January, 1985–December, 2016) for Pantnagar station and 19 years (January, 1994–December, 2012) for Ranichauri station are presented graphically in Figure 2 using a box and whisker plot. The box and whisker plot provides the information about minimum, first quartile, median, third quartile and maximum values (start interpretation from bottom to top) of meteorological variables.

### FAO-56 Penman–Monteith equation

_{o}values using available meteorological data for Pantnagar and Ranichauri stations. In Cropwat 8.0 software, the FAO-56 Penman–Monteith (FAO-56 PM) equation is programmed and described as (Allen

*et al.*1998): where is reference evapotranspiration (mm/month), is slope of saturation vapour pressure curve (kPa °C

^{−1}),

*R*is net radiation (MJ/m

_{n}^{2}/month),

*G*is soil heat flux (MJ/m

^{2}/month), is psychrometric constant (kPa °C

^{−1}),

*e*and

_{s}*e*are saturated and actual vapour pressures (kPa),

_{a}*T*is average monthly air temperature (°C

^{−1}) and

*U*

_{2}is mean monthly wind speed at 2 m (m s

^{−1}).

### Gamma test

*et al.*(1997) introduced the concept of GT to reduce the problem of selection of appropriate input variables for modelling non-linear processes. Gamma test computes the minimum standard error (SE) for each input–output model by constructing the least-square linear regression line as (Piri

*et al.*2009; Rashidi

*et al.*2016): where,

*y*is output vector, G is gradient and gamma (Г) is intercept . The smaller value of the Г (closer to zero) indicates the more appropriate input variables. The gradient is taken into account as a complexity indicator of the applied model and high gradient reveals complicated model fitting. The standard error (SE) of Г shows the gamma value's reliability (i.e., small SE indicates reliable gamma value). The V

_{ratio}indicates the model's predictability for given input and output variables: where, is the variance of the output y and Г is the gamma function. If the value of V

_{ratio}is close to zero it represents the high predictability of the model. In general, a good model (input variable combination) is selected based on minimum value of Г, SE, G, and V

_{ratio}(Remesan

*et al.*2008; Moghaddamnia

*et al.*2009; Piri

*et al.*2009; Malik

*et al.*2017a, 2018; Ashrafzadeh

*et al.*2018).

In this study, the significant combination of input variables for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches was selected based on the minimum value of Г, SE, G and V_{ratio} for reference evapotranspiration estimation at Pantnagar and Ranichauri stations.

### Co-active neuro-fuzzy inference system

Jang *et al.* (1997) introduced the concept of co-active neuro-fuzzy inference system (CANFIS). The architecture of CANFIS is composed of five layers, fuzzification layer (fuzzy membership), rule layer (and multiplication), normalization layer, defuzzification layer (consequent) and summation layer (fuzzy association), and each input is processed through these five layers (Figure 3). The functioning of each layer and detailed information is given by Aytek (2009) and Tabari *et al.* (2012a).

*x*and

*y*) and one output (C), set with fuzzy IF-THEN rules for CANFIS architecture is as follows (Jang

*et al.*1997): where, and are the MFs for inputs

*x*and

*y*, respectively, and are the parameters in the consequent part (THEN-part) as illustrated in Figure 4.

**Layer 1 (Fuzzification layer or fuzzy membership):**Every node

*i*in this layer is a square node with a node function as: where,

*x*(or

*y*) is the input to node

*i*, and is a linguistic label (small, large, etc.) associated with this node function. In other words, is the membership function of and it specifies the degree to which the given

*x*(or

*y*) satisfies the quantifier. Usually, we choose or to bell-shaped with maximum equal to one and minimum equal to zero, such as: or the Gaussian function is written as: where, are the parameter set. As the values of these parameters change, the bell-shaped or Gaussian functions vary accordingly, thus exhibiting various forms of membership functions on the linguistic label . The parameters in this layer are referred to as

*premise parameters*.

Each node output represents the firing strength (weights) of a rule. T-norm operators that perform generalized AND can be used as the node function in this layer.

For convenience, the outputs of this layer are called *normalized firing strengths*.

In this research, the network of CANFIS was constructed using two to three Gaussian membership functions (MFs), Sugeno fuzzy model (Takagi & Sugeno 1985), hyperbolic tangent activation function (ranges from −1 to 1) for data normalization, and delta-bar-delta (DBD) learning algorithm with the threshold of 0.001 and 1,000 iterations in NeuroSolutions software.

### Multilayer perceptron neural network

Haykin (1998) first introduced multilayer perceptron neural network (MLPNN), which consists of layers of parallel processing elements (called neurons), with each layer being fully connected to the preceding layer by interconnection strengths or weights (W). Figure 4 illustrates a feed-forward MLPNN, consisting of an input layer (*i*), hidden layer (*j*) and output layer (*k*), with the inter-connection weights ( and ) between layers of neurons. One or more than one hidden layer exists between an input and an output layer. The number of hidden layers and neurons are specified by the problem to which the network is applied (i.e., the number of predictors and predictands, respectively). The hydrologist must specify the number of hidden layers and neurons for accurate mapping of all the training dataset. Initial estimated weight values are progressively corrected during a training process that compares predicted outputs to known outputs and backpropagates any errors (from right to left in Figure 4) to determine the appropriate weight adjustments necessary to minimize the errors (Dawson & Wilby 2001).

_{p}) is presented to the input layer of neurons (

*i*) and simply passed through unchanged output (O

_{p}), to be distributed to the second layer (

*j*). Each neuron in layers

*j*and

*k*receives the weighted sum of outputs (NET) from the previous layer as input. Mathematically, the

*NET*for layer

*j*is written as: where is the weight between the input layer and hidden layer, is the output of the input layer, is a bias for neuron

*j*. Each neuron in layers

*j*and

*k*produces its output by passing its value of

*NET*through a non-linear activation function. A commonly used functional form is the logistic activation function:

*k*th layer is obtained as: where

*f*is the activation function. All input and output values used in training and testing are scaled through the activation function. In this study, the architecture of MLPNN was designed by supervised learning with one input layer, one hidden layer and one output layer with a single output. The hyperbolic tangent activation function (range −1 to 1) was used for data normalization with DBD learning algorithm because this technique is more powerful and faster than the conventional gradient descent. The optimal number of neurons in the hidden layer were determined using 2

*n +*1 concept given by Mishra & Desai (2006), where

*n*is the number of inputs. The MLPNN training was stopped after 1,000 epochs with a threshold value of 0.001 in NeuroSolutions 5.0 software.

### Radial basis neural network

The radial basis neural network (RBNN) is a branch of ANN, and the concept of RBNN was given by Bishop (1995). The radial basis functions (RBF) of the RBNN serve as the activation function. The architecture of RBNN is composed of three layers (input layer, hidden layer and output layer). The functioning of the hidden layer is to create clusters and then outputs are multiplied by the weights; these multiplied weights are incorporated into the hidden layer using the multivariate interpolation functions. In this study, different quantities of hidden layer neurons and spread constants were analysed using a trial-and-error method in NeuroSolutions software. The functioning of each layer and more insightful details of RBNN are given by Bishop (1995), Kisi (2009) and Banda *et al.* (2017).

### Self-organizing map neural network

Kohonen (1982) discovered the self-organizing map neural network (SOMNN) algorithm for effective and clustering analysis. The structure of SOMNN is composed of three steps, (i) initialization: in this step, any random value of initial weights is chosen; (ii) winner finding: this step involves finding the optimal weight vectors with the neuron (winner weights); (iii) weight updating: this step involves adjustment of the winner weights and neighbourhood neurons towards the input vectors. More detailed and theoretical insightful information about SOMNN is given by Chang *et al.* (2007, 2010). In this study, the size of SOMNN, i.e., rows × columns matrix, neighbourhood shape, hidden layer, processing elements, activation function and learning algorithm were decided using a trial-and-error procedure in NeuroSolutions software, to train and test the obtained networks based on training and testing datasets, respectively.

### Multiple linear regression

_{o}is the dependent variable (output); are the independent variables; is the intercept; and are the regression coefficients.

### Model performance evaluation indicators

*et al.*2018), COE (Nash & Sutcliffe 1970), PCC (Malik

*et al.*2018; Singh

*et al.*2018), WI (Willmott 1981) and RE (Tao

*et al.*2018; Yaseen

*et al.*2018a) are described as: where, and are observed and estimated monthly ET

_{o}values for the

*i*th dataset,

*N*is total number of observations in a dataset, and are mean of observed and estimated monthly ET

_{o}values, respectively, is absolute difference between estimated and observed mean, and is absolute difference between observed values and their mean. The model that had higher value of COE, PCC and WI, and lower RMSE and SI values was adjudged relatively the better model for monthly ET

_{o}estimation at study stations.

## RESULTS

### Deciding optimal input variables using GT

The descriptive statistics of training and testing datasets are provided in Table 1 for the Pantnagar and Ranichauri stations. From Table 1, it was observed that the Pantnagar station has higher temperatures compared to Ranichauri station. T_{min}, T_{max} and RH have negative skewness while U_{s}, R_{s} and ET_{o} show positive skewness in training and testing datasets of both stations. The solar radiation has the least skewness at Pantnagar station and it follows a normal distribution. The kurtosis was generally found to be platykurtic (−) except the U_{s} data which are leptokurtic (+) in nature in training datasets for both the stations. The cross-correlation between input and output variables are given in Table 2 for Pantnagar and Ranichauri stations. As observed from Table 2 for Pantnagar and Ranichauri stations, significant correlation was found between T_{min} and T_{max}, U_{s}, R_{s}, and ET_{o}; T_{max} and RH, U_{s} and ET_{o}; RH and U_{s}, R_{s} and ET_{o}; U_{s} and R_{s} and ET_{o} at 5% significance level, respectively.

Statistical parameters | Monthly meteorological variables with unit | |||||
---|---|---|---|---|---|---|

T_{min} (°C) | T_{max} (°C) | RH (%) | U_{s} (m/s) | R_{s} (MJ/m^{2}) | ET_{o} (mm) | |

Pantnagar (Training dataset: January 1, 1985–December 31, 2010) | ||||||

Minimum | 4.30 | 14.50 | 36.00 | 0.20 | 8.60 | 1.14 |

Maximum | 26.70 | 40.00 | 89.00 | 4.00 | 25.60 | 8.26 |

Mean | 16.82 | 29.70 | 67.38 | 1.40 | 17.62 | 3.90 |

SD | 7.10 | 5.59 | 12.19 | 0.70 | 4.08 | 1.71 |

Skewness | −0.15 | −0.39 | −0.61 | 0.78 | 0.02 | 0.53 |

Kurtosis | −1.53 | −0.71 | −0.39 | 0.37 | −0.77 | −0.57 |

Pantnagar (Testing dataset: January 1, 2011–December 31, 2016) | ||||||

Minimum | 5.80 | 17.00 | 42.00 | 0.60 | 8.20 | 1.23 |

Maximum | 26.30 | 40.10 | 87.00 | 2.80 | 24.70 | 7.63 |

Mean | 17.27 | 29.77 | 68.71 | 1.38 | 16.38 | 3.74 |

SD | 7.08 | 5.80 | 10.91 | 0.49 | 4.40 | 1.62 |

Skewness | −0.13 | −0.51 | −0.78 | 0.36 | −0.05 | 0.43 |

Kurtosis | −1.57 | −0.54 | −0.29 | −0.03 | −0.82 | −0.55 |

Ranichauri (Training dataset: January 1, 1994–December 31, 2008) | ||||||

Minimum | 0.50 | 8.50 | 29.00 | 1.00 | 9.60 | 1.23 |

Maximum | 17.80 | 28.50 | 94.00 | 1.80 | 25.40 | 5.33 |

Mean | 9.87 | 19.55 | 70.43 | 1.42 | 16.37 | 2.93 |

SD | 5.32 | 4.83 | 14.04 | 0.15 | 4.15 | 1.06 |

Skewness | −0.16 | −0.50 | −0.15 | 0.13 | 0.44 | 0.42 |

Kurtosis | −1.37 | −0.90 | −0.69 | 0.11 | −0.73 | −0.70 |

Ranichauri (Testing dataset: January 1, 2009–December 31, 2012) | ||||||

Minimum | 0.50 | 9.30 | 37.00 | 0.40 | 10.50 | 1.13 |

Maximum | 16.70 | 29.10 | 93.00 | 1.60 | 24.60 | 5.42 |

Mean | 9.83 | 20.16 | 67.38 | 1.05 | 16.38 | 2.93 |

SD | 5.34 | 4.67 | 15.41 | 0.40 | 4.09 | 1.13 |

Skewness | −0.20 | −0.36 | 0.16 | −0.37 | 0.57 | 0.59 |

Kurtosis | −1.43 | −0.54 | −0.89 | −1.28 | −0.74 | −0.48 |

Statistical parameters | Monthly meteorological variables with unit | |||||
---|---|---|---|---|---|---|

T_{min} (°C) | T_{max} (°C) | RH (%) | U_{s} (m/s) | R_{s} (MJ/m^{2}) | ET_{o} (mm) | |

Pantnagar (Training dataset: January 1, 1985–December 31, 2010) | ||||||

Minimum | 4.30 | 14.50 | 36.00 | 0.20 | 8.60 | 1.14 |

Maximum | 26.70 | 40.00 | 89.00 | 4.00 | 25.60 | 8.26 |

Mean | 16.82 | 29.70 | 67.38 | 1.40 | 17.62 | 3.90 |

SD | 7.10 | 5.59 | 12.19 | 0.70 | 4.08 | 1.71 |

Skewness | −0.15 | −0.39 | −0.61 | 0.78 | 0.02 | 0.53 |

Kurtosis | −1.53 | −0.71 | −0.39 | 0.37 | −0.77 | −0.57 |

Pantnagar (Testing dataset: January 1, 2011–December 31, 2016) | ||||||

Minimum | 5.80 | 17.00 | 42.00 | 0.60 | 8.20 | 1.23 |

Maximum | 26.30 | 40.10 | 87.00 | 2.80 | 24.70 | 7.63 |

Mean | 17.27 | 29.77 | 68.71 | 1.38 | 16.38 | 3.74 |

SD | 7.08 | 5.80 | 10.91 | 0.49 | 4.40 | 1.62 |

Skewness | −0.13 | −0.51 | −0.78 | 0.36 | −0.05 | 0.43 |

Kurtosis | −1.57 | −0.54 | −0.29 | −0.03 | −0.82 | −0.55 |

Ranichauri (Training dataset: January 1, 1994–December 31, 2008) | ||||||

Minimum | 0.50 | 8.50 | 29.00 | 1.00 | 9.60 | 1.23 |

Maximum | 17.80 | 28.50 | 94.00 | 1.80 | 25.40 | 5.33 |

Mean | 9.87 | 19.55 | 70.43 | 1.42 | 16.37 | 2.93 |

SD | 5.32 | 4.83 | 14.04 | 0.15 | 4.15 | 1.06 |

Skewness | −0.16 | −0.50 | −0.15 | 0.13 | 0.44 | 0.42 |

Kurtosis | −1.37 | −0.90 | −0.69 | 0.11 | −0.73 | −0.70 |

Ranichauri (Testing dataset: January 1, 2009–December 31, 2012) | ||||||

Minimum | 0.50 | 9.30 | 37.00 | 0.40 | 10.50 | 1.13 |

Maximum | 16.70 | 29.10 | 93.00 | 1.60 | 24.60 | 5.42 |

Mean | 9.83 | 20.16 | 67.38 | 1.05 | 16.38 | 2.93 |

SD | 5.34 | 4.67 | 15.41 | 0.40 | 4.09 | 1.13 |

Skewness | −0.20 | −0.36 | 0.16 | −0.37 | 0.57 | 0.59 |

Kurtosis | −1.43 | −0.54 | −0.89 | −1.28 | −0.74 | −0.48 |

*Notes*: T_{min} and T_{max}, minimum and maximum air temperatures (°C); RH, relative humidity (%); U_{s}, wind speed (m/s); R_{s}, solar radiation (MJ/m^{2}); ET_{o}, reference evapotranspiration (mm); SD, standard deviation.

Station/Variables | T_{min} (°C) | T_{max} (°C) | RH (%) | U_{s} (m/s) | R_{s} (MJ/m^{2}) | ET_{o} (mm) |
---|---|---|---|---|---|---|

Pantnagar | ||||||

T_{min} (°C) | 1.0 | |||||

T_{max} (°C) | 0.834* | 1.0 | ||||

RH (%) | 0.041 | −0.468* | 1.0 | |||

U_{s} (m/s) | 0.476* | 0.544* | −0.396* | 1.0 | ||

R_{s} (MJ/m^{2}) | 0.621* | 0.883* | −0.648* | 0.606* | 1.0 | |

ET_{o} (mm) | 0.719* | 0.919* | −0.610* | 0.764* | 0.933* | 1.0 |

Ranichauri | ||||||

T_{min} (°C) | 1.0 | |||||

T_{max} (°C) | 0.862* | 1.0 | ||||

RH (%) | −0.114 | −0.489* | 1.0 | |||

U_{s} (m/s) | 0.407* | 0.456* | −0.320* | 1.0 | ||

R_{s} (MJ/m^{2}) | 0.727* | 0.883* | −0.619* | 0.525* | 1.0 | |

ET_{o} (mm) | 0.918* | 0.934* | −0.359* | 0.507* | 0.921* | 1.0 |

Station/Variables | T_{min} (°C) | T_{max} (°C) | RH (%) | U_{s} (m/s) | R_{s} (MJ/m^{2}) | ET_{o} (mm) |
---|---|---|---|---|---|---|

Pantnagar | ||||||

T_{min} (°C) | 1.0 | |||||

T_{max} (°C) | 0.834* | 1.0 | ||||

RH (%) | 0.041 | −0.468* | 1.0 | |||

U_{s} (m/s) | 0.476* | 0.544* | −0.396* | 1.0 | ||

R_{s} (MJ/m^{2}) | 0.621* | 0.883* | −0.648* | 0.606* | 1.0 | |

ET_{o} (mm) | 0.719* | 0.919* | −0.610* | 0.764* | 0.933* | 1.0 |

Ranichauri | ||||||

T_{min} (°C) | 1.0 | |||||

T_{max} (°C) | 0.862* | 1.0 | ||||

RH (%) | −0.114 | −0.489* | 1.0 | |||

U_{s} (m/s) | 0.407* | 0.456* | −0.320* | 1.0 | ||

R_{s} (MJ/m^{2}) | 0.727* | 0.883* | −0.619* | 0.525* | 1.0 | |

ET_{o} (mm) | 0.918* | 0.934* | −0.359* | 0.507* | 0.921* | 1.0 |

*Statistically significant correlation at 5% level of significance.

Deciding significant model inputs is a tedious procedure, especially for the non-linear hydrological processes. In this study, the nine combinations of available inputs were considered to evaluate their effects on monthly ET_{o} (Table 3). The GT test was applied to the full dataset for identifying the significant combination of input variable for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches. The test statistics of the GT were reported in Table 4, which revealed minimum value of Г = 0.0240, G = 0.0062, SE = 0.0044, V_{ratio} = 0.0084 with the mask 01011 for Pantnagar station (Figure 5(a)), and Г = −0.0022, G = 0.0039, SE = 0.0011, V_{ratio} = −0.0018 with the mask 11111 for Ranichauri station (Figure 5(b)). The term mask defines the number of effective input variables used, for instance, the mask 01011 and 11111 indicates three and five input variables are used at a time to give an output, i.e., ET_{o} at Pantnagar and Ranichauri stations, respectively. Therefore, the appropriate input combination of three variables (T_{max}, U_{s}, R_{s}) was used for the CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models at Pantnagar station, while five variables (T_{min}, T_{max}, RH, U_{s}, R_{s}) were used for CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models at Ranichauri station to estimate monthly ET_{o}.

Climatic variables | CANFIS/MLPNN/RBNN/SOMNN/MLR | ||||||||
---|---|---|---|---|---|---|---|---|---|

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |

T_{min} | ✓ | ✓ | |||||||

T_{max} | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

RH | ✓ | ✓ | ✓ | ✓ | ✓ | ||||

U_{s} | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||

R_{s} | ✓ | ✓ | ✓ | ✓ | ✓ |

Climatic variables | CANFIS/MLPNN/RBNN/SOMNN/MLR | ||||||||
---|---|---|---|---|---|---|---|---|---|

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | |

T_{min} | ✓ | ✓ | |||||||

T_{max} | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |

RH | ✓ | ✓ | ✓ | ✓ | ✓ | ||||

U_{s} | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||

R_{s} | ✓ | ✓ | ✓ | ✓ | ✓ |

Various input combinations | Parameters | ||||
---|---|---|---|---|---|

Γ | G | SE | V_{ratio} | Mask | |

Pantnagar | |||||

T_{max} | 0.2011 | 0.0568 | 0.0083 | 0.0702 | 01000 |

T_{max}, U_{s} | 0.0609 | 0.1514 | 0.0047 | 0.0212 | 01010 |

T_{max}, R_{s} | 0.1185 | 0.0067 | 0.0096 | 0.0414 | 01001 |

T_{max}, RH, U_{s} | 0.0847 | 0.0305 | 0.0099 | 0.0296 | 01110 |

T_{max}, U_{s}, R_{s} | 0.0240 | 0.0062 | 0.0044 | 0.0084 | 01011 |

T_{max}, RH, R_{s} | 0.0909 | 0.0103 | 0.0168 | 0.0318 | 01101 |

T_{min}, T_{max}, RH, U_{s} | 0.0855 | 0.0118 | 0.0080 | 0.0298 | 11110 |

T_{max}, RH, U_{s}, R_{s} | 0.0454 | 0.0147 | 0.0106 | 0.0158 | 01111 |

T_{min}, T_{max}, RH, U_{s}, R_{s} | 0.0492 | 0.0451 | 0.0136 | 0.0172 | 11111 |

Ranichauri | |||||

T_{max} | 0.1375 | 0.0402 | 0.0099 | 0.1185 | 01000 |

T_{max}, U_{s} | 0.1204 | 0.0468 | 0.0058 | 0.1038 | 01010 |

T_{max}, R_{s} | 0.0097 | 0.0086 | 0.0031 | 0.0084 | 01001 |

T_{max}, RH, U_{s} | 0.0288 | 0.0149 | 0.0147 | 0.0248 | 01110 |

T_{max}, U_{s}, R_{s} | 0.0065 | 0.0097 | 0.0012 | 0.0056 | 01011 |

T_{max}, RH, R_{s} | 0.0004 | 0.0056 | 0.0046 | 0.0004 | 01101 |

T_{min}, T_{max}, RH, U_{s} | 0.0287 | 0.0058 | 0.0061 | 0.0247 | 11110 |

T_{max}, RH, U_{s}, R_{s} | −0.0004 | 0.0056 | 0.0042 | −0.0003 | 01111 |

T_{min}, T_{max}, RH, U_{s}, R_{s} | −0.0022 | 0.0039 | 0.0011 | −0.0018 | 11111 |

Various input combinations | Parameters | ||||
---|---|---|---|---|---|

Γ | G | SE | V_{ratio} | Mask | |

Pantnagar | |||||

T_{max} | 0.2011 | 0.0568 | 0.0083 | 0.0702 | 01000 |

T_{max}, U_{s} | 0.0609 | 0.1514 | 0.0047 | 0.0212 | 01010 |

T_{max}, R_{s} | 0.1185 | 0.0067 | 0.0096 | 0.0414 | 01001 |

T_{max}, RH, U_{s} | 0.0847 | 0.0305 | 0.0099 | 0.0296 | 01110 |

T_{max}, U_{s}, R_{s} | 0.0240 | 0.0062 | 0.0044 | 0.0084 | 01011 |

T_{max}, RH, R_{s} | 0.0909 | 0.0103 | 0.0168 | 0.0318 | 01101 |

T_{min}, T_{max}, RH, U_{s} | 0.0855 | 0.0118 | 0.0080 | 0.0298 | 11110 |

T_{max}, RH, U_{s}, R_{s} | 0.0454 | 0.0147 | 0.0106 | 0.0158 | 01111 |

T_{min}, T_{max}, RH, U_{s}, R_{s} | 0.0492 | 0.0451 | 0.0136 | 0.0172 | 11111 |

Ranichauri | |||||

T_{max} | 0.1375 | 0.0402 | 0.0099 | 0.1185 | 01000 |

T_{max}, U_{s} | 0.1204 | 0.0468 | 0.0058 | 0.1038 | 01010 |

T_{max}, R_{s} | 0.0097 | 0.0086 | 0.0031 | 0.0084 | 01001 |

T_{max}, RH, U_{s} | 0.0288 | 0.0149 | 0.0147 | 0.0248 | 01110 |

T_{max}, U_{s}, R_{s} | 0.0065 | 0.0097 | 0.0012 | 0.0056 | 01011 |

T_{max}, RH, R_{s} | 0.0004 | 0.0056 | 0.0046 | 0.0004 | 01101 |

T_{min}, T_{max}, RH, U_{s} | 0.0287 | 0.0058 | 0.0061 | 0.0247 | 11110 |

T_{max}, RH, U_{s}, R_{s} | −0.0004 | 0.0056 | 0.0042 | −0.0003 | 01111 |

T_{min}, T_{max}, RH, U_{s}, R_{s} | −0.0022 | 0.0039 | 0.0011 | −0.0018 | 11111 |

*Notes*: Г, gamma; G, gradient; SE, standard error.

Station/Model | Testing period | ||||
---|---|---|---|---|---|

RMSE (mm/month) | SI | COE | PCC | WI | |

Pantnagar | |||||

CANFIS-5 | 0.0978 | 0.0261 | 0.9963 | 0.9982 | 0.9991 |

MLPNN-5 | 0.1082 | 0.0289 | 0.9955 | 0.9980 | 0.9989 |

RBNN-5 | 0.1972 | 0.0527 | 0.9850 | 0.9925 | 0.9962 |

SOMNN-5 | 0.1487 | 0.0397 | 0.9915 | 0.9960 | 0.9978 |

MLR-5 | 0.2445 | 0.0653 | 0.9770 | 0.9892 | 0.9943 |

Ranichauri | |||||

CANFIS-9 | 0.1394 | 0.0475 | 0.9846 | 0.9942 | 0.9959 |

MLPNN-5 | 0.1428 | 0.0487 | 0.9838 | 0.9936 | 0.9958 |

RBNN-9 | 0.3179 | 0.1085 | 0.9198 | 0.9665 | 0.9790 |

SOMNN-9 | 0.4616 | 0.1575 | 0.8310 | 0.9151 | 0.9495 |

MLR-9 | 0.1967 | 0.0671 | 0.9693 | 0.9887 | 0.9918 |

Station/Model | Testing period | ||||
---|---|---|---|---|---|

RMSE (mm/month) | SI | COE | PCC | WI | |

Pantnagar | |||||

CANFIS-5 | 0.0978 | 0.0261 | 0.9963 | 0.9982 | 0.9991 |

MLPNN-5 | 0.1082 | 0.0289 | 0.9955 | 0.9980 | 0.9989 |

RBNN-5 | 0.1972 | 0.0527 | 0.9850 | 0.9925 | 0.9962 |

SOMNN-5 | 0.1487 | 0.0397 | 0.9915 | 0.9960 | 0.9978 |

MLR-5 | 0.2445 | 0.0653 | 0.9770 | 0.9892 | 0.9943 |

Ranichauri | |||||

CANFIS-9 | 0.1394 | 0.0475 | 0.9846 | 0.9942 | 0.9959 |

MLPNN-5 | 0.1428 | 0.0487 | 0.9838 | 0.9936 | 0.9958 |

RBNN-9 | 0.3179 | 0.1085 | 0.9198 | 0.9665 | 0.9790 |

SOMNN-9 | 0.4616 | 0.1575 | 0.8310 | 0.9151 | 0.9495 |

MLR-9 | 0.1967 | 0.0671 | 0.9693 | 0.9887 | 0.9918 |

### Monthly ET_{o} estimation at Pantnagar station

_{o}at Pantnagar station. The performance of CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models were evaluated statistically using RMSE, SI, COE, PCC and WI, and visually inspected using line, scatter, relative error and Taylor plots. The values of RMSE, SI, COE, PCC and WI of the applied models during the testing period are reported in Table 5. As clearly seen from Table 5, the CANFIS-5 model had the best performance (RMSE = 0.0978 mm/month, SI = 0.0261, COE = 0.9963, PCC = 0.9982 and WI = 0.9991) with two Gaussian MFs for each input, activation function of hyperbolic tangent. Increasing MFs' number did not improve the model performance. The various tests were made with MLPNN-5 using one, two and thee hidden layers with varying numbers of neurons in hidden layer. The MLPNN-5 with 3-5-1 architecture (three inputs, five neurons and one output, respectively) was found to be optimal for ET

_{o}estimation with RMSE = 0.1082 mm/month, SI = 0.0289, COE = 0.9958, PCC = 0.9980 and WI = 0.9989. The RBNN-5, SOMNN-5, and MLR-5 models have RMSE = 0.1972, 0.1487 and 0.2445 mm/month; SI = 0.0527, 0.0397 and 0.0653; COE = 0.9850, 0.9915 and 0.9770; PCC = 0.9925, 0.9960 and 0.9892; and WI = 0.9962, 0.9978 and 0.9943, respectively. Table 5 clearly showed that the CANFIS-5 model has lower RMSE and SI, and higher COE, PCC and WI than the MLPNN-5, SOMNN-5, RBNN-5 and MLR-5 models, respectively. The regression equation for the MLR-5 model with their intercepts and regression coefficients obtained from the training phase are written as: The temporal variation between the observed and estimated monthly ET

_{o}values of CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during the testing period are plotted using line plot (left side) and scatter plot (right side) in Figure 6(a)–6(e), respectively. As observed from Figure 6(a)–6(e), the regression line and best fit line (1:1) for all the models were perfectly close to each other; however, these lines were closer to each other for CANFIS-5 (Figure 6(a)) and MLPNN-5 (Figure 6(b)) models, respectively. It can be concluded that based on the statistical and visual comparisons, the ranks of the model (from the best to worst) can be classified as the CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models for Pantnagar station, respectively.

Using Equation (23), the relative error percentage between observed and estimated monthly ET_{o} values of CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during the testing period was calculated and presented graphically in Figure 7. As Figure 7 demonstrates, the RE percentage concentrated around the ±10% upper band (UB) and lower band (LB) for 100% (CANFIS-5), 98.61% (MLPNN-5), 81.94% (RBNN-5), 97.22% (SOMNN-5) and 81.94% (MLR-5) of testing dataset. Here, the CANFIS-5 model shows less percentage of RE as compared to the MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models in the testing dataset. It was also noticed that the maximum distribution of RE for all given models appeared in the peak value of monthly ET_{o}.

The spatial pattern of estimated and observed values' monthly ET_{o} by CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during the testing period was also evaluated by using the Taylor diagram (TD). Taylor (2001) provided a polar plot for acquiring a visual judgement of model performance. It has the ability to emphasize the accuracy and efficiency of models based on the observed values. The Taylor diagram exhibits three specific statistics (i.e., correlation coefficient, normalized standard deviation and RMSE). Figure 8 provides the Taylor diagram for observed and estimated values using the applied models at Pantnagar station. Figure 8 demonstrated that the CANFIS-5 and MLPNN-5 models provided the lower RMSE, lower standard deviation and higher correlation coefficient than the SOMNN-5, RBNN-5 and MLR-5 models. Hence, the CANFIS-5 model with selected inputs (T_{max}, U_{s}, R_{s}) can be used for modelling monthly ET_{o} at Pantnagar station.

### Monthly ET_{o} estimation at Ranichauri station

_{o}at Ranichauri station. The performance of CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models was evaluated using RMSE, SI, COE, PCC and WI, and visually inspected using line, scatter, relative error and Taylor plots. The value of RMSE, SI, COE, PCC and WI of the applied models during the testing period is reported in Table 5. As observed from Table 5, the lowest RMSE = 0.1394 mm/month and SI = 0.0475, and the highest COE = 0.9846, PCC = 0.9942 and WI = 0.9959 were found for the CANFIS-9 model with two Gaussian MFs for each input. Table 5 clearly indicates that the CANFIS-9 model has a lower value of RMSE and SI, and higher COE, PCC and WI than the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models, respectively. The RBNN-9, SOMNN-9 and MLR-9 models have RMSE = 0.3179, 0.4616 and 0.1967 mm/month; SI = 0.1085, 0.1575 and 0.0671; COE = 0.9198, 0.8310 and 0.9693; PCC = 0.9665, 0.9151 and 0.9887; and WI = 0.9790, 0.9495 and 0.9918, respectively. It should be noted that the MLPNN-9 model closely follows the CANFIS-9 model. The regression equation for the MLR-9 model with its intercepts and regression coefficients obtained from the training phase is written as:

The temporal variation between observed and estimated monthly ET_{o} values of CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period is plotted using line plot (left side) and scatter plot (right side) in Figure 9(a)–9(e), respectively. As observed from the time variation graphs (line and scatter), the estimates of the CANFIS-9 were closer to the observed monthly ET_{o} values (PM FAO-56) compared to other models. It is clear from the scatterplots that the CANFIS-9 has less scattered estimates than the other models. It should be noted that all the models underestimated the peak ET_{o} values. This may be due to the fact that there is insufficient number of peak values in the training and applied methods cannot learn the process. The ranks of the models were the CANFIS-9, MLPNN-9, MLR-9, RBNN-9 and SOMNN-9 for Ranichauri station based on the statistical and visual comparisons, respectively.

The relative error percentage (Equation (23)) between observed and estimated monthly ET_{o} values of CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period is presented in Figure 10. As Figure 10 demonstrates, the RE percentage was between ±10% upper band (UB) and lower band (LB) for 95.83% (CANFIS-9), 93.75% (MLPNN-9), 72.91% (RBNN-9), 75% (SOMNN-9) and 87.5% (MLR-9) of testing dataset. Here, the CANFIS-9 model shows lesser percentage of RE as compared to the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models in the testing dataset. It also indicated that the maximum distribution of RE for all prescribed models appeared in the peak value of monthly ET_{o}.

The spatial pattern of observed and estimated values of monthly ET_{o} by CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period was also evaluated by using the Taylor diagram (TD). Figure 11 provides the TD for observed and estimated values using the applied models at Ranichauri station. Figure 11 illustrates that the CANFIS-9 model provided lower RMSE and higher correlation coefficient than the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models. Hence, the CANFIS-9 model with selected inputs (T_{min}, T_{max}, RH, U_{s}, R_{s}) can be successfully used for monthly ET_{o} estimation at Ranichauri station.

## DISCUSSION

In this study, the feasibility of a hybrid model, i.e., co-active fuzzy inference system was evaluated for estimating monthly reference evapotranspiration at Pantnagar and Ranichauri stations. The CANFIS model is the integration of an artificial neural network and fuzzy inference system in a single topology. The significant combination of input variables was decided using a non-linear modelling tool, i.e., Gamma test. The estimates yielded by the CANFIS model were compared to the multilayer perceptron neural network, radial basis neural network, self-organizing map neural network and multiple linear regression models using statistical indicators such as root mean squared error, scatter index, COE, PCC, Willmott index, relative error, and visual basis using line plot, scatter plot, relative error plot and Taylor diagram for both the stations under study. The CANFIS model with Gaussian membership functions, Takagi–Sugeno–Kang fuzzy inference system, hyperbolic tangent activation function, and DBD learning algorithm improved substantially the performance of modelling by increasing the COE, PCC and WI and reducing the RMSE, SI and RE measurements at both the stations. The estimation accuracy of the MLPNN-5, RBNN-5, SOMM-5 and MLR-5 models with respect to RMSE was reduced by 10%, 50%, 34% and 60% using the CANFIS-5 models at Pantnagar station, while at Ranichauri station, the estimation accuracy of the MLPNN-9, RBNN-9, SOMM-9 and MLR-9 models with respect to RMSE was decreased by 2%, 56%, 70% and 29% by applying the CANFIS-9 model, respectively. The applied modelling strategy builds a standard and reliable intelligent system that can be used for Pantnagar and Ranichauri stations and is extremely valuable for water resources managers, agriculture and irrigation engineers and agronomists.

Over the past decade, throughout the world, studies have been conducted by researchers on reference evapotranspiration or pan-evaporation estimation using numerous artificial intelligence techniques (hybrid or simple) (Aytek 2009; Cobaner 2011; Karimaldini & Shui 2012; Tabari *et al.* 2012b; Shiri *et al.* 2014; Marti *et al.* 2015; Shrestha & Shukla 2015; Kumar *et al.* 2016; Ghorbani *et al.* 2017; Shiri 2017, 2019b; Landeras *et al.* 2018; Pour *et al.* 2018; Saggi & Jain 2018). Aytek (2009) applied the CANFIS for modelling daily reference evapotranspiration from three meteorological stations located in the USA. The performance of CANFIS model was compared to the California Irrigation Management Information System (CIMIS) Penman equation, Penman–Monteith equation, the HS equation and the Turc equation based on root mean square error, coefficient of efficiency, adjusted coefficient of efficiency and determination coefficient. The results of analysis revealed that the performance of CANFIS model with four inputs (solar radiation, mean temperature, relative humidity and wind speed) was superior to the other models. Tabari *et al.* (2012a) utilized the CANFIS and MLP model to predict the daily pan-evaporation in the semi-arid region of Iran. They reported that the CANFIS model outperformed the MLP model for daily pan-evaporation prediction in the study region. Ghorbani *et al.* (2017) estimated pan-evaporation using hybrid multilayer perceptron-firefly algorithm (MLP-FFA) in the north of Iran. The estimates of MLP-FFA model were compared to the traditional MLP and support vector machine (SVM) models using statistical indicators. They found that the hybrid model (MLP-FFA) performed better than the traditional models (MLP and SVM). Saggi & Jain (2018) examined the performance of deep learning-multilayer perceptrons (DL), generalized linear model (GLM), random forest (RF) and gradient-boosting machine (GBM) in H_{2}O environment in estimating the daily ET_{o} at Hoshiarpur and Patiala districts of Punjab, India. They reported that the DL models outperformed the other models for daily ET_{o} estimation in the study region. Shiri *et al.* (2019) applied the data splitting strategy to GEP for estimating the daily ET_{o} from five meteorological locations in northwestern Iran, and they found good results of GEP under different scenarios.

The above-reported studies also confirmed the supremacy of hybrid models in the estimation of reference evapotranspiration or pan-evaporation. Thus, this study provided conclusive evidence that the estimation of monthly reference evapotranspiration can be done effectively in order of accuracy by CANFIS-5 > MLPNN-5 > SOMNN-5 > RBNN-5 > MLR-5 models at Pantnagar station, and by CANFIS-9 > MLPNN-9 > MLR-9 > RBNN-9 > SOMNN-9 models at Ranichauri station.

## CONCLUSIONS

This study was conducted to examine the potential of CANFIS model against the MLPNN, RBNN, SOMNN and MLR models for estimating the monthly reference evapotranspiration at Pantnagar and Ranichauri stations, situated in foothills of Indian central Himalayan region of Uttarakhand State, India. For estimating monthly reference evapotranspiration at both the stations, the significant combination of input variables for CANFIS, MLPNN, RBNN, SOMNN and MLR models were decided based on the minimum value of gamma statistics (i.e., gamma, gradient, standard error and V_{ratio}) obtained by applying the Gamma test. The following specific conclusions were derived from this study:

The three input variable combination (maximum air temperature, wind speed and solar radiation) was selected as the most appropriate combination based on the GT for estimating monthly reference evapotranspiration at Pantnagar station.

The five input variable combination (minimum and maximum air temperatures, relative humidity, wind speed and solar radiation) was selected as the most appropriate combination based on the GT for estimating monthly reference evapotranspiration at Ranichauri station.

Based on performance indicators and a visual inspection, the performance of CANFIS model was found superior to the MLPNN, RBNN, SOMNN and MLR models for estimation of monthly reference evapotranspiration at Pantnagar and Ranichauri stations.

The performance of the MLR model was found to be worst at Pantnagar station, while at Ranichauri station, the performance of MLR model was better than the RBNN and SOMNN models.

Due to the changing climatic environment, this study confirmed that at low altitude (Pantnagar) only three climatic variables (i.e., T

_{max}, U_{s}and R_{s}) are required for estimating the monthly ET_{o}, while at high altitude (Ranichauri), five climatic variables (i.e., T_{min}, T_{max}, RH, U_{s}, and R_{s}) are required for estimating the monthly ET_{o.}The proposed CANFIS models guide irrigation engineers and agriculturists towards better estimation of monthly reference evapotranspiration at study stations in light of data availability.

The results of CANFIS models would help local stakeholders in terms of irrigation scheduling, and planning and management of water resources.

Since this study focuses on a specific area of India (i.e., Pantnagar and Ranichauri stations), the results from this research cannot generalize the capability and accuracy of applied models for other climatic zones in the world. Thus, it is recommended that areal extension (e.g., multi-case study including other climatic conditions) can confirm the generalization of applied models. Therefore, these approaches can be accomplished based on spatial-temporal scales including different climatic zones. Furthermore, the various percentages of training and testing datasets for different years should be considered for better predictability of data-driven models for future studies. The obtained results of this study may be compared with other machine learning (e.g., simple and hybrid approaches) and empirical models.