Abstract
Reference evapotranspiration (ETo) is a major component of the hydrological cycle, linking irrigation water requirements with the planning and management of water resources. In this research, the potential of the co-active neuro-fuzzy inference system (CANFIS) was investigated against the multilayer perceptron neural network (MLPNN), radial basis neural network (RBNN), self-organizing map neural network (SOMNN) and multiple linear regression (MLR) to estimate the monthly ETo at Pantnagar and Ranichauri stations, located in the foothills of the Indian central Himalayas of Uttarakhand State, India. The significant combination of input variables for the implemented techniques was decided by the Gamma test (GT). The results obtained by CANFIS models were compared with MLPNN, RBNN, SOMNN and MLR models based on performance evaluation indicators and visual inspection using line, scatter and Taylor plots for both stations. The comparison revealed that the CANFIS-5/CANFIS-9 models (RMSE = 0.0978/0.1394, SI = 0.0261/0.0475, COE = 0.9963/0.9846, PCC = 0.9982/0.9942 and WI = 0.9991/0.9959) with three and five input variables provide superior results for estimating monthly ETo at Pantnagar and Ranichauri stations, respectively. The adopted modelling strategy can also be used to build a reliable intelligent system for estimating the monthly ETo at the study stations.
INTRODUCTION
Estimation of reference evapotranspiration (ETo) is essential for hydrologic water balance, irrigation system design and management, crop yield simulation, and planning and management of water resources (Sentelhas et al. 2010). ETo can be measured directly using lysimeters or eddy covariance, or calculated indirectly from meteorological data. However, measuring ETo with lysimeters is time-consuming and requires precisely planned experiments (Lopez-Urrea et al. 2006). The Food and Agriculture Organization (FAO-56) standardized Penman–Monteith (PM) equation was developed for estimating ETo with reference to a hypothetical crop with an assumed height of 0.12 m, a surface resistance of 70 s·m−1 and an albedo of 0.23 (Allen et al. 1998). The FAO-56 PM equation requires numerous meteorological variables, which may be unavailable or missing at some locations, especially in developing countries. Therefore, alternative approaches that require fewer meteorological inputs are needed (Kumar et al. 2002; Yassin et al. 2016; Kisi & Alizamir 2018).
In recent years, soft computing techniques have been successfully applied for modelling reference evapotranspiration using various available meteorological variables (Sudheer et al. 2003; Kisi 2007; Zanetti et al. 2007; Kim & Kim 2008; Kumar et al. 2008; Rahimikhoob 2010; Tabari & Talaee 2012; Shiri et al. 2013, 2019; Falamarzi et al. 2014; Patil & Deka 2015; Kisi & Demir 2016; Banda et al. 2017; Gavili et al. 2018; Karbasi 2018; Kisi & Alizamir 2018; Tao et al. 2018; Shiri 2018, 2019a, 2019b; Granata 2019). Kisi et al. (2015) examined the comparative potential of multilayer perceptron artificial neural networks (MLP-ANN), adaptive neuro-fuzzy inference system (ANFIS) with grid partition (ANFIS-GP), ANFIS with subtractive clustering (ANFIS-SC) and genetic expression programming (GEP) in predicting monthly reference evapotranspiration from 50 stations located in Iran. They concluded that the ANFIS-GP models performed better than the other models. Mattar & Alazba (2018) employed GEP and MLR techniques in modelling monthly reference evapotranspiration from 27 stations located in Egypt and found that the GEP models are more powerful models. Pour et al. (2018) used support vector machine (SVM), ANFIS and GEP approaches for modelling monthly potential evapotranspiration in Iran and found that the SVM models performed better than the ANFIS and GEP models. Sanikhani et al. (2018a) investigated the comparative potential of MLP, generalized regression neural network (GRNN), RBNN, ANFIS-GP, ANFIS-SC, GEP and Hargreaves–Samani (HS) techniques in modelling monthly reference evapotranspiration at Antalya and Isparta stations located in Turkey. The results of comparison showed that the GEP and GRNN models estimated successfully at Antalya station, while the RBNN and ANFIS-SC models performed better than the other models at Isparta station. 
Granata (2019) estimated daily ETo in central Florida, USA, using M5P regression tree, bagging, random forest (RF) and support vector regression (SVR) models. He found that the M5P regression tree model with net solar radiation, sensible heat flux, soil moisture content, wind speed, mean relative humidity and mean temperature as inputs outperformed the other models for daily ETo estimation. Shiri (2019a, 2019b) compared the performance of gene expression programming (GEP) against temperature-based (HS), radiation/humidity-based (Priestley–Taylor, Makkink and Turc) and mass-transfer-based (Dalton, Trabert and Penman) equations in modelling monthly ETo at five meteorological locations in Iran. The analysis revealed that the GEP models (mass-transfer-based) performed better than the other models.
Furthermore, a number of relevant applications of soft computing and time series methods have been reported in modelling various hydrological processes, such as river flow forecasting (Yaseen et al. 2015a, 2018a, 2019; Papacharalampous et al. 2019); forecasting air temperature (Sanikhani et al. 2018c); soil temperature estimation (Sanikhani et al. 2018b); stream-flow forecasting (Yaseen et al. 2015b, 2016a, 2016b, 2017; Granata et al. 2018); suspended sediment load estimation (Khosravi et al. 2018; Kisi & Yaseen 2019); rainfall pattern forecasting (Yaseen et al. 2018b); flood forecasting (Solomatine & Xue 2004; Singh et al. 2010); soil moisture estimation (Ahmad et al. 2010); rainfall–runoff modelling (Granata et al. 2016); and predicting monthly temperature and precipitation using stochastic models (Papacharalampous et al. 2018).
This research utilized the Gamma test (GT) for selecting the significant combination of input variables in modelling monthly ETo at study stations. In the recent decade, on global scale, several studies have been conducted using GT in diverse fields such as modelling daily pan-evaporation (Moghaddamnia et al. 2008; Piri et al. 2009; Goyal et al. 2014; Malik et al. 2017b, 2018; Ashrafzadeh et al. 2018); rainfall–runoff and river flow modelling (Remesan et al. 2009; Malik & Kumar 2018; Singh et al. 2018); and suspended sediment load (SSL) modelling (Kakaei-Lafdani et al. 2013; Malik et al. 2017a). Moghaddamnia et al. (2008) decided on an optimal input variables combination for ANN and ANFIS approaches using GT for modelling daily evaporation in an arid and semi-arid region of Iran. They reported the high ability of the ANN and ANFIS models in daily evaporation estimation. Kakaei-Lafdani et al. (2013) utilized the GT to select the most influential inputs for ANN and SVM for SSL estimation in Doiraj River, Iran and found that the ANN and SVM models perform better with the selected inputs using GT. Malik et al. (2017a) applied GT for selecting the appropriate input variables combination for CANFIS, MLPNN, MLR and multiple non-linear regression (MNLR) techniques in simulating daily SSL in Pranhita river basin, India. They found that the CANFIS model performs better than the other models using its selected input combination by the GT. Malik et al. (2018) applied RBNN, SOMNN, MLR, Penman, Stephens-Stewart, Griffiths, Christiansen, Priestley–Taylor and Jensen–Burman–Allen approaches for estimation of daily evaporation at Pantnagar, India. The significant combination of input variables for RBNN, SOMNN and MLR was selected using the GT. They found the RBNN model performed better than the other models.
So far, no literature has been found that utilized the GT for identifying the optimal input combination for CANFIS, MLPNN, RBNN, SOMNN and MLR in monthly reference evapotranspiration estimation. In view of the above-mentioned literature, this study was conducted with the following specific objectives: (i) to decide the significant input variable combination for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches by utilizing the GT; (ii) to compare the accuracy of CANFIS models with MLPNN, RBNN, SOMNN and MLR models based on various performance indicators and visual inspection for ETo estimation at Pantnagar and Ranichauri stations.
MATERIALS AND METHODS
Study area and data acquisition
The study was conducted at Pantnagar (29° 0′ 0″ N latitude, 79° 38′ 0″ E longitude), at an altitude of 243.8 m above mean sea level (msl), and Ranichauri (30° 18′ 40″ N latitude, 78° 24′ 35″ E longitude), at an altitude of 2,000 m above msl (Figure 1). Both stations are located in the foothills of the Indian central Himalayan region of Uttarakhand State, India. The average annual rainfall is approximately 1,400 mm at Pantnagar station and approximately 1,176 mm at Ranichauri station. The monthly meteorological data of minimum and maximum air temperatures (Tmin and Tmax), wind speed (Us), relative humidity (RH) and solar radiation (Rs) were collected from the Crop Research Centre (CRC) of Pantnagar and Ranichauri stations. The available data of 32 years (January 1985–December 2016) for Pantnagar station and 19 years (January 1994–December 2012) for Ranichauri station are presented graphically in Figure 2 using box and whisker plots. Read from bottom to top, each box and whisker plot shows the minimum, first quartile, median, third quartile and maximum values of a meteorological variable.
Figure 2. (a) and (b) Box and whisker plots of meteorological variables at the study stations.
FAO-56 Penman–Monteith equation
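For reference, the FAO-56 PM equation is commonly written in the daily form given by Allen et al. (1998), reproduced below. The symbols are those of FAO-56 and are assumed here rather than taken from this study's own notation; in particular, G in this equation is the soil heat flux density and is unrelated to the gradient G reported for the Gamma test.

```latex
ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}
            {\Delta + \gamma\,(1 + 0.34\,u_2)}
% ET_o  : reference evapotranspiration (mm day^{-1})
% R_n   : net radiation at the crop surface (MJ m^{-2} day^{-1})
% G     : soil heat flux density (MJ m^{-2} day^{-1})
% T     : mean air temperature at 2 m height (deg C)
% u_2   : wind speed at 2 m height (m s^{-1})
% e_s, e_a : saturation and actual vapour pressure (kPa)
% \Delta : slope of the saturation vapour pressure curve (kPa degC^{-1})
% \gamma : psychrometric constant (kPa degC^{-1})
```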



Gamma test


In this study, the significant combination of input variables for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches was selected based on the minimum values of Γ, SE, G and Vratio for reference evapotranspiration estimation at Pantnagar and Ranichauri stations.
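The GT statistics named above can be illustrated with a minimal sketch of the near-neighbour formulation of the Gamma test. This is not the software used in the study: the function name `gamma_test`, the choice of p = 10 nearest neighbours and the use of the residual standard error of the regression line as SE are assumptions made for illustration.

```python
import numpy as np

def gamma_test(X, y, p=10):
    """Return (Gamma, gradient, standard error, Vratio) for inputs X and output y.

    Assumed formulation: for k = 1..p, delta(k) is the mean squared distance
    to the k-th nearest neighbour in input space and gamma(k) is half the mean
    squared difference of the corresponding outputs. Gamma is the intercept of
    the least-squares line gamma = Gamma + A * delta.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Pairwise squared Euclidean distances in input space (O(n^2) memory).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)            # a point is not its own neighbour
    order = np.argsort(d2, axis=1)[:, :p]   # indices of the p nearest neighbours
    rows = np.arange(n)[:, None]
    delta = d2[rows, order].mean(axis=0)                        # delta(k)
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)   # gamma(k)
    slope, intercept = np.polyfit(delta, gamma, 1)
    resid = gamma - (intercept + slope * delta)
    se = np.sqrt((resid ** 2).sum() / (p - 2))  # residual standard error (assumed SE)
    v_ratio = intercept / y.var()               # Gamma scaled by output variance
    return intercept, slope, se, v_ratio
```

A small Γ and a Vratio close to zero suggest that the chosen inputs explain most of the output variance, which is the selection criterion applied in this study.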
Co-active neuro-fuzzy inference system
Jang et al. (1997) introduced the concept of co-active neuro-fuzzy inference system (CANFIS). The architecture of CANFIS is composed of five layers, fuzzification layer (fuzzy membership), rule layer (and multiplication), normalization layer, defuzzification layer (consequent) and summation layer (fuzzy association), and each input is processed through these five layers (Figure 3). The functioning of each layer and detailed information is given by Aytek (2009) and Tabari et al. (2012a).











Each node output represents the firing strength (weights) of a rule. T-norm operators that perform generalized AND can be used as the node function in this layer.
For convenience, the outputs of this layer are called normalized firing strengths.
In this research, the network of CANFIS was constructed using two to three Gaussian membership functions (MFs), Sugeno fuzzy model (Takagi & Sugeno 1985), hyperbolic tangent activation function (ranges from −1 to 1) for data normalization, and delta-bar-delta (DBD) learning algorithm with the threshold of 0.001 and 1,000 iterations in NeuroSolutions software.
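The five-layer flow described above (fuzzification, rule firing strength, normalization, consequent and summation) can be sketched as a single forward pass of a first-order Sugeno system with Gaussian MFs. The function names, the use of one Gaussian MF per input per rule and the product T-norm are illustrative assumptions; this is not the NeuroSolutions implementation and it omits training.

```python
import numpy as np

def gaussian_mf(x, centre, sigma):
    """Gaussian membership grade of x for the given centre and width."""
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2)

def sugeno_forward(x, centres, sigmas, consequents):
    """One forward pass of a first-order Sugeno fuzzy system.

    x           : (n_inputs,) input vector
    centres     : (n_rules, n_inputs) Gaussian MF centres (hypothetical values)
    sigmas      : (n_rules, n_inputs) Gaussian MF widths (hypothetical values)
    consequents : (n_rules, n_inputs + 1) linear coefficients, bias last
    """
    x = np.asarray(x, dtype=float)
    centres = np.asarray(centres, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    consequents = np.asarray(consequents, dtype=float)
    mu = gaussian_mf(x, centres, sigmas)               # layer 1: fuzzification
    w = mu.prod(axis=1)                                # layer 2: firing strength (product T-norm)
    w_norm = w / w.sum()                               # layer 3: normalized firing strength
    f = consequents[:, :-1] @ x + consequents[:, -1]   # layer 4: linear rule outputs
    return float((w_norm * f).sum())                   # layer 5: summation

# Example with two rules and two inputs (all numbers are made up).
out = sugeno_forward(
    x=[0.5, 1.0],
    centres=[[0.0, 0.0], [1.0, 1.0]],
    sigmas=[[1.0, 1.0], [1.0, 1.0]],
    consequents=[[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
)
```

Because both rules in the example share the same consequent, the output equals that linear function of the inputs regardless of the firing strengths, which is a convenient sanity check for the normalization step.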
Multilayer perceptron neural network
Haykin (1998) first introduced multilayer perceptron neural network (MLPNN), which consists of layers of parallel processing elements (called neurons), with each layer being fully connected to the preceding layer by interconnection strengths or weights (W). Figure 4 illustrates a feed-forward MLPNN, consisting of an input layer (i), hidden layer (j) and output layer (k), with the inter-connection weights (Wij and Wjk) between layers of neurons. One or more hidden layers exist between the input and output layers. The numbers of input and output neurons are specified by the problem to which the network is applied (i.e., the number of predictors and predictands, respectively). The hydrologist must specify the number of hidden layers and hidden neurons needed for accurate mapping of the training dataset. Initial estimated weight values are progressively corrected during a training process that compares predicted outputs to known outputs and backpropagates any errors (from right to left in Figure 4) to determine the appropriate weight adjustments necessary to minimize the errors (Dawson & Wilby 2001).




Radial basis neural network
The radial basis neural network (RBNN) is a branch of ANN, and the concept of RBNN was given by Bishop (1995). The radial basis functions (RBF) of the RBNN serve as the activation function. The architecture of RBNN is composed of three layers (input layer, hidden layer and output layer). The functioning of the hidden layer is to create clusters and then outputs are multiplied by the weights; these multiplied weights are incorporated into the hidden layer using the multivariate interpolation functions. In this study, different quantities of hidden layer neurons and spread constants were analysed using a trial-and-error method in NeuroSolutions software. The functioning of each layer and more insightful details of RBNN are given by Bishop (1995), Kisi (2009) and Banda et al. (2017).
Self-organizing map neural network
Kohonen (1982) introduced the self-organizing map neural network (SOMNN) algorithm for effective clustering analysis. The SOMNN procedure is composed of three steps: (i) initialization: random values are chosen for the initial weights; (ii) winner finding: the neuron whose weight vector best matches the input vector is identified (winner weights); (iii) weight updating: the weights of the winner and its neighbourhood neurons are adjusted towards the input vectors. More detailed theoretical information about SOMNN is given by Chang et al. (2007, 2010). In this study, the size of SOMNN, i.e., rows × columns matrix, neighbourhood shape, hidden layer, processing elements, activation function and learning algorithm were decided using a trial-and-error procedure in NeuroSolutions software, to train and test the obtained networks based on training and testing datasets, respectively.
Multiple linear regression
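As a minimal sketch, MLR expresses ETo as an intercept plus a weighted sum of the selected meteorological inputs, with coefficients obtained by ordinary least squares. The helper names `fit_mlr` and `predict_mlr` and the synthetic data are assumptions for illustration only; they are not the study's fitted models or data.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit of y = b0 + b1*x1 + ... + bk*xk.

    Returns the coefficient vector [b0, b1, ..., bk].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])   # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Evaluate the fitted linear model on new inputs."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Synthetic example with three inputs standing in for Tmax, Us and Rs.
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(20, 3))
y_demo = 0.5 + 0.1 * X_demo[:, 0] + 0.3 * X_demo[:, 1] + 0.2 * X_demo[:, 2]
coef_demo = fit_mlr(X_demo, y_demo)
```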



Model performance evaluation indicators
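The five indicators reported in this study (RMSE, SI, COE, PCC and WI) can be sketched under their commonly used definitions: SI as RMSE divided by the mean observed value, COE as the Nash–Sutcliffe coefficient of efficiency, PCC as the Pearson correlation coefficient and WI as the Willmott index of agreement. These definitions are assumptions; the SI definition is at least consistent with the reported values, since 0.0978/3.74 ≈ 0.026 for CANFIS-5 at Pantnagar.

```python
import numpy as np

def evaluate(obs, est):
    """Return RMSE, SI, COE, PCC and WI for observed and estimated series."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    err = est - obs
    obs_mean = obs.mean()
    sse = np.sum(err ** 2)
    rmse = np.sqrt(np.mean(err ** 2))
    si = rmse / obs_mean                                   # scatter index
    coe = 1.0 - sse / np.sum((obs - obs_mean) ** 2)        # Nash-Sutcliffe efficiency
    pcc = np.corrcoef(obs, est)[0, 1]                      # Pearson correlation
    wi = 1.0 - sse / np.sum(
        (np.abs(est - obs_mean) + np.abs(obs - obs_mean)) ** 2
    )                                                      # Willmott index of agreement
    return {"RMSE": rmse, "SI": si, "COE": coe, "PCC": pcc, "WI": wi}

# Toy example: every estimate is one unit too high.
scores = evaluate([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

In the toy example the constant bias leaves PCC at 1 while COE drops below zero, which shows why correlation alone is not sufficient for ranking models.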






RESULTS
Deciding optimal input variables using GT
The descriptive statistics of training and testing datasets are provided in Table 1 for the Pantnagar and Ranichauri stations. Table 1 shows that Pantnagar station has higher temperatures than Ranichauri station. Tmin and Tmax are negatively skewed and ETo is positively skewed in all datasets. RH is negatively skewed and Us and Rs are positively skewed in most datasets, the exceptions being RH and Us in the Ranichauri testing dataset and Rs in the Pantnagar testing dataset. Solar radiation has the least skewness at Pantnagar station (0.02 and −0.05), i.e., its distribution is close to symmetric. The kurtosis was generally platykurtic (negative), except for the Us data, which are leptokurtic (positive) in the training datasets of both stations. The cross-correlations between input and output variables are given in Table 2 for Pantnagar and Ranichauri stations. At both stations, all pairs of variables were significantly correlated at the 5% significance level except Tmin and RH.
Table 1. Descriptive statistics of the monthly meteorological variables for training and testing periods
Statistical parameters | Tmin (°C) | Tmax (°C) | RH (%) | Us (m/s) | Rs (MJ/m2) | ETo (mm) |
---|---|---|---|---|---|---|
Pantnagar (Training dataset: January 1, 1985–December 31, 2010) | ||||||
Minimum | 4.30 | 14.50 | 36.00 | 0.20 | 8.60 | 1.14 |
Maximum | 26.70 | 40.00 | 89.00 | 4.00 | 25.60 | 8.26 |
Mean | 16.82 | 29.70 | 67.38 | 1.40 | 17.62 | 3.90 |
SD | 7.10 | 5.59 | 12.19 | 0.70 | 4.08 | 1.71 |
Skewness | −0.15 | −0.39 | −0.61 | 0.78 | 0.02 | 0.53 |
Kurtosis | −1.53 | −0.71 | −0.39 | 0.37 | −0.77 | −0.57 |
Pantnagar (Testing dataset: January 1, 2011–December 31, 2016) | ||||||
Minimum | 5.80 | 17.00 | 42.00 | 0.60 | 8.20 | 1.23 |
Maximum | 26.30 | 40.10 | 87.00 | 2.80 | 24.70 | 7.63 |
Mean | 17.27 | 29.77 | 68.71 | 1.38 | 16.38 | 3.74 |
SD | 7.08 | 5.80 | 10.91 | 0.49 | 4.40 | 1.62 |
Skewness | −0.13 | −0.51 | −0.78 | 0.36 | −0.05 | 0.43 |
Kurtosis | −1.57 | −0.54 | −0.29 | −0.03 | −0.82 | −0.55 |
Ranichauri (Training dataset: January 1, 1994–December 31, 2008) | ||||||
Minimum | 0.50 | 8.50 | 29.00 | 1.00 | 9.60 | 1.23 |
Maximum | 17.80 | 28.50 | 94.00 | 1.80 | 25.40 | 5.33 |
Mean | 9.87 | 19.55 | 70.43 | 1.42 | 16.37 | 2.93 |
SD | 5.32 | 4.83 | 14.04 | 0.15 | 4.15 | 1.06 |
Skewness | −0.16 | −0.50 | −0.15 | 0.13 | 0.44 | 0.42 |
Kurtosis | −1.37 | −0.90 | −0.69 | 0.11 | −0.73 | −0.70 |
Ranichauri (Testing dataset: January 1, 2009–December 31, 2012) | ||||||
Minimum | 0.50 | 9.30 | 37.00 | 0.40 | 10.50 | 1.13 |
Maximum | 16.70 | 29.10 | 93.00 | 1.60 | 24.60 | 5.42 |
Mean | 9.83 | 20.16 | 67.38 | 1.05 | 16.38 | 2.93 |
SD | 5.34 | 4.67 | 15.41 | 0.40 | 4.09 | 1.13 |
Skewness | −0.20 | −0.36 | 0.16 | −0.37 | 0.57 | 0.59 |
Kurtosis | −1.43 | −0.54 | −0.89 | −1.28 | −0.74 | −0.48 |
Notes: Tmin and Tmax, minimum and maximum air temperatures (°C); RH, relative humidity (%); Us, wind speed (m/s); Rs, solar radiation (MJ/m2); ETo, reference evapotranspiration (mm); SD, standard deviation.
Table 2. Cross-correlation between the meteorological variables
Station/Variables | Tmin (°C) | Tmax (°C) | RH (%) | Us (m/s) | Rs (MJ/m2) | ETo (mm) |
---|---|---|---|---|---|---|
Pantnagar | ||||||
Tmin (°C) | 1.0 | |||||
Tmax (°C) | 0.834* | 1.0 | ||||
RH (%) | 0.041 | −0.468* | 1.0 | |||
Us (m/s) | 0.476* | 0.544* | −0.396* | 1.0 | ||
Rs (MJ/m2) | 0.621* | 0.883* | −0.648* | 0.606* | 1.0 | |
ETo (mm) | 0.719* | 0.919* | −0.610* | 0.764* | 0.933* | 1.0 |
Ranichauri | ||||||
Tmin (°C) | 1.0 | |||||
Tmax (°C) | 0.862* | 1.0 | ||||
RH (%) | −0.114 | −0.489* | 1.0 | |||
Us (m/s) | 0.407* | 0.456* | −0.320* | 1.0 | ||
Rs (MJ/m2) | 0.727* | 0.883* | −0.619* | 0.525* | 1.0 | |
ETo (mm) | 0.918* | 0.934* | −0.359* | 0.507* | 0.921* | 1.0 |
*Statistically significant correlation at 5% level of significance.
Deciding significant model inputs is a tedious procedure, especially for non-linear hydrological processes. In this study, nine combinations of the available inputs were considered to evaluate their effects on monthly ETo (Table 3). The GT was applied to the full dataset to identify the significant combination of input variables for CANFIS, MLPNN, RBNN, SOMNN and MLR approaches. The GT statistics are reported in Table 4, which shows minimum values of Γ = 0.0240, G = 0.0062, SE = 0.0044 and Vratio = 0.0084 with the mask 01011 for Pantnagar station (Figure 5(a)), and Γ = −0.0022, G = 0.0039, SE = 0.0011 and Vratio = −0.0018 with the mask 11111 for Ranichauri station (Figure 5(b)). The mask indicates which input variables are included (1) or excluded (0), in the order Tmin, Tmax, RH, Us and Rs; for instance, the masks 01011 and 11111 indicate that three and five input variables are used to give the output, i.e., ETo, at Pantnagar and Ranichauri stations, respectively. Therefore, the input combination of three variables (Tmax, Us, Rs) was used for the CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models at Pantnagar station, while five variables (Tmin, Tmax, RH, Us, Rs) were used for the CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models at Ranichauri station to estimate monthly ETo.
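The mask notation can be made concrete with a small decoding sketch. The helper name `decode_mask` is hypothetical, and the variable order (Tmin, Tmax, RH, Us, Rs) is inferred from the masks listed in Table 4, where 01000 corresponds to Tmax alone and 01010 to Tmax and Us.

```python
# Variable order inferred from the masks reported in Table 4 (an assumption).
VARIABLES = ("Tmin", "Tmax", "RH", "Us", "Rs")

def decode_mask(mask):
    """Translate a GT mask such as '01011' into the list of included inputs."""
    if len(mask) != len(VARIABLES) or set(mask) - {"0", "1"}:
        raise ValueError(f"expected a {len(VARIABLES)}-character string of 0s and 1s")
    return [name for name, bit in zip(VARIABLES, mask) if bit == "1"]

pantnagar_inputs = decode_mask("01011")    # inputs selected for the model 5 family
ranichauri_inputs = decode_mask("11111")   # inputs selected for the model 9 family
```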
Table 3. Input combinations (models 1 to 9) for CANFIS, MLPNN, RBNN, SOMNN and MLR at the study stations
Climatic variables | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|
Tmin | | | | | | | ✓ | | ✓ |
Tmax | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
RH | | | | ✓ | | ✓ | ✓ | ✓ | ✓ |
Us | | ✓ | | ✓ | ✓ | | ✓ | ✓ | ✓ |
Rs | | | ✓ | | ✓ | ✓ | | ✓ | ✓ |
Table 4. Gamma test results on various input combinations
Input combination | Γ | G | SE | Vratio | Mask |
---|---|---|---|---|---|
Pantnagar | |||||
Tmax | 0.2011 | 0.0568 | 0.0083 | 0.0702 | 01000 |
Tmax, Us | 0.0609 | 0.1514 | 0.0047 | 0.0212 | 01010 |
Tmax, Rs | 0.1185 | 0.0067 | 0.0096 | 0.0414 | 01001 |
Tmax, RH, Us | 0.0847 | 0.0305 | 0.0099 | 0.0296 | 01110 |
Tmax, Us, Rs | 0.0240 | 0.0062 | 0.0044 | 0.0084 | 01011 |
Tmax, RH, Rs | 0.0909 | 0.0103 | 0.0168 | 0.0318 | 01101 |
Tmin, Tmax, RH, Us | 0.0855 | 0.0118 | 0.0080 | 0.0298 | 11110 |
Tmax, RH, Us, Rs | 0.0454 | 0.0147 | 0.0106 | 0.0158 | 01111 |
Tmin, Tmax, RH, Us, Rs | 0.0492 | 0.0451 | 0.0136 | 0.0172 | 11111 |
Ranichauri | |||||
Tmax | 0.1375 | 0.0402 | 0.0099 | 0.1185 | 01000 |
Tmax, Us | 0.1204 | 0.0468 | 0.0058 | 0.1038 | 01010 |
Tmax, Rs | 0.0097 | 0.0086 | 0.0031 | 0.0084 | 01001 |
Tmax, RH, Us | 0.0288 | 0.0149 | 0.0147 | 0.0248 | 01110 |
Tmax, Us, Rs | 0.0065 | 0.0097 | 0.0012 | 0.0056 | 01011 |
Tmax, RH, Rs | 0.0004 | 0.0056 | 0.0046 | 0.0004 | 01101 |
Tmin, Tmax, RH, Us | 0.0287 | 0.0058 | 0.0061 | 0.0247 | 11110 |
Tmax, RH, Us, Rs | −0.0004 | 0.0056 | 0.0042 | −0.0003 | 01111 |
Tmin, Tmax, RH, Us, Rs | −0.0022 | 0.0039 | 0.0011 | −0.0018 | 11111 |
Notes: Γ, gamma; G, gradient; SE, standard error; Vratio, ratio of Γ to the variance of the output.
Table 5. RMSE, SI, COE, PCC and WI values during testing period of CANFIS, MLPNN, RBNN, SOMNN and MLR models at study stations
Station/Model | RMSE (mm/month) | SI | COE | PCC | WI |
---|---|---|---|---|---|
Pantnagar | |||||
CANFIS-5 | 0.0978 | 0.0261 | 0.9963 | 0.9982 | 0.9991 |
MLPNN-5 | 0.1082 | 0.0289 | 0.9955 | 0.9980 | 0.9989 |
RBNN-5 | 0.1972 | 0.0527 | 0.9850 | 0.9925 | 0.9962 |
SOMNN-5 | 0.1487 | 0.0397 | 0.9915 | 0.9960 | 0.9978 |
MLR-5 | 0.2445 | 0.0653 | 0.9770 | 0.9892 | 0.9943 |
Ranichauri | |||||
CANFIS-9 | 0.1394 | 0.0475 | 0.9846 | 0.9942 | 0.9959 |
MLPNN-9 | 0.1428 | 0.0487 | 0.9838 | 0.9936 | 0.9958 |
RBNN-9 | 0.3179 | 0.1085 | 0.9198 | 0.9665 | 0.9790 |
SOMNN-9 | 0.4616 | 0.1575 | 0.8310 | 0.9151 | 0.9495 |
MLR-9 | 0.1967 | 0.0671 | 0.9693 | 0.9887 | 0.9918 |
Figure 5. (a) and (b) Graphical presentation of GT statistics for different models at study stations.
Monthly ETo estimation at Pantnagar station
Figure 6. Graphical representation of temporal variation between observed and estimated monthly reference evapotranspiration values by (a) CANFIS-5; (b) MLPNN-5; (c) RBNN-5; (d) SOMNN-5; and (e) MLR-5 models during testing period at Pantnagar station.
Using Equation (23), the relative error (RE) percentage between observed and estimated monthly ETo values of the CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during the testing period was calculated and is presented graphically in Figure 7. As Figure 7 demonstrates, the RE fell within the ±10% upper band (UB) and lower band (LB) for 100% (CANFIS-5), 98.61% (MLPNN-5), 81.94% (RBNN-5), 97.22% (SOMNN-5) and 81.94% (MLR-5) of the testing dataset. Thus, the CANFIS-5 model yields smaller RE than the MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models in the testing dataset. It was also noticed that the largest RE values for all models occurred at the peak values of monthly ETo.
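The share of months falling inside the ±10% band can be computed as sketched below, assuming RE is defined as the difference between estimated and observed ETo expressed as a percentage of the observed value. The function names and the toy numbers are illustrative only. With 72 testing months at Pantnagar (2011–2016), the reported 98.61% corresponds to 71 of 72 months and 81.94% to 59 of 72 months.

```python
import numpy as np

def relative_error_percent(obs, est):
    """Relative error (%) of each estimate with respect to the observation."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    return 100.0 * (est - obs) / obs

def share_within_band(obs, est, band=10.0):
    """Percentage of samples whose RE lies within +/- band percent."""
    re = relative_error_percent(obs, est)
    return 100.0 * np.mean(np.abs(re) <= band)

# Toy series (not study data): one of four estimates lies outside the band.
share = share_within_band([1.0, 2.0, 4.0, 5.0], [1.05, 2.5, 4.2, 5.0])
```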
Figure 7. The RE (%) distribution of CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during testing period at Pantnagar station.
The agreement between the observed and estimated monthly ETo values of the CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during the testing period was also evaluated using the Taylor diagram (TD). Taylor (2001) proposed this polar plot for visual judgement of model performance relative to the observed values. The Taylor diagram displays three statistics (i.e., correlation coefficient, normalized standard deviation and RMSE). Figure 8 provides the Taylor diagram for observed and estimated values using the applied models at Pantnagar station. Figure 8 shows that the CANFIS-5 and MLPNN-5 models provided lower RMSE, lower standard deviation and higher correlation coefficient than the SOMNN-5, RBNN-5 and MLR-5 models. Hence, the CANFIS-5 model with the selected inputs (Tmax, Us, Rs) can be used for modelling monthly ETo at Pantnagar station.
Figure 8. Taylor diagram of observed and estimated monthly reference evapotranspiration by CANFIS-5, MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models during testing period at Pantnagar station.
Monthly ETo estimation at Ranichauri station
The temporal variation between observed and estimated monthly ETo values of the CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period is plotted using line plots (left side) and scatter plots (right side) in Figure 9(a)–9(e). As observed from the line and scatter plots, the estimates of the CANFIS-9 model were closer to the observed monthly ETo values (PM FAO-56) than those of the other models. The scatter plots show that the CANFIS-9 estimates are less scattered than those of the other models. It should be noted that all the models underestimated the peak ETo values. This may be because the training dataset contains too few peak values for the applied methods to learn this part of the process. Based on the statistical and visual comparisons, the models ranked in the order CANFIS-9, MLPNN-9, MLR-9, RBNN-9 and SOMNN-9 for Ranichauri station.
Figure 9. Graphical representation of temporal variation between observed and estimated monthly reference evapotranspiration values by (a) CANFIS-9, (b) MLPNN-9, (c) RBNN-9, (d) SOMNN-9, and (e) MLR-9 models during testing period at Ranichauri station.
The relative error (RE) percentage (Equation (23)) between observed and estimated monthly ETo values of the CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period is presented in Figure 10. As Figure 10 demonstrates, the RE was within the ±10% upper band (UB) and lower band (LB) for 95.83% (CANFIS-9), 93.75% (MLPNN-9), 72.91% (RBNN-9), 75% (SOMNN-9) and 87.5% (MLR-9) of the testing dataset. Thus, the CANFIS-9 model yields smaller RE than the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models in the testing dataset. Figure 10 also indicates that the largest RE values for all models occurred at the peak values of monthly ETo.
Figure 10. The RE (%) distribution of CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during testing period at Ranichauri station.
The agreement between observed and estimated values of monthly ETo by the CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during the testing period was also evaluated using the Taylor diagram (TD). Figure 11 provides the TD for observed and estimated values using the applied models at Ranichauri station. It illustrates that the CANFIS-9 model provided lower RMSE and a higher correlation coefficient than the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models. Hence, the CANFIS-9 model with the selected inputs (Tmin, Tmax, RH, Us, Rs) can be successfully used for monthly ETo estimation at Ranichauri station.
Taylor diagram of observed and estimated monthly reference evapotranspiration by CANFIS-9, MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models during testing period at Ranichauri station.
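The quantities summarized by a Taylor diagram can be sketched in a few lines of Python. The snippet below is an illustrative sketch rather than the procedure used in this study: it computes the standard deviations, the correlation coefficient and the centred root mean square difference, which is the RMS quantity a Taylor diagram conventionally displays (it excludes the mean bias, so it can differ from the plain RMSE). The function name and the sample values are hypothetical.

```python
import numpy as np

def taylor_statistics(observed, estimated):
    """Summary statistics conventionally shown on a Taylor diagram."""
    obs = np.asarray(observed, dtype=float)
    est = np.asarray(estimated, dtype=float)
    std_obs = obs.std()                      # population standard deviation
    std_est = est.std()
    corr = np.corrcoef(obs, est)[0, 1]       # Pearson correlation coefficient
    # Centred RMS difference: both series have their means removed first.
    crmsd = np.sqrt(np.mean(((est - est.mean()) - (obs - obs.mean())) ** 2))
    return {"std_obs": std_obs, "std_est": std_est, "corr": corr, "crmsd": crmsd}

# Hypothetical monthly ETo values (mm/day), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
est = [1.2, 1.9, 3.3, 3.7]
stats = taylor_statistics(obs, est)

# The diagram's geometry rests on the law-of-cosines identity
# crmsd^2 = std_obs^2 + std_est^2 - 2 * std_obs * std_est * corr.
lhs = stats["crmsd"] ** 2
rhs = (stats["std_obs"] ** 2 + stats["std_est"] ** 2
       - 2.0 * stats["std_obs"] * stats["std_est"] * stats["corr"])
print(abs(lhs - rhs) < 1e-12)
```

A model plots closer to the observed reference point when its correlation is higher and its centred RMS difference is lower, which is how the comparison among the five models is read from the diagram.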
DISCUSSION
In this study, the feasibility of a hybrid model, i.e., the co-active neuro-fuzzy inference system (CANFIS), was evaluated for estimating monthly reference evapotranspiration at Pantnagar and Ranichauri stations. The CANFIS model integrates an artificial neural network and a fuzzy inference system in a single topology. The significant combination of input variables was decided using a non-linear modelling tool, i.e., the Gamma test. The estimates yielded by the CANFIS model were compared to those of the multilayer perceptron neural network, radial basis neural network, self-organizing map neural network and multiple linear regression models using statistical indicators (root mean squared error, scatter index, COE, PCC, Willmott index and relative error) and visual inspection (line plot, scatter plot, relative error plot and Taylor diagram) for both the stations under study. The CANFIS model with Gaussian membership functions, Takagi–Sugeno–Kang fuzzy inference system, hyperbolic tangent activation function and DBD learning algorithm substantially improved the modelling performance by increasing the COE, PCC and WI and reducing the RMSE, SI and RE at both the stations. Relative to the MLPNN-5, RBNN-5, SOMNN-5 and MLR-5 models, the CANFIS-5 model reduced the RMSE by 10%, 50%, 34% and 60%, respectively, at Pantnagar station, while at Ranichauri station the CANFIS-9 model reduced the RMSE relative to the MLPNN-9, RBNN-9, SOMNN-9 and MLR-9 models by 2%, 56%, 70% and 29%, respectively. The applied modelling strategy builds a standard and reliable intelligent system that can be used for Pantnagar and Ranichauri stations and is valuable for water resources managers, agriculture and irrigation engineers and agronomists.
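The indicators named above, and the percentage RMSE reduction of one model relative to another, can be illustrated with the Python sketch below. The formulas are the commonly used definitions (SI as RMSE divided by the observed mean, COE as the Nash–Sutcliffe efficiency, WI as Willmott's index of agreement) and are assumed here rather than taken from the equations of this paper; the function names and sample values are hypothetical.

```python
import numpy as np

def performance_indicators(observed, estimated):
    """RMSE, SI, COE, PCC and WI under commonly used (ASSUMED) definitions."""
    obs = np.asarray(observed, dtype=float)
    est = np.asarray(estimated, dtype=float)
    err = est - obs
    obs_mean = obs.mean()
    rmse = np.sqrt(np.mean(err ** 2))
    si = rmse / obs_mean                                         # scatter index
    coe = 1.0 - np.sum(err ** 2) / np.sum((obs - obs_mean) ** 2)  # Nash-Sutcliffe
    pcc = np.corrcoef(obs, est)[0, 1]                            # Pearson correlation
    wi = 1.0 - np.sum(err ** 2) / np.sum(
        (np.abs(est - obs_mean) + np.abs(obs - obs_mean)) ** 2)  # Willmott index
    return {"RMSE": rmse, "SI": si, "COE": coe, "PCC": pcc, "WI": wi}

def rmse_reduction_percent(rmse_reference, rmse_candidate):
    """Percentage by which the candidate model lowers the reference RMSE."""
    return 100.0 * (rmse_reference - rmse_candidate) / rmse_reference

# Hypothetical monthly ETo values (mm/day), for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
est = [1.1, 1.9, 3.2, 3.8]
indicators = performance_indicators(obs, est)
print(indicators)

# Hypothetical RMSE values: a candidate with half the reference RMSE.
reduction = rmse_reduction_percent(0.2, 0.1)
print(reduction)
```

With these definitions, a perfect model gives RMSE = 0 and COE = PCC = WI = 1, so higher COE, PCC and WI together with lower RMSE and SI indicate better agreement with the FAO-56 PM values.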
Over the past decade, researchers worldwide have studied reference evapotranspiration or pan-evaporation estimation using numerous artificial intelligence techniques (hybrid or simple) (Aytek 2009; Cobaner 2011; Karimaldini & Shui 2012; Tabari et al. 2012b; Shiri et al. 2014; Marti et al. 2015; Shrestha & Shukla 2015; Kumar et al. 2016; Ghorbani et al. 2017; Shiri 2017, 2019b; Landeras et al. 2018; Pour et al. 2018; Saggi & Jain 2018). Aytek (2009) applied the CANFIS for modelling daily reference evapotranspiration at three meteorological stations located in the USA. The performance of the CANFIS model was compared to the California Irrigation Management Information System (CIMIS) Penman equation, the Penman–Monteith equation, the HS equation and the Turc equation based on root mean square error, coefficient of efficiency, adjusted coefficient of efficiency and determination coefficient. The results revealed that the CANFIS model with four inputs (solar radiation, mean temperature, relative humidity and wind speed) was superior to the other models. Tabari et al. (2012a) utilized the CANFIS and MLP models to predict the daily pan-evaporation in the semi-arid region of Iran. They reported that the CANFIS model outperformed the MLP model for daily pan-evaporation prediction in the study region. Ghorbani et al. (2017) estimated pan-evaporation using a hybrid multilayer perceptron-firefly algorithm (MLP-FFA) in the north of Iran. The estimates of the MLP-FFA model were compared to those of the traditional MLP and support vector machine (SVM) models using statistical indicators. They found that the hybrid model (MLP-FFA) performed better than the traditional models (MLP and SVM).
Saggi & Jain (2018) examined the performance of deep learning-multilayer perceptrons (DL), generalized linear model (GLM), random forest (RF) and gradient-boosting machine (GBM) in the H2O environment for estimating the daily ETo at Hoshiarpur and Patiala districts of Punjab, India. They reported that the DL models outperformed the other models for daily ETo estimation in the study region. Shiri et al. (2019) applied the data splitting strategy to GEP for estimating the daily ETo at five meteorological locations in northwestern Iran, and they reported good GEP results under different scenarios.
The above-reported studies also confirmed the superiority of hybrid models in the estimation of reference evapotranspiration or pan-evaporation. Thus, this study provided conclusive evidence that monthly reference evapotranspiration can be estimated effectively, in decreasing order of accuracy, by the CANFIS-5 > MLPNN-5 > SOMNN-5 > RBNN-5 > MLR-5 models at Pantnagar station, and by the CANFIS-9 > MLPNN-9 > MLR-9 > RBNN-9 > SOMNN-9 models at Ranichauri station.
CONCLUSIONS
This study was conducted to examine the potential of the CANFIS model against the MLPNN, RBNN, SOMNN and MLR models for estimating the monthly reference evapotranspiration at Pantnagar and Ranichauri stations, situated in the foothills of the Indian central Himalayan region of Uttarakhand State, India. For estimating monthly reference evapotranspiration at both the stations, the significant combination of input variables for the CANFIS, MLPNN, RBNN, SOMNN and MLR models was decided based on the minimum value of the gamma statistics (i.e., gamma, gradient, standard error and Vratio) obtained by applying the Gamma test. The following specific conclusions were derived from this study:
The three input variable combination (maximum air temperature, wind speed and solar radiation) was selected as the most appropriate combination based on the GT for estimating monthly reference evapotranspiration at Pantnagar station.
The five input variable combination (minimum and maximum air temperatures, relative humidity, wind speed and solar radiation) was selected as the most appropriate combination based on the GT for estimating monthly reference evapotranspiration at Ranichauri station.
Based on performance indicators and a visual inspection, the performance of CANFIS model was found superior to the MLPNN, RBNN, SOMNN and MLR models for estimation of monthly reference evapotranspiration at Pantnagar and Ranichauri stations.
The performance of the MLR model was found to be the worst at Pantnagar station, while at Ranichauri station, the MLR model performed better than the RBNN and SOMNN models.
This study confirmed that, owing to the differing climatic environment, only three climatic variables (i.e., Tmax, Us and Rs) are required for estimating the monthly ETo at low altitude (Pantnagar), while five climatic variables (i.e., Tmin, Tmax, RH, Us and Rs) are required at high altitude (Ranichauri).
The proposed CANFIS models guide irrigation engineers and agriculturists towards better estimation of monthly reference evapotranspiration at study stations in light of data availability.
The results of CANFIS models would help local stakeholders in terms of irrigation scheduling, and planning and management of water resources.
Since this study focuses on a specific area of India (i.e., Pantnagar and Ranichauri stations), its results cannot be generalized to the capability and accuracy of the applied models in other climatic zones of the world. An areal extension (e.g., a multi-case study including other climatic conditions) is therefore recommended to confirm the generalization of the applied models, and such work could be carried out on spatial-temporal scales covering different climatic zones. Furthermore, future studies should consider various percentages of training and testing datasets for different years to improve the predictability of data-driven models. The results of this study may also be compared with other machine learning (e.g., simple and hybrid approaches) and empirical models.