## Abstract

The Penman–Monteith evapotranspiration (ET) model has superior predictive ability to other methods, but it is challenging to apply in several Indian stations, owing to the need for a large number of climatic variables. The study investigated an artificial neural network (ANN) model for calculating ET for various agro-climatic regions of India. Sensitivity analysis showed that the overall average changes in ET_{0} values for 25% change in the climatic variables were 18, 16, 14, 7, 5, and 4%, respectively, for *T*_{max}, RH_{mean}, *R*_{n}, wind speed, *T*_{min}, and sunshine hours. The dominant climatic variables were identified from the principal component analysis (PCA) and ET_{0} was computed using an ANN with dominant climatic variables. The ANN architecture with backpropagation technique had one hidden layer and neurons ranging from 10 to 30 for all climatic variables and from 5 to 10 for PCA variables. The new ET models were statistically compared with Penman–Monteith ET estimate, and found reliable. PCA variables guaranteed an estimate of ET_{0} accounting for 98% of the variability. The average values of coefficient of determination, standard error of estimate, and percentage efficiency were observed as 0.96, 0.24, and 94%, respectively.

## HIGHLIGHTS

The Penman–Monteith ET model is the standard but data-intensive, so its applicability is limited.

The crucial climatic variables influencing ET are identified for various agro-climatic regions using principal component analysis and sensitivity analysis.

New ET models are developed and compared with the standard Penman–Monteith ET estimate.

Less data-intensive ANN models are proven to be acceptable in estimating ET_{0}.

## SYMBOLS

- ET_{0} potential evapotranspiration (mm/day)
- *I*_{m} moisture index, mm
- *n/N* ratio of actual to maximum possible duration of sunshine hours
- *R*^{2} coefficient of determination
- *R*_{n} net radiation at the crop surface (MJ m^{−2} day^{−1})
- *s* the water surplus, mm
- *d* the water deficit, mm
- *T*_{mean} average temperature (°C)
- *T*_{max} maximum temperature (°C)
- *T*_{min} minimum temperature (°C)
- *U* wind velocity (km day^{−1})

## INTRODUCTION

Computation of evapotranspiration (ET) for various agro-climatic regions is essential for efficient water management. ET is the combined loss of water to the atmosphere due to evaporation from the soil, water surface, and plant transpiration. If water is unlimited, the amount of water lost is determined by atmospheric conditions, and this evaporation power of the atmosphere is described as potential evapotranspiration (ET_{0}). It is challenging to quantify and predict ET, a key factor in models of terrestrial water balance (Joshua *et al.* 2005). Direct measurement of ET is costly and laborious. Simple prediction techniques like Blaney–Criddle and complicated prediction techniques like the Penman method have both been developed during the past few decades (Allen *et al.* 1986). Penman's method made use of variables like surface aerodynamics, net radiation intensity, and evaporation dynamics. Later, Monteith enhanced this approach by taking into account the plant's daily resistance and created the Penman–Monteith (P–M) equation (Allen *et al.* 1998). The Food and Agriculture Organization (FAO) considers ET calculated using the P–M method as the standard and recommends it for all climate zones (Allen *et al.* 1998). Different ET models were developed and evaluated using the P–M approach as a reference (Kisi 2007; Landeras *et al.* 2008). The optimal ET model is chosen based on factors such as data availability, location, season, climate, purpose, and time period (Samani 2000). Actual ET, according to Kumar *et al.* (2002), is a complicated and nonlinear phenomenon that depends on the interplay of various environmental factors, including crop type and development stage, wind speed, air humidity, and temperature. Lysimeters can be used to measure reference ET; however, this approach is very expensive and difficult to use (Valipour 2014). The heat produced by the earth and the solar radiation that is absorbed by the atmosphere raise the temperature of the air. The rate of ET is controlled by the sensible heat of the surrounding air, which transmits energy to the crop. This variability is greater in drier climates (Doorenbos & Pruitt 1977). ET_{0} values estimated by the Hargreaves method were 22% higher than those of the P–M method in the warm climate of Southern Europe (Trajkovic 2007). The variation in temperature and relative humidity (RH) values is primarily responsible for the difference. Similar results were reported by Sentelhas *et al.* (2010) and Kaya *et al.* (2017). Strong spatial and temporal variability of climatic conditions is characteristic of semi-arid environments (Leduc *et al.* 2007).

The artificial neural network (ANN) technology is widely used to solve complex nonlinear interactions in various sciences and has provided many promising results in the field of hydrology and water resources engineering (Ghazvinian *et al.* 2020). The ability to identify a relationship from sufficient data pairs makes it possible for ANNs to solve large-scale complex problems such as pattern recognition and nonlinear modeling (ASCE 2000).

Soft computing models are extensively used in various fields of water resources, such as streamflow simulation, where ANN models performed better than the Hydrologic Engineering Centre – Hydrologic Modeling System (HEC-HMS) (Loyeh & Jamnani 2017); infiltration process, where random forest regression models performed better than an ANN, multi-linear regression, and M5P tree models (Singh *et al.* 2021). Suspended sediment load prediction of a Ganga river stretch with up to 95% accuracy was found possible with the ANN model, with the performance tested by normalized root mean square error, correlation coefficient and Theil's *U* statistics (Gaur *et al.* 2021a). Groundwater levels were modeled using wavelet support vector machine (SVM) and ANN algorithms, and the performance was evaluated using correlation coefficient and Nash–Sutcliffe efficiency index. Wavelet decomposition-based SVR was observed to be superior (Gaur *et al.* 2021b). Podeh *et al.* (2023) carried out calibration of infiltration models using particle swarm optimization. Kia *et al.* (2012) developed a flood model with flood causing factors using ANN to model and simulate flood-prone areas in the southern part of Peninsular Malaysia.

The PCA-based Adaptive Neuro Fuzzy Inference System (ANFIS) developed by Parsaie *et al.* (2018) was designed to estimate the longitudinal dispersion coefficient (*D*_{L}) in rivers, and the performance of the developed model was found suitable to predict *D*_{L}. The piezometric head in the core and the seepage discharge through the body of the earth dam were predicted using soft computing models, including the multi-layer perceptron neural network (MLPNN), SVM, multivariate adaptive regression splines (MARS), genetic programming (GP), M5 algorithm, and group method of data handling (GMDH). The results revealed that all models had a tolerable level of accuracy in predicting the piezometric heads, although the MARS model performed the best and the M5 method performed the worst (Parsaie *et al.* 2021).

Nezaratian *et al.* (2021) approximated the transverse mixing coefficient (TMC) in streams using SVM based on genetic algorithm (GA) and found that efficient TMC estimate by GA-SVM can reduce the complexity by minimizing the number of input parameters.

The effect of climate change on ET was studied by Chakraborty *et al.* (2018) and Li *et al.* (2020). Balmat *et al.* (2019) computed reference crop evapotranspiration (ET_{0}) in a greenhouse based on an adaptive-network-based fuzzy inference system, to estimate ET_{0} with less information than the standard methods. Ramírez *et al.* (2005) explored the use of ANN in precipitation forecasting for a Brazilian watershed and discovered that predictions were accurate. Dahamsheh & Aksoy (2009) used ANN to forecast monthly precipitation in Jordan's dry regions, and found that ANN performed better than multivariate regression.

The P–M approach is the most effective physical method, while the ANN model is an accurate empirical method (Adamala & Srivastava 2018). The findings of the factor analysis were very helpful in revealing the relative significance of meteorological variables in explaining the variations in ET estimation (Lakshman & Kovoor 2006). Many researchers have used ANN for modeling the complex process of ET with minimum climatological data (Kim & Kim 2008; Traore *et al.* 2010; Diamantopoulou *et al.* 2011). Sudheer *et al.* (2003) adopted three options for input data to estimate ET_{0}, and in the more simplified option, ET_{0} was estimated as a function of average air temperature only. Zanetti *et al.* (2007) obtained satisfactory estimates of reference ET, by ANN, using only the maximum and minimum air temperatures as input for a province in Rio de Janeiro. Salami & Ehteshami (2016) used the ANN model to show the average rate of temperature variation. Genaidy (2020) found satisfactory performance for the ANN model with four inputs such as *T*_{max}, *T*_{min}, dew point data, and wind speed, with P–M ET as the target for a station in the province of Nubaria. Deshmukh (2016) used multiple input combinations to create ANN models for ET estimation in the Nagpur region of India, which has a hot and humid climate, and found that they performed well. For hot climatic conditions, ANN models were created using pan evaporation data by a few researchers (Keskin & Terzi 2006; Khoob 2008).

Since agriculture is the largest consumer of water, developing irrigation systems and managing water supplies depend on accurately estimating the ET process. Underestimation of ET results in plant moisture stress and a reduction in agricultural production, while overestimation wastes water, damages crops, and contaminates groundwater and soil. Given the complexity of agricultural systems, which deal with many factors, a nonlinear method is needed to interpret the relationships, which is possible with ANN (Gunathilake *et al.* 2021).

The P–M model's accurate prediction of ET has limited applicability because many of the parameters are not available for several stations. Hence, less data-intensive computational tools, such as ANN models, are highly valuable in precisely estimating ET_{0} for effective agricultural water management, and the climatic parameters for a region should be chosen based on the sensitivity of ET_{0} to each parameter in that region. Simpler ANN models can therefore be employed, as they are as accurate as the classic P–M model although they do not require as many parameters.

The P–M approach was used as the target in the current investigation to compute ET_{0}. ANN models were developed for ET_{0} estimation with all the climatic variables and with the crucial climatic variables identified by PCA.

## STUDY AREA AND DATA COLLECTION

The climatic regions are classified using the moisture index, where *I*_{m} indicates the moisture index, *s* indicates the water surplus, *d* indicates the water deficit, and ET_{0} indicates the potential ET. Table 1 lists the details of the 10 stations and the climatological classification.

Sl. no. | Station name | Longitude | Latitude | Altitude (m) | Climatic region |
---|---|---|---|---|---|
1 | Pattambi | 76°12′ | 10°48′ | 255 | Per-humid |
2 | Dharwad | 75°07′ | 15°26′ | 626 | Humid |
3 | Bangalore | 77°37′ | 13° | 914 | Moist sub-humid |
4 | Kovilpatti | 77°53′ | 9°12′ | 105 | Dry sub-humid |
5 | Rajahmundry | 81°46′ | 17° | 14 | Dry sub-humid (coastal) |
6 | Anakapalle | 83°01′ | 17°38′ | 29 | Dry sub-humid (coastal) |
7 | Annamalainagar | 79°44′ | 11°24′ | 6 | Dry sub-humid (coastal) |
8 | Sholapur | 75°54′ | 17°04′ | 479 | Semi-arid |
9 | Bellary | 76°51′ | 15°09′ | 449 | Semi-arid |
10 | Hyderabad | 78°16′ | 17°32′ | 505 | Semi-arid |


The meteorological data were obtained from the India Meteorological Department (IMD), Government of India for all the stations for a period of five years. For each station, the daily climatic data collected included maximum air temperature in °C (*T*_{max}), minimum air temperature in °C (*T*_{min}), maximum relative humidity in % (RH_{max}), minimum relative humidity in % (RH_{min}), actual sunshine hours, 24-h wind speed, rainfall, and pan evaporation depth. Site details such as latitude and longitude of the stations, altitude above mean sea level, and wind speed measurement height were also obtained from the IMD, Pune.

## METHODOLOGY

ANN models were developed to estimate ET_{0} for the 10 stations from various climatic zones, with all climatic variables and with the climatic variables identified by PCA. The ability of ANN techniques to solve nonlinear systems with fewer inputs is a key advantage over traditional ones. If correctly trained, the ANN approach is well suited to modeling ET since it is quicker to establish the relation than the well-known P–M method, which requires a large amount of data that may not be accessible for many Indian stations. The methodology adopted is provided in Figure 1.

### Estimation of ET by the P–M model

ET_{0} estimations from simpler ANN models were evaluated and compared with the values produced using the P–M approach, which is the most accurate ET estimation. The P–M method is a physically based strategy that takes advantage of the surface's physiological and aerodynamic features. It employs a reference surface, which is a reference crop with 0.12 m crop height, 70 s m^{−1} constant surface resistance, and 0.23 albedo (Allen *et al.* 1998). The detailed calculation procedure given by Allen *et al.* (1998) is as follows:

$$\mathrm{ET}_{0} = \frac{0.408\,\Delta\,(R_{n} - G) + \gamma\,\dfrac{900}{T + 273}\,U_{2}\,(e_{s} - e_{a})}{\Delta + \gamma\,(1 + 0.34\,U_{2})}$$

where ET_{0} indicates potential evapotranspiration (mm/day); *R*_{n} indicates net radiation at the crop surface (MJ m^{−2} day^{−1}); *G* indicates soil heat flux density (MJ m^{−2} day^{−1}); *T* indicates mean daily air temperature at 2 m height (°C); *U*_{2} indicates wind speed at 2 m height (m s^{−1}); *e*_{s} indicates saturation vapor pressure (kPa); *e*_{a} indicates actual vapor pressure (kPa); (*e*_{s} – *e*_{a}) indicates saturation vapor pressure deficit (kPa); Δ indicates slope of vapor pressure curve (kPa °C^{−1}); γ indicates psychrometric constant (kPa °C^{−1}).
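As a rough illustration of the P–M calculation described in this section, the sketch below evaluates the FAO-56 form of the equation for a single day. The function names, the altitude-based estimate of the psychrometric constant, and the input values are assumptions made for the example; they are not taken from the study's own implementation.

```python
import math


def saturation_vapor_pressure(temp_c: float) -> float:
    """Saturation vapor pressure (kPa) at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def penman_monteith_et0(rn, g, t_mean, u2, es, ea, altitude_m=0.0):
    """Daily reference ET (mm/day) from the FAO-56 Penman-Monteith equation.

    rn, g : net radiation and soil heat flux density (MJ m-2 day-1)
    t_mean: mean daily air temperature at 2 m (deg C)
    u2    : wind speed at 2 m (m/s); a value in km/day must be divided by 86.4
    es, ea: saturation and actual vapor pressure (kPa)
    """
    # Slope of the saturation vapor pressure curve (kPa per deg C)
    delta = 4098.0 * saturation_vapor_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Atmospheric pressure (kPa) estimated from altitude (an assumption of this
    # sketch), then the psychrometric constant (kPa per deg C)
    pressure = 101.3 * ((293.0 - 0.0065 * altitude_m) / 293.0) ** 5.26
    gamma = 0.000665 * pressure
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


# Illustrative inputs only (not data from the 10 stations)
et0_example = penman_monteith_et0(rn=13.28, g=0.0, t_mean=16.9, u2=2.078,
                                  es=1.997, ea=1.409, altitude_m=100.0)
```

With these illustrative inputs the function returns about 3.9 mm/day. Wind speed measured at a height other than 2 m would need a height adjustment before being passed in, which this sketch leaves out.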

### Principal component analysis

Principal component analysis (PCA) is a statistical process for converting a large number of potentially correlated variables into a smaller number of uncorrelated variables known as principal components. It simplifies a dataset by reducing multidimensional data to fewer dimensions for analysis. The first principal component accounts for as much of the variability in the data as possible, and each subsequent component accounts for as much of the remaining variability as possible. Eigen analysis is the mathematical approach utilized in PCA: the eigenvalues and eigenvectors of a square symmetric matrix containing sums of squares and cross products are computed. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component, and the eigenvector associated with the second largest eigenvalue determines the direction of the second principal component. PCA can thus be used for dimensionality reduction in a dataset while retaining those characteristics of the dataset that contribute most to its variance.
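The eigen analysis described above can be sketched in a few lines of NumPy. This is only a minimal illustration: it assumes the correlation matrix is the symmetric matrix being decomposed (a common choice when variables have different units), it uses synthetic data rather than station records, and it omits the varimax rotation step.

```python
import numpy as np


def pca_eigen(data: np.ndarray):
    """PCA by eigen-decomposition of the correlation matrix.

    data has shape (n_days, n_variables). Returns the eigenvalues in
    descending order, the matching eigenvectors as columns, and the
    fraction of total variance explained by each component.
    """
    standardized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.cov(standardized, rowvar=False)   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, explained


# Synthetic example: two strongly related variables and one independent one
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
synthetic = np.hstack([
    base + 0.1 * rng.normal(size=(200, 1)),
    -base + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])
eigvals, eigvecs, explained = pca_eigen(synthetic)
```

The first column of `eigvecs` gives the direction of the first principal component, and `explained` shows how much of the total variance each component carries.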

### Sensitivity analysis

Sensitivity analysis identifies the parameter(s) that have the biggest influence on the model's output (Omar *et al.* 2021). It is the process of varying model input parameters over a reasonable range and observing the relative change in model response. Sensitivity analysis was carried out to find which climatic variables are more sensitive in the ET equation. Sensitivity can be considered as the absolute change in ET_{0} results for a positive or negative change in different variables individually. Each of the climatic variables was varied in turn over a specified range while keeping the other variables constant, and its impact on the ET_{0} estimate was assessed. Percentage changes in ET_{0} with percentage changes in different variables were plotted to identify the more sensitive parameters. The relative importance of variables on ET_{0} and the impact of estimation errors due to individual variables were also analyzed. The results of sensitivity analysis will also be highly beneficial for determining the direction of future data collection activities.
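The one-variable-at-a-time procedure described above might be sketched as follows. The helper name and the toy model are hypothetical and stand in for the P–M equation only to keep the example short; any function returning ET_{0} from keyword inputs could be passed instead.

```python
from typing import Callable, Dict


def one_at_a_time_sensitivity(model: Callable[..., float],
                              baseline: Dict[str, float],
                              change: float = 0.25) -> Dict[str, float]:
    """Percentage change in the model output when each input is raised by
    `change` (a fraction) while all other inputs stay at baseline values."""
    base_output = model(**baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + change)
        result[name] = 100.0 * (model(**perturbed) - base_output) / base_output
    return result


# Hypothetical toy model, NOT the P-M equation: the output rises with
# net radiation (rn) and falls with relative humidity (rh, in %)
def toy_model(rn: float, rh: float) -> float:
    return 0.3 * rn * (1.0 - rh / 100.0)


sens = one_at_a_time_sensitivity(toy_model, {"rn": 12.0, "rh": 60.0})
```

For this toy model a 25% increase in `rn` raises the output by 25%, while a 25% increase in `rh` lowers it by 37.5%, which mirrors the positive and negative signs reported for the climatic variables.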

### Estimation of ET by the ANN model

ANN can be successfully utilized to predict the complex process of ET using minimal climatological data. The neural network learns naturally and incrementally in the course of processing. The learning process is carried out by a learning algorithm, by modifying the weights of interconnections of the network with known inputs and outputs. Later these weights are used to produce the desired output for a given input pattern.

#### Backpropagation algorithm

For inputs *x*_{i}, the net input to neuron *j* in the hidden layer is the weighted sum

$$\sum_{i} w_{ij}\,x_{i}$$

where *w*_{ij} indicates the interconnecting weight between neuron *j* in the hidden layer and neuron *i* in the input layer.

The ANN toolbox in MATLAB (The MathWorks Inc. 2003) was used for the study. The neural network procedure used in the present study was a feed-forward network trained with the backpropagation algorithm, in which the flow of information is from the input layer to the output layer. Mean square error regression analysis was used to test the agreement between P–M ET_{0} and ANN ET_{0}. Training was done by varying the number of neurons until a good agreement was obtained. After training was over, the network was frozen and the validation data were fed into the system to make predictions. The performance of the network during validation was analyzed using various statistical procedures to assess the generalization properties of the trained network.
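To make the training procedure more concrete, the following NumPy sketch trains a feed-forward network with one hidden layer by backpropagation on the mean square error. It is not the MATLAB toolbox routine used in the study: the tanh activation, plain gradient-descent update, learning rate, and synthetic data are all assumptions chosen for illustration.

```python
import numpy as np


def train_one_hidden_layer(x, y, n_hidden=10, lr=0.05, epochs=2000, seed=0):
    """Train a one-hidden-layer feed-forward network with backpropagation.

    x: (n_samples, n_inputs) normalized inputs; y: (n_samples,) targets.
    Returns the weights and the mean square error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = x.shape
    w1 = rng.normal(scale=0.5, size=(n_inputs, n_hidden))  # input -> hidden
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(scale=0.5, size=n_hidden)              # hidden -> output
    b2 = 0.0
    history = []
    for _ in range(epochs):
        # Forward pass: weighted sums, tanh hidden layer, linear output
        hidden = np.tanh(x @ w1 + b1)
        pred = hidden @ w2 + b2
        err = pred - y
        history.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean square error
        grad_pred = 2.0 * err / n_samples
        grad_w2 = hidden.T @ grad_pred
        grad_b2 = grad_pred.sum()
        grad_hidden = np.outer(grad_pred, w2) * (1.0 - hidden ** 2)
        grad_w1 = x.T @ grad_hidden
        grad_b1 = grad_hidden.sum(axis=0)
        # Gradient-descent update of all weights and biases
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return (w1, b1, w2, b2), history


def predict(weights, x):
    """Run the frozen network on new inputs."""
    w1, b1, w2, b2 = weights
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Synthetic demonstration data (not station data)
rng = np.random.default_rng(1)
x_demo = rng.uniform(-1.0, 1.0, size=(200, 3))
y_demo = 0.5 * x_demo[:, 0] - 0.3 * x_demo[:, 1] + 0.2 * x_demo[:, 2] ** 2
weights, mse_history = train_one_hidden_layer(x_demo, y_demo)
```

Repeating the call with different `n_hidden` values and comparing the validation error is one simple way to mimic the search over the number of neurons.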

### Performance evaluation

The performance of the developed models was evaluated using statistics like the coefficient of determination (*R*^{2}), standard error of estimate (SEE), standard deviation (SD), and efficiency values (%). The statistical test of significance used for the study was *z*-test.

*R*^{2} measures the degree to which two variables are linearly related and a value of *R*^{2} close to unity indicates a high degree of association between the two variables.

*Model efficiency* (%): The difference between the ET estimated by the P–M method and each of the other methods is calculated. If the difference is less than 0.5 mm/day, the estimate is good. The number of occasions on which the difference is less than 0.5 mm/day, expressed as a percentage of the total number of observations, gives the model efficiency (Mohan 1988).
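The efficiency criterion and *R*^{2} can be computed as in the sketch below. The function names and the five sample values are illustrative assumptions, and *R*^{2} is taken here as the squared Pearson correlation between the two series.

```python
import numpy as np


def model_efficiency(et_pm, et_model, tolerance=0.5):
    """Percentage of days on which |ET_model - ET_PM| is below `tolerance`
    (mm/day)."""
    diff = np.abs(np.asarray(et_model, dtype=float)
                  - np.asarray(et_pm, dtype=float))
    return 100.0 * np.mean(diff < tolerance)


def coefficient_of_determination(et_pm, et_model):
    """R^2 taken as the squared Pearson correlation of the two series."""
    r = np.corrcoef(et_pm, et_model)[0, 1]
    return r ** 2


# Illustrative values only (mm/day), not station data
pm = [3.2, 4.1, 5.0, 2.8, 3.9]
ann = [3.4, 4.0, 5.7, 2.9, 3.6]
eff = model_efficiency(pm, ann)
r2 = coefficient_of_determination(pm, ann)
```

In this small example four of the five differences are below 0.5 mm/day, so the efficiency is 80%.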

*Test for significance:* The *z*-test was used to test whether there is a significant difference between ET_{0} values estimated by the P–M method and the ET_{0} values estimated by the ANN model. The assumptions used in this test include: samples are independent and random in nature; populations are normally distributed or if not normal, can be approximated by a normal distribution (*n*_{1} ≥ 30 and *n*_{2} ≥ 30).

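The *z*-test statistic is defined as follows:

$$z = \frac{\bar{x}_{1} - \bar{x}_{2}}{\sqrt{\dfrac{\sigma_{1}^{2}}{n_{1}} + \dfrac{\sigma_{2}^{2}}{n_{2}}}}$$

where $\bar{x}_{1}$ indicates the mean ET_{0} estimate by the P–M method (Sample 1); $\bar{x}_{2}$ indicates the mean ET_{0} estimate by the developed model (Sample 2); *n*_{1} and *n*_{2} are sample size 1 and sample size 2, respectively; and $\sigma_{1}$ and $\sigma_{2}$ are standard deviations of Samples 1 and 2.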
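A two-sample *z*-test of this kind can be sketched with the Python standard library alone. The function name, the two-sided p-value, and the synthetic samples are assumptions for illustration; the source does not state the significance level used, so the conventional 5% threshold (|z| > 1.96) is mentioned only as an example.

```python
import math
from statistics import NormalDist, mean, stdev


def two_sample_z(sample1, sample2):
    """Two-sample z statistic and two-sided p-value for a difference in means.

    z = (mean1 - mean2) / sqrt(sd1^2 / n1 + sd2^2 / n2)
    """
    m1, m2 = mean(sample1), mean(sample2)
    s1, s2 = stdev(sample1), stdev(sample2)
    n1, n2 = len(sample1), len(sample2)
    z = (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Synthetic samples (mm/day), each with n >= 30 as the test assumes
sample_pm = [3.0 + 0.1 * (i % 10) for i in range(40)]
sample_shifted = [value + 1.0 for value in sample_pm]

z_same, p_same = two_sample_z(sample_pm, sample_pm)
z_shift, p_shift = two_sample_z(sample_pm, sample_shifted)
```

Identical samples give z = 0, whereas the sample shifted by 1 mm/day gives a large negative z, which would indicate a significant difference at the 5% level.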

## RESULTS AND DISCUSSIONS

### Estimation of ET by the P–M model

The availability of climatic data is an important criterion in selecting an appropriate method. Depending on the method, different datasets are required, ranging from the Pan method, which just requires information on pan evaporation, to the P–M method, which requires a large amount of meteorological data. A suitable method that results in a fairly accurate estimation of ET_{0} should be found under a data-short environment.

The daily values of ET_{0} were calculated by the P–M method for each of the 10 stations. The average value of estimates, obtained by averaging ET_{0} values across the period of record, varied from a low value of 3.09 mm/day at the coastal area of Anakapalle to a high of 4.58 mm/day at the semi-arid location of Bellary. The annual average ET_{0} values also showed similar trends with a maximum value of ET_{0} for Bellary (1,680 mm) and a minimum value for Anakapalle (1,136 mm) as per P–M estimate.

### Principal component analysis

A study on PCA was performed in order to gain a deeper insight into the relative influence of different meteorological variables on ET_{0} under different climatic conditions and to find the most critical climatic variables for each station considered. Daily data on net radiation (*R*_{n}, MJ m^{−2} day^{−1}), maximum temperature (*T*_{max}, °C), minimum temperature (*T*_{min}, °C), average RH (%), wind velocity (*U*, km day^{−1}), and ratio of sunshine hours (*n/N*) were considered for PCA. Rotated factor loadings were examined to provide useful quantitative information. In the analysis, the principal components were first identified and then the factors were derived using varimax rotation using SPSS software (Shaffer *et al.* 1999). From the PCA carried out, it was observed that radiation is the most important climatic variable for ET_{0} estimation, followed by RH/temperature and wind speed, as evident from Table 2.

Sl. no. | Station name | Important climatic variables | Climatic region |
---|---|---|---|
1 | Pattambi | Temperature, wind, radiation | Per-humid |
2 | Dharwad | Radiation, relative humidity | Humid |
3 | Bangalore | Radiation, relative humidity, wind | Moist sub-humid |
4 | Kovilpatti | Radiation, relative humidity, wind | Dry sub-humid |
5 | Rajahmundry | Radiation, relative humidity, wind | Dry sub-humid (coastal) |
6 | Anakapalle | Radiation, relative humidity, wind | Dry sub-humid (coastal) |
7 | Annamalainagar | Radiation, relative humidity, wind | Dry sub-humid (coastal) |
8 | Sholapur | Radiation, relative humidity, wind | Semi-arid |
9 | Bellary | Radiation, relative humidity, wind | Semi-arid |
10 | Hyderabad | Radiation, relative humidity, wind | Semi-arid |


### Sensitivity analysis

Sensitivity analysis was carried out to assess the influence of climatic variables on the ET equation and the percentage changes in ET_{0} for a 25% change in various climatic variables. The results indicated that the relationship between daily ET_{0} and climatic variables is generally linear, except for the relationship with temperature where slight non-linearity was observed in a few cases.

The spatial analysis showed that *R*_{n} and RH_{mean} were dominant for the per-humid Pattambi station. The two variables RH_{mean} and *T*_{max} were found to be the dominant variables for ET_{0} estimates in the humid Dharwad station, with high influence during the monsoon season. For the moist sub-humid station Bangalore, *R*_{n} and RH_{mean} were found to be most influential, followed by *T*_{max}. In the dry sub-humid station, *R*_{n} was the single most important climatic variable. *T*_{max} was observed as the most influential weather factor for the coastal Rajahmundry station, whereas *R*_{n} was found to be dominant in the other coastal station, Anakapalle. *T*_{max} and RH_{mean} were found to be the sensitive variables for Annamalainagar and Kovilpatti stations. ET_{0} estimates were found to be highly sensitive to the variation in *T*_{max} for the semi-arid stations of Bellary, Hyderabad, and Sholapur. The net radiation at the reference surface is the energy source that drives away moisture as ET_{0}, and at the same time, it also affects other climatic variables. ET_{0} is sensitive to temperature, with which it is positively correlated, and to RH_{mean}, with which it is negatively correlated. Wind speed is also important when *R*_{n} is high and the air is relatively dry.

ET_{0} was most sensitive to net radiation (*R*_{n}) and RH in the humid climate, whereas in the hot and dry climate, maximum temperature (*T*_{max}) and wind speed become more influential. On the other hand, seasonal analysis indicates that RH was dominant in the monsoon season for all the stations, whereas *T*_{max} and *R*_{n} had maximum influence in the summer. Also from the factor analysis, it was observed that net radiation and RH are the most important climatic variables for determining ET, followed by maximum temperature and wind speed. The outcome of sensitivity analysis for various stations is shown in Figure 2.

The sensitivity is expressed quantitatively in Table 3 as percentage change in ET_{0} for a 25% increase in the climatic variables. The overall average changes in ET_{0} values for a 25% change in the climatic variables were 18, 16, 14, 7, 5, and 4% for *T*_{max}, RH_{mean}, *R*_{n}, wind speed, *T*_{min}, and sunshine hours, respectively. Overall, ET_{0} was found to be sensitive to radiation (*R*_{n}) for all seasons. Mean relative humidity (RH_{mean}) has a high influence but there were seasonal fluctuations. *T*_{max} was also found to be sensitive to seasonal variations, followed by wind speed. The results are similar to the results of the sensitivity analysis of Meyer *et al.* (1989) using the original Penman equation as a target, in which RH and *R*_{n} were found to have profound effects on the calculated ET_{0} value, whereas wind speed was less critical.

Station | Season | *T*_{max} (°C) | *T*_{min} (°C) | RH_{mean} (%) | Wind speed (km day^{−1}) | Sunshine hours | Radiation (MJ m^{−2} day^{−1}) |
---|---|---|---|---|---|---|---|
Change in variable | All seasons | 25% | 25% | 25% | 25% | 25% | 25% |
Pattambi | Winter | 14 | −8 | −13 | 8 | 3 | 15 |
Pattambi | Summer | 11 | −4 | −10 | 7 | 5 | 15 |
Pattambi | Monsoon | 14 | −4 | −20 | 4 | 4 | 17 |
Dharwad | Winter | 27 | −7 | −20 | 9 | 3 | 10 |
Dharwad | Summer | 17 | −6 | −11 | 8 | 4 | 14 |
Dharwad | Monsoon | 31 | −5 | −67 | 1 | 4 | 18 |
Bangalore | Winter | 13 | −6 | −13 | 8 | 3 | 13 |
Bangalore | Summer | 5 | −3 | −5 | 6 | 7 | 18 |
Bangalore | Monsoon | 22 | −4 | −26 | 7 | 1 | 12 |
Kovilpatti | Winter | 12 | −7 | −13 | 7 | 4 | 17 |
Kovilpatti | Summer | 9 | −5 | −7 | 5 | 7 | 18 |
Kovilpatti | Monsoon | 40 | −11 | −38 | 8 | 1 | 10 |
Rajahmundry | Winter | 17 | −8 | −17 | 7 | 3 | 15 |
Rajahmundry | Summer | 17 | −5 | −12 | 8 | 5 | 13 |
Rajahmundry | Monsoon | 28 | −10 | −15 | 4 | 4 | 18 |
Anakapalle | Winter | 9 | −5 | −9 | 9 | 2 | 13 |
Anakapalle | Summer | 12 | −4 | −11 | 6 | 7 | 17 |
Anakapalle | Monsoon | 7 | −4 | −8 | 5 | 5 | 18 |
Annamalainagar | Winter | 18 | −6 | −24 | 5 | 5 | 16 |
Annamalainagar | Summer | 17 | −6 | −13 | 6 | 6 | 15 |
Annamalainagar | Monsoon | 25 | −5 | −20 | 9 | 3 | 10 |
Bellary | Winter | 10 | −6 | −11 | 8 | 4 | 16 |
Bellary | Summer | 9 | −3 | −3 | 11 | 4 | 11 |
Bellary | Monsoon | 31 | −1 | −22 | 10 | 1 | 6 |
Sholapur | Winter | 14 | −4 | −3 | 11 | 1 | 10 |
Sholapur | Summer | 14 | −3 | −2 | 11 | 3 | 10 |
Sholapur | Monsoon | 26 | −8 | −14 | 4 | 3 | 17 |
Hyderabad | Winter | 15 | −5 | −11 | 10 | 2 | 12 |
Hyderabad | Summer | 18 | −4 | −7 | 11 | 4 | 10 |
Hyderabad | Monsoon | 25 | −4 | −32 | 6 | 4 | 12 |
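The overall averages quoted in the text (18, 16, 14, 7, 5, and 4%) can be checked directly against the Table 3 entries. The short script below is such a check; it assumes that the averages are taken over the absolute values of all 30 station-season rows, which is an interpretation rather than something the text states.

```python
# Table 3 rows in the order (T_max, T_min, RH_mean, wind speed,
# sunshine hours, radiation); three seasons per station
table3 = [
    (14, -8, -13, 8, 3, 15), (11, -4, -10, 7, 5, 15), (14, -4, -20, 4, 4, 17),     # Pattambi
    (27, -7, -20, 9, 3, 10), (17, -6, -11, 8, 4, 14), (31, -5, -67, 1, 4, 18),     # Dharwad
    (13, -6, -13, 8, 3, 13), (5, -3, -5, 6, 7, 18), (22, -4, -26, 7, 1, 12),       # Bangalore
    (12, -7, -13, 7, 4, 17), (9, -5, -7, 5, 7, 18), (40, -11, -38, 8, 1, 10),      # Kovilpatti
    (17, -8, -17, 7, 3, 15), (17, -5, -12, 8, 5, 13), (28, -10, -15, 4, 4, 18),    # Rajahmundry
    (9, -5, -9, 9, 2, 13), (12, -4, -11, 6, 7, 17), (7, -4, -8, 5, 5, 18),         # Anakapalle
    (18, -6, -24, 5, 5, 16), (17, -6, -13, 6, 6, 15), (25, -5, -20, 9, 3, 10),     # Annamalainagar
    (10, -6, -11, 8, 4, 16), (9, -3, -3, 11, 4, 11), (31, -1, -22, 10, 1, 6),      # Bellary
    (14, -4, -3, 11, 1, 10), (14, -3, -2, 11, 3, 10), (26, -8, -14, 4, 3, 17),     # Sholapur
    (15, -5, -11, 10, 2, 12), (18, -4, -7, 11, 4, 10), (25, -4, -32, 6, 4, 12),    # Hyderabad
]
names = ["T_max", "T_min", "RH_mean", "wind", "sunshine", "radiation"]

# Mean absolute percentage change per variable, rounded to whole percent
averages = {
    name: round(sum(abs(row[i]) for row in table3) / len(table3))
    for i, name in enumerate(names)
}
```

Under that assumption the rounded averages come out as 18 (*T*_{max}), 5 (*T*_{min}), 16 (RH_{mean}), 7 (wind speed), 4 (sunshine hours), and 14 (radiation), which agrees with the figures in the text.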


### Estimation of ET by the ANN model

ANN models were developed for two cases. In the first case, the inputs were all the climatic variables, namely net radiation (*R*_{n}, MJ m^{−2} day^{−1}), maximum temperature (*T*_{max}, °C), minimum temperature (*T*_{min}, °C), average relative humidity (RH_{mean}, %), wind velocity (*U*, km day^{−1}), and ratio of daily sunshine hours (*n/N*), and the output was ET_{0}. In the second case, important climatic variables identified by PCA for the respective stations (Table 2) alone were used for ANN model development. ET_{0} estimated by the P–M method (Allen *et al.* 1998) was used as the target output. Normalized daily data for a period of four years were used for the model development and one year was used for model testing. The ANN network for ET_{0} estimation with all climatic variables is shown in Figure 3.

The average ET_{0} for the period varied from 3.09 mm/day at Anakapalle to 4.46 mm/day at Bellary for ANN with all climatic variables and 3.05–4.40 mm/day for ANNs with PCA variables (Table 4). A similar study by Jain *et al.* (2008) found that an ANN can accurately estimate ET even with few climate factors. Comparisons of ET estimates by P–M and ANN models are shown in Figure 4. ET_{0} estimates by the ANN model were very close to the ET_{0} estimates by the P–M model.
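The data preparation mentioned in this section (normalized daily data, four years for model development and one year for testing) might look like the sketch below. Min-max scaling, a chronological split, and 365-day years are assumptions here, since the text does not specify the normalization or how the years were divided; the records are synthetic.

```python
import numpy as np


def min_max_normalize(train: np.ndarray, test: np.ndarray):
    """Scale each column to [0, 1] using the training minimum and maximum,
    so that no information from the test year leaks into the scaling."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (train - lo) / span, (test - lo) / span


days_per_year = 365
rng = np.random.default_rng(0)
# Synthetic stand-in for five years of daily records with six variables
records = rng.uniform(0.0, 40.0, size=(5 * days_per_year, 6))

train_raw = records[: 4 * days_per_year]   # first four years: development
test_raw = records[4 * days_per_year:]     # final year: testing
train, test = min_max_normalize(train_raw, test_raw)
```

Test values can fall slightly outside [0, 1] when the final year contains extremes not seen in the development period, which is expected with this scheme.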

Station | P–M (mm/day) | ANN (mm/day) | ANN (PCA) (mm/day) |
---|---|---|---|
Pattambi | 3.29 | 3.36 | 3.36 |
Dharwad | 3.52 | 3.50 | 3.50 |
Bangalore | 3.75 | 3.80 | 3.74 |
Kovilpatti | 4.29 | 4.25 | 4.27 |
Rajahmundry | 3.28 | 3.20 | 3.17 |
Anakapalle | 3.09 | 3.09 | 3.05 |
Annamalainagar | 3.81 | 3.76 | 3.74 |
Sholapur | 4.13 | 4.03 | 4.02 |
Bellary | 4.58 | 4.46 | 4.40 |
Hyderabad | 4.22 | 4.13 | 4.08 |


### Performance of ANN models for various agro-climatic regions

PCA is a multivariate statistical technique that can be used to reduce the complexity of the input variables, because most of the variability is contained in the first few principal components. Although the P–M model is the best choice when all climatic data are available, the ANN with principal components offers an alternative for locations where full climatic data are not accessible. The ANN with all variables is the ANN model in which all eight climatic variables are used for the computation of ET_{0}. The ANN with PCA variables is the ANN model in which only the climatic variables identified by PCA for the respective station are used for the computation of ET_{0}.

ET_{0} values were ranked based on *R*^{2}, SEE, percentage efficiency (%), and *z*-test values in comparison with the P–M estimate of ET_{0}. Tables 5 and 6 show the applicability of the developed ANN models for different agro-climatic regions based on the calibration and validation statistics. They also give the optimum number of neurons for ET_{0} estimation at each climatic station, together with the performance indices, for the models with all variables and with PCA variables. Increasing the number of neurons increased the efficiency of the model in terms of the coefficient of determination (*R*^{2}) and reduced the errors, including the root mean square error and the standard error of the estimate.
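The comparison statistics named above can be illustrated with a short sketch. The exact formulas used in the study are not given in this section, so the definitions below are assumptions: *R*^{2} is taken as the squared Pearson correlation, SEE as the square root of the residual sum of squares divided by *n* − 2, and *z* as a two-sample statistic for the difference between the P–M and ANN means. Percentage efficiency is left out because its definition is not stated here. The function names and the sample values are hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def r_squared(ref, est):
    """Squared Pearson correlation between reference and estimate (assumed form)."""
    mr, me = mean(ref), mean(est)
    sxy = sum((r - mr) * (e - me) for r, e in zip(ref, est))
    sxx = sum((r - mr) ** 2 for r in ref)
    syy = sum((e - me) ** 2 for e in est)
    return sxy * sxy / (sxx * syy)


def see(ref, est):
    """Standard error of estimate: sqrt(sum of squared residuals / (n - 2))."""
    sse = sum((r - e) ** 2 for r, e in zip(ref, est))
    return math.sqrt(sse / (len(ref) - 2))


def z_statistic(ref, est):
    """Two-sample z statistic for the difference of means (reference minus estimate)."""
    se = math.sqrt(sample_sd(ref) ** 2 / len(ref) + sample_sd(est) ** 2 / len(est))
    return (mean(ref) - mean(est)) / se


# Made-up daily ET0 values in mm/day, for illustration only.
pm_et0 = [3.1, 3.6, 4.2, 4.8, 5.0, 4.4, 3.9, 3.3]
ann_et0 = [3.0, 3.7, 4.1, 4.9, 4.8, 4.5, 3.8, 3.4]

r2 = r_squared(pm_et0, ann_et0)
err = see(pm_et0, ann_et0)
z = z_statistic(pm_et0, ann_et0)
significant = abs(z) > 1.96  # two-tailed critical value at the 0.05 level

print(f"R2 = {r2:.3f}, SEE = {err:.3f} mm/day, z = {z:.3f}, significant = {significant}")
```

The 1.96 threshold agrees with the flags in Tables 5 and 6, where *z* = 2.00 is marked significant and *z* = 1.9 is not.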

Sl. no. | Station name | No. of neurons | Calibration *R*^{2} | Calibration SEE | Calibration SD | Calibration % | Calibration *z* | Validation *R*^{2} | Validation SEE | Validation SD | Validation % | Validation *z*
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | Pattambi | 20 | 0.999 | 0.02 | 0.71 | 100 | 0.01 | 0.988 | 0.36 | 0.77 | 0.96 | −1.33
2 | Dharwad | 20 | 0.999 | 0.03 | 1.26 | 100 | 0.004 | 0.998 | 0.16 | 1.29 | 100 | 0.54
3 | Bangalore | 20 | 0.996 | 0.07 | 0.86 | 100 | 0.06 | 0.985 | 0.34 | 0.92 | 81 | −1.64
4 | Kovilpatti | 15 | 0.999 | 0.04 | 1.36 | 100 | 0.01 | 0.993 | 0.29 | 1.05 | 94 | 1.02
5 | Rajahmundry | 15 | 0.998 | 0.05 | 0.98 | 100 | 0.02 | 0.99 | 0.39 | 0.82 | 92 | 1.6
6 | Anakapalle | 15 | 0.998 | 0.04 | 0.81 | 100 | 0.02 | 0.99 | 0.13 | 0.84 | 100 | 0.04
7 | Annamalainagar | 15 | 0.996 | 0.11 | 1.13 | 100 | 0.11 | 0.994 | 0.23 | 1.09 | 98 | 0.92
8 | Sholapur | 10 | 0.988 | 0.22 | 1.42 | 96 | 0.07 | 0.987 | 0.48 | 1.15 | 80 | 1.69
9 | Bellary | 15 | 0.996 | 0.14 | 1.54 | 99 | 0.02 | 0.988 | 0.71 | 1.20 | 38 | 2.26*
10 | Hyderabad | 25 | 0.999 | 0.06 | 1.61 | 100 | 0.02 | 0.96 | 0.67 | 1.78 | 52 | 1.31


*Significant at 0.05 level.

Sl. no. | Station name | No. of neurons | Calibration *R*^{2} | Calibration SEE | Calibration SD | Calibration % | Calibration *z* | Validation *R*^{2} | Validation SEE | Validation SD | Validation % | Validation *z*
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | Pattambi | 10 | 0.982 | 0.13 | 0.72 | 99 | −0.03 | 0.977 | 0.37 | 0.63 | 96 | −1.43
2 | Dharwad | 10 | 0.982 | 0.24 | 1.23 | 95 | 0.02 | 0.965 | 0.36 | 1.34 | 75 | 0.46
3 | Bangalore | 10 | 0.984 | 0.16 | 0.82 | 100 | 0.22 | 0.978 | 0.17 | 0.83 | 100 | 0.00
4 | Kovilpatti | 5 | 0.978 | 0.28 | 1.29 | 90 | 0.09 | 0.97 | 0.39 | 1.03 | 73 | 1.09
5 | Rajahmundry | 5 | 0.989 | 0.15 | 0.96 | 100 | 0.07 | 0.99 | 0.48 | 0.75 | 67 | 1.9
6 | Anakapalle | 6 | 0.99 | 0.10 | 0.79 | 100 | 0.10 | 0.989 | 0.19 | 0.81 | 100 | 0.91
7 | Annamalainagar | 9 | 0.996 | 0.10 | 1.16 | 100 | 0.01 | 0.992 | 0.36 | 1.0 | 87 | 1.56
8 | Sholapur | 5 | 0.988 | 0.22 | 1.41 | 96 | 0.05 | 0.98 | 0.57 | 1.18 | 50 | 2.00*
9 | Bellary | 7 | 0.993 | 0.17 | 1.53 | 99 | 0.01 | 0.98 | 0.97 | 1.25 | 8 | 3.32*
10 | Hyderabad | 10 | 0.988 | 0.25 | 1.57 | 0.94 | 0.03 | 0.97 | 0.83 | 1.44 | 33 | 2.37*


*Significant at 0.05 level.

The ANN models performed well at most stations, except those in semi-arid climatic conditions. All models performed better in training than in testing. For coastal, per-humid, and humid climatic stations, performance was high during both the training and testing periods. The poor performance during testing at the semi-arid stations indicates the complexity of ET modelling there and calls into question the applicability of these models to that climate. In dry climatic regions, the fluctuations in climatic variables within a day are much higher than in humid regions, so daily weather data may not account accurately for the hourly fluctuations. This may be the reason for the poor performance of the ET equations in semi-arid climatic regions. A similar observation was made by Lakshman & Kovoor (2006).

The ET_{0} obtained from the ANN models was compared with the standard P–M model for the different agro-climatic regions and rated using the linguistic categories in Table 7: very good, good, low, and poor. Performance decreased from coastal and humid regions to semi-arid regions. In dry climatic regions, within-day fluctuations in climatic variables are much higher than in humid regions, and daily weather data may not capture them accurately, which leads to the poor performance of the simplified ET models in drier regions.

Sl. No. | Criterion | VG | G | L | P
---|---|---|---|---|---
1 | *R*^{2} | >0.95 | 0.95–0.60 | 0.60–0.50 | <0.50
2 | SEE | <0.30 | 0.30–0.50 | 0.50–0.70 | >0.70
3 | Percentage efficiency (%) | >95% | 95–70% | 70–50% | <50%
4 | *z*-test (95% confidence interval) | Not significant | Not significant | Not significant | Significant


VG, very good; G, good; L, low; P, poor.
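The criteria in Table 7 can be read as a simple rating rule, sketched below. This section does not state how the four criteria are combined into a single rating, so the sketch assumes the overall rating is the worst of the individual ratings. The treatment of values that fall exactly on a class boundary is also an assumption, and the function names are hypothetical.

```python
# Ratings ordered from best to worst, following Table 7.
ORDER = ["VG", "G", "L", "P"]


def rate_r2(r2):
    if r2 > 0.95:
        return "VG"
    if r2 >= 0.60:
        return "G"
    if r2 >= 0.50:
        return "L"
    return "P"


def rate_see(see):
    if see < 0.30:
        return "VG"
    if see <= 0.50:
        return "G"
    if see <= 0.70:
        return "L"
    return "P"


def rate_efficiency(percent):
    if percent > 95:
        return "VG"
    if percent >= 70:
        return "G"
    if percent >= 50:
        return "L"
    return "P"


def rate_z(significant):
    # Table 7 separates only "significant" (P) from "not significant".
    return "P" if significant else "VG"


def overall_rating(r2, see, percent, z_significant):
    """Assumed combination rule: the worst individual rating decides."""
    ratings = [
        rate_r2(r2),
        rate_see(see),
        rate_efficiency(percent),
        rate_z(z_significant),
    ]
    return max(ratings, key=ORDER.index)


# Validation statistics for Bellary, ANN with all variables (Table 5).
print(overall_rating(0.988, 0.71, 38, True))
# Validation statistics for Dharwad, ANN with all variables (Table 5).
print(overall_rating(0.998, 0.16, 100, False))
```

Under this assumed rule, a high *R*^{2} alone does not give a good rating: the Bellary validation run has *R*^{2} = 0.988 but is rated by its SEE, efficiency, and *z*-test result.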

According to the criteria, the ANN with all variables had very good performance, and ANN with PCA variables had good performance for coastal, per-humid, and humid climatic conditions. For moist sub-humid climates, both ANN models had good performance, whereas dry sub-humid climates had low and semi-arid climates had poor performance for both the ANN models (Table 8).

Method | Coastal | Per-humid | Humid | Moist sub-humid | Dry sub-humid | Semi-arid
---|---|---|---|---|---|---
ANN (all variables) | VG | VG | VG | G | L | P
ANN (PCA variables) | G | G | G | G | L | P


VG, very good; G, good; L, low; P, poor.

The poor performance during the testing period at the semi-arid climatic stations indicates the complexity of ET modelling for these stations. Statistical analysis showed that, at Bellary, the ET_{0} estimates of the P–M model and the ANN model with all climatic variables differed significantly in the validation period but not in the calibration phase. Similarly, at the semi-arid stations Sholapur, Bellary, and Hyderabad, the ET estimates of the P–M model and the ANN model with PCA variables differed significantly in the validation period but not in the calibration phase. The variation in climatic variables within a day is considerably larger in dry climate regions than in humid climate regions. As a result, the developed ET models perform poorly in places with drier climates. A similar observation was made by Lakshman & Kovoor (2006). Gallego-Elvira *et al.* (2012) and McJannet *et al.* (2013) also state that the P–M method, which takes into account heat storage and aerodynamic resistance, is preferred in semi-arid climatic regions.

The ANN model with all the climatic variables performed better than the ANN model with variables resulting from PCA. ET estimates by the ANN model with PCA variables produced good results, but the performance during validation was poor for semi-arid stations due to the hourly fluctuations in weather in the drier climates. The study demonstrates the applicability of ANN models for various agro-climatic areas under limited data conditions.

## CONCLUSIONS

ET is a complex process involving several meteorological parameters. The choice of an ET estimation method is determined by the local climate and the availability of meteorological data. Though ET estimation by the data-intensive P–M model is accurate, its applicability is limited because many of the parameters are unavailable for several Indian stations. Less data-intensive computation techniques such as ANN models are therefore very useful for accurately predicting ET_{0} for efficient agricultural water management. PCA was used to identify the important climatic variables for various regions, and these variables were used to build the ANN-based ET models. For the Indian climate, the study found *T*_{max}, RH_{mean}, and *R*_{n} to be the most sensitive parameters for ET computation.

Statistical analysis showed that the ANN models performed very well in coastal, per-humid, and humid climate regions and can be used effectively for the estimation of ET_{0}, but model performance was poorer in the drier regions. Accurate estimation of ET is a challenge in drier climates because of hourly variations in weather conditions. In dry climate regions, within-day variations in meteorological factors are substantially higher than in humid climate regions, and the hourly variations may not be fully accounted for by daily weather data. An hourly model may be able to address this issue in arid regions. Although the P–M model is the best choice if all climatic data are available, the ANN with principal components offers an alternative for locations where full climatic data are not accessible. Hence, less complex ANN models can be employed: they are nearly as accurate as the classic P–M model but require fewer parameters.

Future research can assess the performance of ANN models that estimate ET at an hourly time step for the agro-climatic zones. Other soft computing techniques can also be tried for the estimation of ET.

## ACKNOWLEDGEMENTS

The authors are thankful to the Indian Institute of Technology Madras for providing the laboratory facilities for carrying out the research work.

## FUNDING

No funds, grants, or other support were received.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.