## Abstract

Reference evapotranspiration (ET_{0}) is used to determine crop water requirements under different climatic conditions. In this study, soft computing tools, viz. artificial neural network (ANN) and k-nearest neighbors (KNN) models, were evaluated for forecasting daily ET_{0} by comparing their performance with the Penman-Monteith (PM) model using climatic data from 1990 to 2020 of the Indian Agricultural Research Institute (IARI) farm observatory, New Delhi, India. The performance of these models was assessed using statistical performance indices, viz., mean absolute error (MAE), mean squared error (MSE), correlation coefficient (r), mean absolute percentage error (MAPE), and index of agreement (d). Among the 36 ANN models, the model with the sigmoid activation function and the L-BFGS (Limited memory-Broyden-Fletcher-Goldfarb-Shanno) learning algorithm performed best. Among the 4 KNN models developed and tested, the K4 model was the best at forecasting daily ET_{0}. Overall, the best ANN model (M11) outperformed the K4 KNN model, with MAE, MSE, r, MAPE, and d values of 0.075, 0.018, 0.997, 2.76%, and 0.974 during training and 0.091, 0.053, 0.984, 3.16%, and 0.969 during testing, respectively. Thus, we conclude that the ANN technique performed better than the KNN technique in forecasting daily ET_{0}. Sensitivity analysis of the best ANN model revealed that wind speed was the most influential input variable among the weather parameters. The ANN model may therefore be recommended for accurate daily ET_{0} forecasting for efficient irrigation scheduling of different crops in the study region.

## HIGHLIGHTS

Regression analysis was performed for input variable selection.

In this study, the performance of 36 ANN models and 4 KNN models was evaluated.

The ANN model with sigmoid activation function and L-BFGS algorithm was selected as the best-performing model.

The best ANN model (M11) outperformed the K4 KNN model in forecasting daily ET_{0}.

Sensitivity analysis revealed that wind speed was the most influential input variable.

### Graphical Abstract

## INTRODUCTION

Water is the essential and most critical input for sustainable agriculture (Hyder *et al.* 2022). It is a valuable contribution of nature to the existence and survival of life on earth (Manikumari *et al.* 2017). But the distribution of water resources has changed in recent years due to climate change effects, which have been causing disaster events like floods and drought (Suwarno *et al.* 2021; Krisnayanti *et al.* 2022). Change in the occurrence and distribution of rainfall has affected the availability of freshwater resources at spatial and temporal scales (Chutiman *et al.* 2022). Moreover, increasing water demand due to a burgeoning population and changing climate necessitates judicious management of water resources in agriculture (Manikumari *et al.* 2017). Globally, about 70% of water drained from rivers and aquifers is diverted for agricultural activities (Shang *et al.* 2020). Therefore, evapotranspiration (ET), a significant component of the hydrologic cycle, needs special attention for accurate quantification. The reference evapotranspiration (ET_{0}) expresses the simultaneous occurrence of the soil water evaporation and the crop transpiration processes on a vegetated surface (Allen *et al.* 1998). Real-time computations of actual crop evapotranspiration (ET_{c}) require forecasting of daily reference evapotranspiration (ET_{0}). Therefore, precise ET_{0} quantification is necessary for determining irrigation demands, irrigation scheduling, water resource management, environmental impact assessment, and water balance studies at regional and local levels, and for its use in various rainfall-runoff and ecosystem models (Nema *et al.* 2017; Poddar *et al.* 2021).

The Food and Agriculture Organization (FAO) of the United Nations adopted a standardized form of the Penman–Monteith (PM) equation as a globally valid benchmark for calculating ET_{0}, constructing crop coefficient curves, and analyzing and calibrating other ET_{0} methodologies in situations where weighing-type field lysimeter measurements are inaccessible (Allen *et al.* 1998; Cai *et al.* 2007; Kumar *et al.* 2012; Garg *et al.* 2016; Kushwaha *et al.* 2021). Several process-based and empirical equations based on radiation, temperature, mass transfer, and water budget methods have been derived to determine ET_{0} with different input combinations of meteorological parameters (Majhi & Naidu 2021). These methods are physically based and require measured meteorological input data (Allen *et al.* 1998). Several authors have utilized these empirical models under data-limited conditions (Tabari 2010; Poddar *et al.* 2021; Pandey & Pandey) to estimate ET_{0}. In addition, irrigation scheduling software developed for different crops requires accurate values of ET_{0} and of the crop coefficient at different crop growth stages (Genaidy 2020).

Applications of soft computing techniques have become reliable in solving complex problems (Wagstaff 2012; Kumar *et al.* 2016; Hameed *et al.* 2021). Due to its high performance and the decisive advantage of capturing nonlinear and complex structures, machine learning is widely applied to different fields, namely, marketing for task classification, finance for forecasting, telecommunications for forecasting and task classification, and network analysis for relating different tasks (Bhandari 2021). Also, several studies have used machine learning algorithms to estimate reference evapotranspiration (ET_{0}) (Chia *et al.* 2020, 2022; Hanoon *et al.* 2021; Rai *et al.* 2022). ET_{0} forecasting models were developed using soft computing tools, namely, support vector machine (SVM), gradient boosting decision tree (GBDT), particle swarm optimization (PSO) SVM, and PSO-GBDT algorithms. Results indicated that SVM, GBDT, PSO-SVM, and PSO-GBDT models generally resulted in better forecasting of ET_{0}, whereas the PSO-GBDT algorithm performed better for the Southwest China region (Zhao *et al.* 2021).

Artificial neural network (ANN) modeling permits easier translation between humans and computers for decision-making and a better way to handle imprecise and uncertain information (Fernández-López *et al.* 2020). A comparison with multiple linear regression (MLR) showed that ANN and Gene Expression Programming (GEP) models were superior to MLR models in forecasting daily reference evapotranspiration at Pantnagar, India (Heramb *et al.* 2022). In another study, the Feed Forward Neural Network (FFNN), Radial Basis Function Neural Network (RBFNN), and GEP machine learning algorithms were compared for estimating daily ET_{0} in the Lower Cheliff Plain, northwest Algeria (Achite *et al.* 2022). The RBFNN and GEP models showed promising performance; however, the FFNN model performed the best during the training (*R*^{2} = 0.9903, root-mean-square error (RMSE) = 0.2332, and EF = 0.9902) and testing (*R*^{2} = 0.9921, RMSE = 0.2342, and EF = 0.9902) phases in forecasting ET_{0} as compared with PM evapotranspiration. The ANN model outperformed the linear regression model in forecasting daily evaporation (Singh *et al.* 2021); an ANN with a 4-10-1 structure was recommended as performing best for evaporation at the National Institute of Hydrology (NIH), Roorkee, India. Deep learning (DL) is also extensively used for forecasting hydrological variables. The performance of four learning algorithms, namely, DL-Multilayer Perceptron (DLMP), Generalized Linear Model (GLM), Random Forest (RF), and Gradient Boosting Machine (GBM), was assessed to forecast future ET_{0} for the Hoshiarpur and Patiala districts, Punjab, India.
The study concluded that the DLMP outperformed the other models during the training, validation, and testing stages; the statistical performance indicator values for the DLMP models ranged between 0.95 and 0.98 for the Nash–Sutcliffe efficiency (NSE), 0.95 and 0.99 for the coefficient of determination (*R*^{2}), 85 and 95 for the loss and accuracy (ACC), 0.0369 and 0.1215 for mean squared error (MSE), and 0.1921 and 0.2691 for RMSE.

Machine learning approaches, namely, *k*-nearest-neighbor (KNN) and ANN models, were applied to forecast daily ET_{0} using four combinations of climatic data in the Middle Anatolia region of Turkey; the KNN model performed better than the ANN in all combinations under both full and limited data conditions (Yamaç 2021). The potential of the KNN algorithm, a data mining method, for estimating ET_{0} was investigated using limited climatic data in a semi-arid environment in China. Results showed that the KNN-based ET_{0} forecast model using maximum air temperature, minimum air temperature, and relative humidity as inputs had the best accuracy (Feng & Tian 2021). The performance of soft computing algorithms, namely, Gaussian Naive Bayes (GNB), SVM, KNN, and ANN, was compared for ET_{0} estimation against the PM model under Pakistani climatic conditions. The results showed that the KNN model was more accurate than the SVM, GNB, and ANN models, with 92% accuracy (Hu *et al.* 2022).

The performance of the ANN and KNN soft computing techniques in forecasting different real-world problems depends on various factors. For the ANN technique, the choice of activation function and solver determines the performance in forecasting daily ET_{0}; for the KNN technique, the number of neighbors (*k*) affects its efficacy. Several authors have applied different machine learning algorithms to forecast ET_{0} under different climatic conditions. However, little information is available on the application of different ANN learning algorithms with a sigmoid activation function in forecasting daily ET_{0}. Previous studies using KNN techniques did not examine the effect of the number of neighbors on the algorithm's performance in forecasting daily ET_{0}. Keeping in view this research gap, the present study was undertaken to assess the performance of ANN and KNN techniques for precise forecasting of daily ET_{0}. The novelty of this investigation is the application of the sigmoid activation function with three learning algorithms for the ANN technique and the optimization of the number of neighbors for the KNN technique. Three learning algorithms, namely, L-BFGS-B (Limited memory-Broyden–Fletcher–Goldfarb–Shanno), stochastic gradient descent (SGD), and Adam, were tested with the sigmoid activation function for daily forecasting of ET_{0}. The optimum number of neurons in the hidden layer of the ANN-based ET_{0} forecasting models was also determined in this study. Additionally, the optimum number of nearest neighbors in the KNN algorithm, which influences the model performance in forecasting daily ET_{0}, has been worked out for semi-arid climates.
Another novelty of this work is the sensitivity analysis of the best daily ET_{0} forecasting model to identify the weather variables that most influence its performance. Thus, the current study was carried out to assess the effectiveness of the ANN and KNN soft computing approaches in forecasting daily ET_{0} at the Indian Agricultural Research Institute (IARI) farm in New Delhi, India, and to perform sensitivity analysis of the best ET_{0} forecasting model to detect the most influential input variables. The developed model may aid in accurate quantification of daily ET_{0}, which may be utilized to compute crop water requirements considering stage-wise water requirements of crops.

## MATERIALS AND METHODS

### Study area description

The average wind speed (WS) varies from 0.45 to 3.96 m/s. The location map of the study area is shown in Figure 1.

### Data collection and ET_{0} calculation

Daily meteorological data, namely, maximum and minimum air temperature (*T*max and *T*min), wind velocity (WS), daily maximum and minimum relative humidity (RHI and RHII), and sunshine hours (SS), were collected for the period from 1990 to 2020 from the observatory of the Division of Agricultural Physics, Indian Council of Agricultural Research-Indian Agricultural Research Institute (ICAR-IARI), New Delhi. The estimation of daily ET_{0} was carried out using Reference ET software (REF-ET). REF-ET is open-source software developed by the Kimberly Research and Extension Center, Idaho, USA, and used by Jothiprakash *et al.* (2002), among others, for ET_{0} estimation. Solar radiation (SR) was derived with CROPWAT 8.0 software using daily meteorological variable data. REF-ET uses the Food and Agriculture Organization's recommended PM model (Allen *et al.* 2006) for ET_{0} calculation. The PM equation for ET_{0} calculation is given by:

ET_{0} = [0.408 Δ (*R*_{n} − *G*) + *γ* (900 / (*T*_{mean} + 273)) *u*_{2} (*e*_{s} − *e*_{a})] / [Δ + *γ* (1 + 0.34 *u*_{2})]

where ET_{0} is the reference evapotranspiration (mm day^{−1}), *R*_{n} is the net radiation (MJ m^{−2} day^{−1}), *G* is the soil heat flux (MJ m^{−2} day^{−1}), *γ* is the psychrometric constant (kPa °C^{−1}), *e*_{s} is the saturation vapor pressure (kPa), *e*_{a} is the actual vapor pressure (kPa), Δ is the slope of the saturation vapor pressure–temperature curve (kPa °C^{−1}), *T*_{mean} is the daily average temperature (°C), and *u*_{2} is the daily wind velocity at 2 m height (m/s).
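As an illustration of the PM equation above, a minimal Python sketch is given below. This is not the REF-ET implementation; the function name and the sample input values are our own, and the intermediate terms (Δ, *e*_{s}, *e*_{a}, etc.) are assumed to be pre-computed from the weather data.

```python
# Illustrative sketch of the FAO-56 Penman-Monteith equation (not REF-ET code).
def penman_monteith_et0(delta, rn, g, gamma, t_mean, u2, es, ea):
    """Daily reference evapotranspiration (mm/day).

    delta : slope of saturation vapour-pressure curve (kPa/degC)
    rn    : net radiation (MJ/m2/day);  g : soil heat flux (MJ/m2/day)
    gamma : psychrometric constant (kPa/degC)
    t_mean: mean daily air temperature (degC); u2: wind speed at 2 m (m/s)
    es, ea: saturation and actual vapour pressure (kPa)
    """
    num = 0.408 * delta * (rn - g) + gamma * (900.0 / (t_mean + 273.0)) * u2 * (es - ea)
    den = delta + gamma * (1.0 + 0.34 * u2)
    return num / den

# Plausible mid-summer values (for illustration only)
et0 = penman_monteith_et0(delta=0.25, rn=20.0, g=0.2, gamma=0.066,
                          t_mean=30.0, u2=2.0, es=4.24, ea=2.85)
```

With these illustrative inputs the function returns a value in the typical semi-arid summer range of a few mm day^{−1}.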

### Soft computing techniques

ANNs are typically made up of layers of neurons, weights that denote the strength of interconnections, and a transfer or activation function. Weights (*W*) and biases (*B*) connect the input layer (*i*) to the hidden layer (*j*), which, in turn, connects to the output layer (*k*). The weights alter the throughput characteristics as well as the linkages to the neurons (*n*), while biases act as extra components within the hidden and output layer neurons. Each neuron (processing element) in the hidden layer aggregates its weighted inputs to produce an activation value. The activation value (*h*_{j}) of a neuron in the hidden layer is mathematically presented by the following equation (Haykin 1998):

*h*_{j} = *f*(*n*_{ij}), where *n*_{ij} = Σ_{i} *W*_{ij} *I*_{i} + *B*_{j}

The activation functions (*f*) of the hidden layer neurons serve to transform the input variables (the activation levels of the neurons) into the needed output variable. The sigmoid and hyperbolic tangent transfer functions are the most commonly used in hydrological modeling (Dawson & Wilby 2001; Zanetti *et al.* 2007).
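A single hidden-neuron computation (weighted sum of inputs plus bias, passed through the sigmoid transfer function) can be sketched as follows; the input, weight, and bias values are hypothetical and for illustration only.

```python
import math

def sigmoid(x):
    # Logistic sigmoid transfer function f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

def hidden_activation(inputs, weights, bias):
    # n_j = sum_i W_ij * I_i + B_j, then h_j = f(n_j)
    n_j = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(n_j)

# Hypothetical normalized inputs and weights for one hidden neuron
h = hidden_activation(inputs=[0.2, 0.7, 0.5], weights=[0.4, -0.1, 0.3], bias=0.05)
```

Since the sigmoid squashes its argument into (0, 1), the activation value `h` always lies in that interval regardless of the weighted sum.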

#### Activation function

Previous studies (… *et al.* 2003; Alves *et al.* 2017; Amir-Ashayeri *et al.* 2021) applied the logistic sigmoid activation function for ET_{0} forecasting and found satisfactory results. In this study, the logistic sigmoid activation function is used, expressed by the following equation:

*f*(*x*) = 1 / (1 + e^{−*x*})

where *x* is either the value of *n*_{ij} or *n*_{jk}.

#### ANN algorithms

Initially, ANN models in ET research focused solely on the back-propagation learning algorithm (Kuo *et al.* 2011; Nazari & Band 2018). However, in recent years, these models have been further refined by exploiting alternative learning algorithms, such as the radial basis function (Wu *et al.* 2012; Du & Swamy 2014; Majhi & Naidu 2021), quick propagation (Landeras *et al.* 2008), Levenberg–Marquardt (Poddar *et al.* 2021), and conjugate gradient descent (Landeras *et al.* 2008). The present study used three learning algorithms, namely, L-BFGS-B learning algorithm, SGD learning algorithm, and Adaptive Moment Estimation (Adam).

#### L-BFGS-B algorithm

The L-BFGS-B algorithm extends the L-BFGS algorithm to handle simple bounds on the variables (Zhu *et al.* 1997). The BFGS–ANN algorithm enhanced the performance of the ANN model in ET_{0} estimation for drought-prone arid and semi-arid regions over Gaussian process regression (GPR), support vector regression (SVR), and long short-term memory (LSTM) models (Sattari *et al.* 2021).
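The bounded quasi-Newton behavior of L-BFGS-B can be illustrated with SciPy's generic optimizer (this is not the ANN training code used in the study; the quadratic objective and the box bounds are our own toy example).

```python
import numpy as np
from scipy.optimize import minimize

# Toy objective: a quadratic error surface with minimum at (0.3, 0.7),
# minimized under box bounds as L-BFGS-B does for bounded problems.
def objective(w):
    return float(np.sum((w - np.array([0.3, 0.7])) ** 2))

res = minimize(objective, x0=np.zeros(2), method="L-BFGS-B",
               bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Because the unconstrained minimum lies inside the bounds, the solver converges to it; with an active bound it would instead stop at the box edge.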

#### SGD algorithm

The SGD algorithm updates each weight along the negative gradient of the error:

*w*_{new} = *w*_{old} − *η* (∂*E* / ∂*w*)

where *w*_{new} is the updated weight, *w*_{old} is the previous value of the weight, *η* is the learning rate (0.1 in this study), and *E* is the output error computed by the objective function (Walls *et al.* 2020).
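A minimal sketch of this update rule for a single linear neuron with a squared-error objective is given below, using the learning rate of 0.1 mentioned above; the sample and initial weights are hypothetical.

```python
# One SGD step: w <- w - eta * dE/dw, for E = 0.5 * (pred - y)^2
def sgd_step(w, x, y, eta=0.1):
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y                      # dE/dpred
    return [wi - eta * err * xi for wi, xi in zip(w, x)]

# Repeated updates on one hypothetical sample drive pred toward y
w = [0.0, 0.0]
for _ in range(200):
    w = sgd_step(w, x=[1.0, 2.0], y=3.0)
```

After enough updates the weighted sum converges to the target value for this sample, which is the behavior the full SGD training loop exploits across many samples.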

#### Adam algorithm

Adam optimization is applied to calculate the adaptive training rate of the parameters. The ADAM optimization algorithm was used in previous studies for ET_{0} estimation (Sattari *et al.* 2020).

#### The ANN structure

This study used an ANN model with seven neurons in the input layer, one neuron in the output layer, and one hidden layer with a varying number (1–12) of neurons to develop the best-performing model. The optimum number of neurons in the hidden layer was found by the trial-and-error approach. The ANN schematic diagram used in the present investigation is shown in Figure 2.
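The 7-11-1 structure with the sigmoid ("logistic") activation and L-BFGS solver described above can be sketched with scikit-learn; this is our choice of library, not necessarily the software used in the study, and the data here are synthetic stand-ins for the seven weather inputs.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((200, 7))                      # 7 synthetic weather inputs
y = X @ np.array([0.3, 0.6, 1.2, -0.1, -0.05, 0.2, 1.0]) + 0.5  # toy target

# 7 inputs -> 1 hidden layer of 11 neurons -> 1 output (the M11 structure),
# sigmoid ("logistic") activation, L-BFGS solver
model = MLPRegressor(hidden_layer_sizes=(11,), activation="logistic",
                     solver="lbfgs", max_iter=2000, random_state=0)
model.fit(X, y)
r2 = model.score(X, y)
```

Varying `hidden_layer_sizes` from `(1,)` to `(12,)` and comparing scores mirrors the trial-and-error search for the optimum number of hidden neurons.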

#### KNN technique

The KNN technique is a non-parametric machine learning method that is easy to implement and yields efficient and competitive results. This advantage makes the method attractive compared with other machine learning methods. The generalized architecture of KNN is displayed in Figure 3. Figure 4 shows the methodology adopted in the present study.
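Tuning the number of neighbors, as done for the four KNN models in this study, can be sketched as follows; the data and the candidate *k* values are illustrative only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(1)
X = rng.random((300, 7))                 # 7 synthetic weather inputs
y = X.sum(axis=1)                        # smooth toy target

# Fit on the first 250 samples, score (R^2) on the last 50,
# for several candidate neighbourhood sizes k
scores = {k: KNeighborsRegressor(n_neighbors=k).fit(X[:250], y[:250])
                 .score(X[250:], y[250:])
          for k in (2, 4, 8, 16)}
best_k = max(scores, key=scores.get)
```

The `best_k` chosen this way plays the role of the optimized number of nearest neighbors; too small a *k* overfits local noise, too large a *k* oversmooths.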

### Performance evaluation

The performance of the ANN and KNN soft computing techniques was evaluated using five commonly used statistical indices to determine the best model for forecasting daily ET_{0}. Denoting the PM-derived values by *O*_{i}, the forecasted values by *P*_{i}, their means by *Ō* and *P̄*, and the number of observations by *n*, these indices are as follows:

MAE = (1/*n*) Σ |*O*_{i} − *P*_{i}|

MSE = (1/*n*) Σ (*O*_{i} − *P*_{i})^{2}

*r* = Σ (*O*_{i} − *Ō*)(*P*_{i} − *P̄*) / √[Σ (*O*_{i} − *Ō*)^{2} Σ (*P*_{i} − *P̄*)^{2}]

MAPE = (100/*n*) Σ |*O*_{i} − *P*_{i}| / *O*_{i}

*d* = 1 − Σ (*O*_{i} − *P*_{i})^{2} / Σ (|*P*_{i} − *Ō*| + |*O*_{i} − *Ō*|)^{2}
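The five indices can be computed in a few lines; the sketch below uses the standard definitions (with *d* as Willmott's index of agreement) and hypothetical observed/forecast values.

```python
import numpy as np

def indices(obs, pred):
    """Return MAE, MSE, r, MAPE (%), and Willmott's index of agreement d."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mae = np.mean(np.abs(obs - pred))
    mse = np.mean((obs - pred) ** 2)
    r = np.corrcoef(obs, pred)[0, 1]
    mape = 100.0 * np.mean(np.abs(obs - pred) / obs)
    d = 1.0 - np.sum((obs - pred) ** 2) / np.sum(
        (np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return mae, mse, r, mape, d

# Hypothetical PM-derived ET0 (obs) versus model forecasts (pred), mm/day
mae, mse, r, mape, d = indices([2.0, 3.0, 4.0, 5.0], [2.1, 2.9, 4.2, 4.8])
```

For a good model, MAE, MSE, and MAPE approach zero while *r* and *d* approach one, which is how the 36 ANN and 4 KNN models are ranked below.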

## RESULTS AND DISCUSSION

### Input feature selection

The input features were selected using an MLR approach between the dependent variable (ET_{0}) and the independent variables (meteorological variables). Based on the values of the statistical indices, the combination of all seven independent variables was selected for forecasting ET_{0}. The input selection results showed that combining all meteorological variables resulted in the lowest MSE value of 0.13. Similarly, the other statistical indices, namely, *R*^{2}, adjusted *R*^{2}, Mallows' Cp, Akaike's information criterion (AIC), Schwarz's Bayesian criterion (SBC), and Amemiya's prediction criterion (PC), were 0.96, 0.96, 8, −23,197.80, −23,139.12, and 0.04, respectively, as highlighted in blue in Table 1. The values of the input selection indices for all other variable combinations are also displayed in Table 1. The Type III sum-of-squares statistics indicated that *T*max was the most influential among the ET_{0}-influencing variables, while RHII was the least influential of the seven variables.

| No. of variables | Variables | MSE | *R*^{2} | Adjusted *R*^{2} | Mallows' Cp | Akaike's AIC | Schwarz's SBC | Amemiya's PC |
|---|---|---|---|---|---|---|---|---|
| 1 | SR | 0.80 | 0.73 | 0.73 | 58,801.90 | −2,555.61 | −2,540.94 | 0.27 |
| 2 | Tmax/SR | 0.41 | 0.86 | 0.86 | 24,255.50 | −10,237.94 | −10,215.93 | 0.14 |
| 3 | Tmax/WS/SR | 0.17 | 0.94 | 0.94 | 3,401.70 | −20,229.50 | −20,200.16 | 0.06 |
| 4 | Tmax/WS/RHI/SR | 0.15 | 0.95 | 0.95 | 2,093.33 | −21,283.44 | −21,246.77 | 0.05 |
| 5 | Tmin/Tmax/WS/RHI/SR | 0.13 | 0.96 | 0.96 | 217.47 | −22,990.16 | −22,946.15 | 0.05 |
| 6 | Tmin/Tmax/WS/RHI/SS/SR | 0.13 | 0.96 | 0.96 | 103.33 | −23,102.82 | −23,051.47 | 0.04 |
| 7 | Tmin/Tmax/WS/RHI/RHII/SS/SR | 0.13 | 0.96 | 0.96 | 8.00 | −23,197.80 | −23,139.12 | 0.04 |

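The best-subset screening summarized in Table 1 can be sketched as an exhaustive search over variable combinations ranked by ordinary-least-squares MSE; the data below are synthetic, with only the variable names taken from the study.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
names = ["Tmin", "Tmax", "WS", "RHI", "RHII", "SS", "SR"]
X = rng.random((400, 7))
y = X @ rng.random(7) + 0.1 * rng.standard_normal(400)   # toy ET0 proxy

def ols_mse(cols):
    # OLS fit on the chosen columns plus an intercept; return in-sample MSE
    A = np.column_stack([X[:, list(cols)], np.ones(len(X))])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ beta - y) ** 2))

# Best subset of each size, as in the rows of Table 1
best = {k: min(combinations(range(7), k), key=ols_mse) for k in range(1, 8)}
```

In practice the candidates would also be ranked by AIC, SBC, Mallows' Cp, and Amemiya's PC, which penalize model size, rather than by raw MSE alone.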

The MLR equation developed for ET_{0} estimation is given as follows:

ET_{0} = *a*_{1} *T*min + *a*_{2} *T*max + *a*_{3} WS + *a*_{4} RHI + *a*_{5} RHII + *a*_{6} SS + *a*_{7} SR + *c*

where ET_{0} is in mm day^{−1}, *a*_{1}–*a*_{7} are the regression coefficients, and *c* is the intercept. The regression coefficients between ET_{0} and the independent features, namely, *T*min, *T*max, WS, RHI, RHII, SS, and SR, were 0.036, 0.059, 0.156, −0.012, −0.003, −0.031, and 0.156, respectively. The intercept (*c*) was −0.786. The standardized coefficients obtained for the independent features and the regression analysis are presented in Figure 5.
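Applying the fitted MLR equation is a direct weighted sum; the sketch below uses the coefficients and intercept reported above, while the example weather values for a summer day are hypothetical.

```python
# Coefficients and intercept from the fitted MLR equation in the text
coeffs = {"Tmin": 0.036, "Tmax": 0.059, "WS": 0.156, "RHI": -0.012,
          "RHII": -0.003, "SS": -0.031, "SR": 0.156}
intercept = -0.786

def mlr_et0(features):
    # ET0 = c + sum_k a_k * x_k
    return intercept + sum(coeffs[k] * v for k, v in features.items())

# Hypothetical summer-day inputs (illustrative values only)
et0 = mlr_et0({"Tmin": 24.0, "Tmax": 38.0, "WS": 2.5, "RHI": 70.0,
               "RHII": 35.0, "SS": 8.5, "SR": 22.0})
```

The sign pattern of the coefficients matches the correlation analysis: temperature, wind, and radiation raise ET_{0}, while the humidity terms lower it.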

The correlation matrix between ET_{0} and the independent variables is shown in Figure 6. It is understood from Figure 6 that ET_{0} is highly positively correlated with *T*max, *T*min, and SR, with Pearson's correlation coefficient (*r*) values of 0.74, 0.85, and 0.85, respectively. The independent features WS and SS were moderately positively correlated with ET_{0}, with *r* values of 0.56 and 0.55, respectively. However, a strong negative correlation (*r* = −0.70) was observed between ET_{0} and RHI, which exhibited an inverse relationship between these two variables. In addition, a negative correlation was found between RHII and ET_{0}. A positive correlation indicates a direct relationship between the independent and dependent variables, and *vice versa*.
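Pearson's *r*, the statistic behind Figure 6, can be computed directly; the toy humidity and ET_{0} series below are hypothetical and merely reproduce the sign of the ET_{0}–RHI relationship.

```python
import numpy as np

def pearson_r(a, b):
    # Pearson's correlation coefficient between two series
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(np.corrcoef(a, b)[0, 1])

# Hypothetical series: humidity falling while ET0 rises -> negative r,
# mirroring the inverse ET0-RHI relationship reported above
rh = [80.0, 60.0, 45.0, 30.0]
et0 = [2.0, 3.5, 4.8, 6.0]
r = pearson_r(rh, et0)
```

Applying `pearson_r` pairwise across all seven inputs and ET_{0} yields the correlation matrix of Figure 6.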

### Performance evaluation of soft computing techniques

#### Performance of ANN models

The statistical performance indices, namely, MAE, MSE, *r*, MAPE, and *d*, were used to evaluate the performance of the ANN models for ET_{0} forecasting. The number of neurons in the hidden layer was varied from 1 to 12; further increases resulted in poorer ANN performance in ET_{0} forecasting. The performance of the algorithms is displayed in Table 2. The results indicated that the combination of sigmoid as the activation function and L-BFGS-B as the solver performed the best. The model with the ANN architecture (7,11,1), i.e., M11, was the best among all 12 models; its MAE, MSE, *r*, MAPE, and *d* values were 0.075, 0.018, 0.997, 2.76%, and 0.974, respectively, during the training period. The performance of model M11 was poorer during the testing period than during the training period, with values of 0.091, 0.053, 0.984, 3.16%, and 0.969 for MAE, MSE, *r*, MAPE, and *d*, respectively. Based on the statistical performance indicator values, model M2 (7,2,1) performed worst during the training and testing periods among the 12 models developed using sigmoid as the activation function and L-BFGS-B as the solver. Model performance generally improved with an increasing number of neurons in the hidden layer, except for model M2, which performed worse than model M1. For the M12 model, the MAE and MSE indices improved by 72.72% compared to the M1 model during the training period. Similarly, the MAPE showed an improvement of 12.93% over the M1 model during the training period. However, the *r* and *d* indices showed only slight improvement for the M12 model over M1, increasing by 0.10 and 0.73%, respectively, during the model training period. The values of the performance indicators during the model testing period showed improvements of 10.76, 12.67, 3.87, and 0.52% in MAE, MSE, MAPE, and *d*, respectively, for model M12 compared to model M1. However, *r* showed a slight reduction, its value decreasing by 0.52%. Overall, we found that the performance of the ANN using the sigmoid activation function and an L-BFGS-B solver in forecasting daily ET_{0} improved with an increasing number of neurons in the hidden layer.

| Solver | Model (ANN structure) | Training MAE | Training MSE | Training *r* | Training MAPE (%) | Training *d* | Testing MAE | Testing MSE | Testing *r* | Testing MAPE (%) | Testing *d* |
|---|---|---|---|---|---|---|---|---|---|---|---|
| L-BFGS-B | M1 (7,1,1) | 0.121 | 0.033 | 0.994 | 4.33 | 0.958 | 0.130 | 0.071 | 0.988 | 4.39 | 0.955 |
| | M2 (7,2,1) | 0.184 | 0.072 | 0.988 | 6.27 | 0.936 | 0.160 | 0.075 | 0.987 | 5.32 | 0.945 |
| | M3 (7,3,1) | 0.136 | 0.043 | 0.993 | 4.61 | 0.953 | 0.134 | 0.069 | 0.988 | 4.68 | 0.954 |
| | M4 (7,4,1) | 0.120 | 0.036 | 0.994 | 4.12 | 0.958 | 0.128 | 0.083 | 0.985 | 4.37 | 0.956 |
| | M5 (7,5,1) | 0.096 | 0.025 | 0.996 | 3.40 | 0.967 | 0.108 | 0.059 | 0.990 | 3.82 | 0.962 |
| | M6 (7,6,1) | 0.092 | 0.022 | 0.996 | 3.15 | 0.968 | 0.107 | 0.061 | 0.989 | 3.69 | 0.963 |
| | M7 (7,7,1) | 0.109 | 0.027 | 0.995 | 4.09 | 0.962 | 0.121 | 0.055 | 0.990 | 4.61 | 0.958 |
| | M8 (7,8,1) | 0.122 | 0.038 | 0.994 | 4.28 | 0.958 | 0.128 | 0.068 | 0.988 | 4.62 | 0.956 |
| | M9 (7,9,1) | 0.182 | 0.070 | 0.988 | 6.11 | 0.937 | 0.160 | 0.075 | 0.987 | 5.26 | 0.945 |
| | M10 (7,10,1) | 0.078 | 0.020 | 0.997 | 2.80 | 0.973 | 0.092 | 0.055 | 0.990 | 3.05 | 0.968 |
| | M11 (7,11,1) | 0.075 | 0.018 | 0.997 | 2.76 | 0.974 | 0.091 | 0.053 | 0.984 | 3.16 | 0.969 |
| | M12 (7,12,1) | 0.099 | 0.027 | 0.995 | 3.77 | 0.965 | 0.116 | 0.062 | 0.980 | 4.22 | 0.960 |
| SGD | M13 (7,1,1) | 0.299 | 0.178 | 0.969 | 11.87 | 0.896 | 0.271 | 0.147 | 0.974 | 11.17 | 0.906 |
| | M14 (7,2,1) | 1.440 | 2.932 | 0.556 | 59.10 | 0.001 | 1.455 | 2.883 | 0.667 | 61.79 | 0.002 |
| | M15 (7,3,1) | 0.355 | 0.238 | 0.959 | 14.32 | 0.877 | 0.330 | 0.198 | 0.965 | 13.87 | 0.886 |
| | M16 (7,4,1) | 0.285 | 0.157 | 0.973 | 11.23 | 0.901 | 0.257 | 0.130 | 0.977 | 10.47 | 0.911 |
| | M17 (7,5,1) | 0.268 | 0.143 | 0.975 | 10.54 | 0.907 | 0.242 | 0.122 | 0.979 | 9.79 | 0.916 |
| | M18 (7,6,1) | 0.318 | 0.195 | 0.966 | 12.70 | 0.889 | 0.292 | 0.161 | 0.972 | 12.06 | 0.899 |
| | M19 (7,7,1) | 0.333 | 0.211 | 0.963 | 13.35 | 0.885 | 0.307 | 0.178 | 0.969 | 12.81 | 0.894 |
| | M20 (7,8,1) | 0.379 | 0.264 | 0.954 | 15.15 | 0.869 | 0.355 | 0.222 | 0.961 | 14.74 | 0.877 |
| | M21 (7,9,1) | 0.288 | 0.159 | 0.972 | 11.25 | 0.900 | 0.257 | 0.130 | 0.977 | 10.42 | 0.911 |
| | M22 (7,10,1) | 0.279 | 0.152 | 0.974 | 10.84 | 0.903 | 0.249 | 0.126 | 0.978 | 10.02 | 0.914 |
| | M23 (7,11,1) | 1.440 | 2.934 | 0.198 | 59.10 | 0.000 | 1.455 | 2.884 | 0.181 | 61.80 | 0.000 |
| | M24 (7,12,1) | 1.440 | 2.934 | 0.031 | 59.12 | 0.000 | 1.455 | 2.885 | 0.003 | 61.82 | 0.000 |
| Adam | M25 (7,1,1) | 0.212 | 0.114 | 0.980 | 8.99 | 0.927 | 0.204 | 0.110 | 0.981 | 8.72 | 0.929 |
| | M26 (7,2,1) | 0.167 | 0.085 | 0.985 | 7.35 | 0.942 | 0.167 | 0.087 | 0.985 | 7.28 | 0.942 |
| | M27 (7,3,1) | 0.171 | 0.092 | 0.984 | 7.51 | 0.940 | 0.172 | 0.088 | 0.985 | 7.52 | 0.941 |
| | M28 (7,4,1) | 0.194 | 0.111 | 0.981 | 8.46 | 0.933 | 0.187 | 0.105 | 0.982 | 8.27 | 0.935 |
| | M29 (7,5,1) | 0.166 | 0.082 | 0.986 | 7.18 | 0.942 | 0.162 | 0.083 | 0.986 | 6.97 | 0.944 |
| | M30 (7,6,1) | 0.211 | 0.126 | 0.978 | 9.42 | 0.927 | 0.211 | 0.123 | 0.978 | 9.47 | 0.927 |
| | M31 (7,7,1) | 0.434 | 0.500 | 0.911 | 14.37 | 0.849 | 0.413 | 0.421 | 0.924 | 14.14 | 0.857 |
| | M32 (7,8,1) | 0.164 | 0.081 | 0.986 | 7.22 | 0.943 | 0.162 | 0.082 | 0.986 | 7.10 | 0.944 |
| | M33 (7,9,1) | 0.171 | 0.076 | 0.987 | 7.08 | 0.941 | 0.171 | 0.087 | 0.985 | 6.97 | 0.941 |
| | M34 (7,10,1) | 0.196 | 0.098 | 0.983 | 8.31 | 0.932 | 0.195 | 0.102 | 0.982 | 8.26 | 0.932 |
| | M35 (7,11,1) | 0.224 | 0.183 | 0.968 | 7.18 | 0.922 | 0.206 | 0.147 | 0.964 | 6.75 | 0.929 |
| | M36 (7,12,1) | 0.318 | 0.346 | 0.939 | 9.39 | 0.890 | 0.293 | 0.281 | 0.948 | 8.81 | 0.899 |


The combination of the sigmoid activation function and the SGD solver performed more poorly in daily ET_{0} forecasting than the combination of sigmoid and L-BFGS-B. Model M17, with ANN structure (7,5,1), performed best among the 12 SGD-based models during the training and testing periods, while model M24 (7,12,1) performed worst in forecasting daily ET_{0}. The MAE, MSE, *r*, MAPE, and *d* values for the M17 model were 0.268, 0.143, 0.975, 10.54%, and 0.907, respectively, during the training period and 0.242, 0.122, 0.979, 9.79%, and 0.916, respectively, during the testing period. For the M24 model, the MAE, MSE, *r*, MAPE, and *d* values were 1.440, 2.934, 0.031, 59.12%, and 0.000, respectively, during the training period and 1.455, 2.885, 0.003, 61.82%, and 0.000, respectively, during the testing period. The comparison across the numbers of neurons in the hidden layer showed that models M14, M23, and M24 performed markedly worse in forecasting daily ET_{0}. Overall, we observed that the number of neurons in the hidden layer significantly affected the performance of the ANN with the sigmoid activation function and an SGD solver in forecasting daily ET_{0}.

The performance of the models developed using sigmoid as the activation function and Adam as the solver improved during the testing period over the training period, as indicated by the performance indicator values in Table 2. Among all 12 ANN structures in this family, model M32 (7,8,1) performed best, with MAE, MSE, *r*, MAPE, and *d* values of 0.164, 0.081, 0.986, 7.22%, and 0.943, respectively, during the training period and 0.162, 0.082, 0.986, 7.10%, and 0.944, respectively, during the testing period. On the other hand, model M31 (7,7,1) performed worst among the 12 models, as indicated by its statistical performance indices: the values of MAE, MSE, *r*, MAPE, and *d* for model M31 were 0.434, 0.500, 0.911, 14.37%, and 0.849, respectively, during training and 0.413, 0.421, 0.924, 14.14%, and 0.857, respectively, during testing. The performance of all 36 models is presented in Table 2. All the remaining models gave statistical performance index values better than model M31 and poorer than model M32. The number of neurons was incremented from 1 to 12 to determine the best model; the ANN behaved differently at each setting, and there was no monotonic improvement in the performance indicator values with an increasing number of neurons in the hidden layer. Therefore, among the 12 models developed using sigmoid as the activation function and Adam as the solver, model M32 was found to be the best for forecasting daily ET_{0}.

We found that, among the three solvers combined with the sigmoid activation function, the models developed using the L-BFGS-B solver gave the best results; we therefore recommend employing the ANN model with the sigmoid function and an L-BFGS-B solver to forecast daily ET_{0} under semi-arid climatic conditions.
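The solver/activation comparison described above can be reproduced with off-the-shelf tools. The sketch below uses scikit-learn's `MLPRegressor` on synthetic stand-in data (the IARI dataset is not reproduced here, so all inputs and weights are illustrative assumptions); `activation="logistic"` is scikit-learn's name for the sigmoid function, and `solver="lbfgs"` is its L-BFGS implementation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
# Synthetic stand-in for the 7 daily weather inputs (Tmin, Tmax, WS, SR, SS, RHI, RHII)
X = rng.uniform(size=(500, 7))
y = X @ rng.uniform(0.5, 1.5, size=7)  # synthetic ET0-like target (assumed)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]
scaler = StandardScaler().fit(X_train)

# (7, n, 1) structures: 7 inputs, n hidden neurons, 1 output
for n in (5, 8, 11):
    ann = MLPRegressor(hidden_layer_sizes=(n,), activation="logistic",
                       solver="lbfgs", max_iter=2000, random_state=0)
    ann.fit(scaler.transform(X_train), y_train)
    mae = mean_absolute_error(y_test, ann.predict(scaler.transform(X_test)))
    print(f"(7,{n},1)  test MAE = {mae:.3f}")
```

Swapping `solver` to `"sgd"` or `"adam"` produces the other two model families compared in Table 2, with the number of hidden neurons varied in the same way.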

Data points below the best-fit (1:1) line are the under-forecasted ET_{0} values, and data points above it are the over-forecasted ET_{0} values. It is evident from Figure 7 that the majority of the data points lie within the forecasting confidence limit (±5) of the regression line.

#### Performance of KNN models

The performance of the KNN models developed with different nearest-neighbor (*k*) values to forecast the ET_{0} at the IARI, New Delhi, is presented in Table 3. The trial-and-error approach identified the *k*-value for the best ET_{0} forecasting model. The KNN model with a neighbor (*k*) value of 1 yielded absolute values of the performance indices during training and unacceptable values during model testing; thus, model *k*1 was rejected from subsequent analysis, and only the four remaining models were evaluated and discussed. It was observed that as the number of neighbors increased from 1 to 5, the performance of the KNN improved, and further iteration using a higher *k*-value (over 5) resulted in a decrease in the KNN performance in forecasting the daily ET_{0}. Thus, the *k*-value was restricted to 5 to avoid overfitting of the developed models. Interestingly, increasing the *k*-value from 1 to 5 decreased the KNN's performance during training while improving it during testing: model *K*1 yielded the best performance indicators during training and model *K*4 during testing. For model *K*4, the values of the performance evaluation indices, namely, MAE, MSE, *r*, MAPE, and *d*, were 0.184, 0.065, 0.989, 5.95%, and 0.936, respectively, during the training period, and 0.217, 0.103, 0.964, 6.97%, and 0.962, respectively, during the testing period. Therefore, model *K*4 can be recommended for forecasting daily ET_{0} values among the four developed models. The line diagram and scatter plot of the best KNN model *K*4 for the training and testing periods are shown in Figure 8.
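The trial-and-error search over *k* can be sketched as follows, again on synthetic stand-in data and with scikit-learn's `KNeighborsRegressor` (an assumption; the study's own implementation is not shown). Note how *k* = 1 memorizes the training set, giving a near-zero training error, which mirrors the behavior of the rejected *k*1 model.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(7)
X = rng.uniform(size=(500, 7))                  # 7 daily weather inputs (synthetic stand-in)
y = X.sum(axis=1) + rng.normal(0, 0.1, 500)     # synthetic ET0-like target (assumed)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# Trial-and-error over the number of neighbours, restricted to 5 as in the study
for k in range(1, 6):
    knn = KNeighborsRegressor(n_neighbors=k).fit(X_train, y_train)
    mae_train = mean_absolute_error(y_train, knn.predict(X_train))
    mae_test = mean_absolute_error(y_test, knn.predict(X_test))
    print(f"k={k}  train MAE={mae_train:.3f}  test MAE={mae_test:.3f}")
```

On this toy data, too, the training error grows with *k* while the test error typically improves over *k* = 1, which is the train/test trade-off reported for models *K*1 through *K*4.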

| Model | *K* | MAE (training) | MSE (training) | *r* (training) | MAPE % (training) | *d* (training) | MAE (testing) | MSE (testing) | *r* (testing) | MAPE % (testing) | *d* (testing) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| K1 | 2 | 0.145 | 0.038 | 0.993 | 4.68 | 0.950 | 0.240 | 0.122 | 0.958 | 7.81 | 0.958 |
| K2 | 3 | 0.166 | 0.051 | 0.991 | 5.37 | 0.943 | 0.226 | 0.112 | 0.961 | 7.26 | 0.960 |
| K3 | 4 | 0.176 | 0.059 | 0.990 | 5.69 | 0.939 | 0.220 | 0.105 | 0.963 | 7.11 | 0.961 |
| K4 | 5 | 0.184 | 0.065 | 0.989 | 5.95 | 0.936 | 0.217 | 0.103 | 0.964 | 6.97 | 0.962 |


### Sensitivity analysis of the ANN model

Sensitivity analysis was carried out to assess the influence of each input variable on the best ANN model's ET_{0} forecasting performance. Each input feature, namely, *T*min, *T*max, WS, SR, SS, RHI, and RHII, was omitted once to check its influence on the ET_{0} forecasting ability of the ANN model based on the performance evaluation indices. Table 4 shows the influence of each skipped input variable on the performance index values. A kite diagram was plotted for the training and testing periods to visualize the effect of each skipped input variable on the performance of the ANN model M11 (7,11,1) in forecasting daily ET_{0} (Figure 9). In general, skipping each variable decreased the model's ET_{0} forecasting performance based on the performance index values. The most critical input variable influencing the model's performance during both training and testing was WS. During the training period of the M11 model, the influence of the omitted parameters on ET_{0} forecasting was followed, in decreasing order, by SR, *T*max, SS, RHI, RHII, and *T*min.

| Input variable omitted | MAE (training) | MSE (training) | *r* (training) | MAPE (training) | *d* (training) | MAE (testing) | MSE (testing) | *r* (testing) | MAPE (testing) | *d* (testing) |
|---|---|---|---|---|---|---|---|---|---|---|
| Tmin | 0.117 | 0.037 | 0.994 | 3.960 | 0.959 | 0.14 | 0.068 | 0.988 | 4.767 | 0.952 |
| Tmax | 0.183 | 0.073 | 0.987 | 6.033 | 0.936 | 0.20 | 0.098 | 0.983 | 6.753 | 0.931 |
| WS | 0.317 | 0.216 | 0.963 | 9.609 | 0.890 | 0.27 | 0.207 | 0.963 | 7.922 | 0.907 |
| SR | 0.250 | 0.100 | 0.983 | 8.522 | 0.913 | 0.23 | 0.141 | 0.975 | 8.245 | 0.920 |
| SS | 0.160 | 0.058 | 0.990 | 5.462 | 0.944 | 0.16 | 0.081 | 0.986 | 5.058 | 0.946 |
| RHI | 0.153 | 0.054 | 0.991 | 4.962 | 0.947 | 0.14 | 0.072 | 0.987 | 4.628 | 0.951 |
| RHII | 0.126 | 0.040 | 0.993 | 4.338 | 0.956 | 0.14 | 0.130 | 0.977 | 4.760 | 0.952 |


During the testing period of the sensitivity analysis, WS likewise emerged as the most influential input variable for ET_{0} forecasting, followed by SR, *T*max, RHII, SS, RHI, and *T*min, in decreasing order of influence. Therefore, it may be concluded that WS is the most influential input variable, and its omission reduces the performance of the developed ANN model M11 in forecasting ET_{0} during both the training and testing periods. Overall, the ANN model M11 outperformed all the ANN- and KNN-based models, and therefore model M11 may be used to forecast daily ET_{0} at the IARI, New Delhi.
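The leave-one-variable-out procedure behind Table 4 can be sketched as below. The data are synthetic and the large weight given to WS is an assumption made purely so that omitting WS visibly degrades the fit; the sketch illustrates the mechanics of the analysis rather than reproducing the study's result.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
names = ["Tmin", "Tmax", "WS", "SR", "SS", "RHI", "RHII"]
X = rng.uniform(size=(400, 7))
weights = np.array([0.2, 0.6, 1.5, 1.0, 0.5, 0.4, 0.3])  # WS dominates (assumed)
y = X @ weights

def fit_mae(Xs):
    """Fit the (inputs, 11, 1) ANN on the first 300 rows, return test MAE."""
    split = 300
    m = MLPRegressor(hidden_layer_sizes=(11,), activation="logistic",
                     solver="lbfgs", max_iter=2000, random_state=0)
    m.fit(Xs[:split], y[:split])
    return mean_absolute_error(y[split:], m.predict(Xs[split:]))

baseline = fit_mae(X)
for i, name in enumerate(names):
    mae = fit_mae(np.delete(X, i, axis=1))  # omit one input variable at a time
    print(f"omit {name:4s}  MAE = {mae:.3f}  (baseline {baseline:.3f})")
```

On this toy setup, dropping the heavily weighted WS column raises the error the most, which is the same qualitative signal used in Table 4 to rank the input variables.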

### Comparison with previous studies

Many researchers have employed different machine learning algorithms to forecast ET_{0}. In many cases, a single algorithm performed better, and in others, meta-heuristic algorithms outperformed single algorithms. For example, the deep neural network (DNN) model was superior in forecasting daily ET_{0} from a single input meteorological variable compared to the random forest (RF) and Extreme Gradient Boosting (XGBoost) algorithms in California (Ravindran *et al.* 2021). Improved machine learning algorithms such as the SVM-cuckoo algorithm (SVM-CA) outperformed genetic programming (GP), the model tree (M5T), and the adaptive neuro-fuzzy inference system (ANFIS) in forecasting ET_{0} at Ranichuri, Uttarakhand, India (Ehteram *et al.* 2019). In another study, an ANN model with the Levenberg–Marquardt training algorithm, a single hidden layer, and a nine-neuron configuration was found to be the best for ET_{0} forecasting in Dehradun, India (Nema *et al.* 2017). These findings align with the results of the present study, which showed that all the meteorological variables are essential to forecast ET_{0}. However, Alves *et al.* (2017) reported that air temperature alone could be sufficient to forecast daily ET_{0} accurately using the ANN algorithm under a data-limited scenario.

In comparison to the Hargreaves and FAO-PM methods, the ANN model has been shown to be a reliable choice for estimating ET_{0} using *T*max and *T*min for Salinas, California, United States of America, providing a superior performance index, standard error of estimate, and correlation (Lucas *et al.* 2008). In forecasting ET_{0}, the performance of the KNN model was better than the SVM, GNB, and ANN models, with 92% accuracy, in the Multan region, Pakistan (Hu *et al.* 2022). However, the findings of our study contradict those reported by Hu *et al.* (2022). This could be due to various factors, including agro-climatic conditions, algorithm structure, and the activation function used in the ANN techniques. The selection of the activation function and the learning algorithm influenced the ANN performance. The influence of the ANN activation function and architecture in forecasting daily ET_{0} was assessed for Nissouri Creek, Oxford County, Canada (Walls *et al.* 2020); their results showed that the SGD learning algorithm with 500 training epochs performed better than a combination of the ReLU and root-mean-square-propagation (RMSprop) learning algorithms in forecasting daily ET_{0}.

The ANN-based model showed higher efficacy in modeling daily ET compared to the kNN, RF, SVM, XGBoost, and LSTM ML models, as evidenced by low RMSE (ranging between 18.67 and 21.23) and stronger *r* (ranging between 0.90 and 0.94) values for global cropped lands (Liu *et al.* 2021); the findings of the present study are in line with theirs. The performance of the ANN model for the drought-prone arid and semi-arid regions of Corum, Turkey proved better in ET_{0} estimation than the GPR, SVR, and LSTM approaches under both full and partial dataset conditions (Sattari *et al.* 2021). The efficacy of machine learning models is influenced by data availability. For instance, Yamaç (2021) found that the kNN was superior to the ANN in modeling daily potato crop evapotranspiration under limited meteorological dataset conditions at the Mediterranean Agronomic Institute of Bari (CIHEAM-Bari), Valenzano, Southern Italy; however, under a full dataset scenario, the ANN model gave slightly better statistical indicators than the kNN. Thus, our findings are in line with Yamaç (2021). The performance of the KNN in forecasting daily ET_{0} was assessed in the Ningxia irrigation area, China, under a limited dataset scenario (Feng & Tian 2021). They reported that the KNN-based ET_{0} forecast model with *K* = 3 performed better than models with other nearest-neighbor (*k*) values, and that the model requiring *T*max, *T*min, and RH as input variables had the best accuracy. They concluded that the KNN-based predictions were consistent with the PM model and recommended applying it in semi-arid environments.

The performance of the ANN model is highly dependent on its structure, i.e., the number of input variables, the number of hidden layers, and the number of neurons in the hidden layers. Singh *et al.* (2021) forecasted daily evaporation and reported that an ANN with structure (4-9-1) was better than other structures and outperformed the MLR model. The number of neighbors (*k*) in the KNN algorithm influences its ET_{0} forecasting capability, as reported by Feng & Tian (2021), who found that a KNN-based ET_{0} forecast model requiring *T*max, *T*min, and RH as input variables gave higher forecasting accuracy, with an optimal number of neighbors of 3 (*k* = 3) for ET_{0} forecasting in Ningxia, China. The findings of different algorithms in forecasting ET_{0} vary under different climatic conditions; under the prevailing climatic conditions of the study region, the performance of the ANN models was better than that of the KNN models.

## CONCLUSION

Irrigation water management at the field level requires an accurate estimation of crop water requirements, for which the reference evapotranspiration is a prerequisite. Soft computing techniques like ANN and KNN are convenient alternatives that can be developed using meteorological parameters. Furthermore, comparing advanced soft computing tools with standard methods enables the user to adapt the structure and algorithms to a particular climatic condition. The performance of the best selected ANN model (M11) was superior to that of the KNN models. This study also demonstrated that the hidden layer structure influenced the performance of the ANN model in forecasting daily ET_{0}. The performance indices for model M11 were superior to the others, with MAE, MSE, *r*, MAPE, and *d* values of 0.075, 0.018, 0.997, 2.76%, and 0.974, respectively, during model training and 0.091, 0.053, 0.984, 3.16%, and 0.969, respectively, during model testing. Sensitivity analysis of the best selected model revealed that WS influenced the model's performance and is thus an essential variable for daily ET_{0} forecasting. Therefore, model M11 (7,11,1), with the sigmoid (logistic) activation function and L-BFGS-B learning algorithm, can be recommended for ET_{0} forecasting at the IARI, New Delhi. The conclusions reported here provide irrigation managers and cultivators with fundamental guidance for forecasting ET_{0} for irrigation planning and water resources management in the context of available information.

The present study considered a full meteorological dataset availability condition. However, such a dataset may not be available in all regions, especially in ungauged watersheds. Under such a meteorological data limitation scenario, the developed models may not predict daily ET_{0} accurately enough. Hence, the ANN and KNN techniques need to be tested under varying levels of meteorological data availability. Also, the performance of soft computing techniques in forecasting daily ET_{0} may be significantly influenced by the selection of the algorithm, study location, dataset size, and partitioning of the dataset. Therefore, we suggest evaluating the ANN and KNN techniques under different climatic conditions with varying scenarios of data availability and data partitioning to arrive at concrete conclusions about their application in forecasting daily ET_{0}.

## ACKNOWLEDGEMENTS

The authors are thankful to the Division of Agricultural Physics, ICAR-IARI, New Delhi, for providing meteorological data.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*FAO Irrigation and Drainage Paper No. 56*. Food and Agriculture Organization of the United Nations, Rome. pp.

*Agricultural Water Management* **81**(1–2), 1–22. https://doi.org/10.1016/j.agwat.2005.03.007.

*Staphylococcus lentus* inoculations of plants as a promising strategy used to attenuate chromium toxicity. **14**, 13056.

*Global Science and Technology* **11**(03), 229–240.

(*Vigna mungo* L.) in sub-humid region. *International Journal of Agricultural and Biological Engineering* **4**(4), 50–58.