An effective evapotranspiration estimation scheme based on statistical indicators for sustainable environments in humid and semi-arid area of Khyber Pakhtunkhwa, Pakistan

Reference evapotranspiration (ETo) is critical for irrigation design and water management in rainfed and irrigated agriculture. The Penman-Monteith (FAO-56(PM)) equation has been shown to be the most reliable method and adapts to a wide range of climates, from humid to semi-arid. However, it requires several environmental parameters (e.g., wind speed, solar radiation) that are rarely available in developing countries. Numerous temperature-based formulas have therefore been designed for various environments, and calibrating and validating them against the local climate frequently improves their performance. We revised the Hargreaves exponent (E_H), substituting a value of (0.16) for the original value (0.5). The modified Hargreaves formula improves the ETo predictions, with a mean absolute error ranging from (0.791) mm per day for Balakot to (2.36) mm per day in Risalpur, averaging (3.797) mm per day, as compared to (16.827) mm per day for Hargreaves-Samani. In general, all the selected models showed high accuracy. However, the modified Hargreaves equation produced the most promising results: based on the standard error of estimate for ETo in Khyber Pakhtunkhwa, it ranked first in (50%) of the study area. Additional research is needed to determine the study's relevance to other regions.


INTRODUCTION
Water scarcity is a significant challenge in humid, semi-arid, and desert environments. Rapid population growth, global climate change, and drought add to the global water and food security challenges. According to the United Nations World Water Development Report 2019: Leaving No One Behind (Water, 2019), agriculture is the largest consumer of water resources, using around 70% of the available global freshwater. Thus, sustained measures should be taken to address water and food security concerns through suitable adaptation strategies and plans. Reference evapotranspiration (ETo) estimates are frequently used in hydrology, environmental studies, climate change research, and irrigation and drainage design. ETo has been used to coordinate water crisis management strategies, agricultural water conservation plans, and water resource operations. A hydrologic assessment is essential for irrigation administration and for developing an intelligent irrigation system anywhere on earth. Accurate estimates of reference or crop evapotranspiration (ETo, ETc) allow for the efficient use of available water resources, such as water stored for the dry season. To calculate ETc, ETo is calculated first and then multiplied by the crop coefficient Kc proposed in (Allen et al., 1998) and (Doorenbos et al., 1997). In practice, the essential classical techniques for assessing ET, such as the eddy correlation system, soil water balance, and Bowen ratio, are only available at the field scale (Tao et al., 2018). The ETo value can be determined explicitly or implicitly using statistical models. However, some drawbacks restrict their use as fixed predictive models, particularly across spatial scales (Kumar et al., 2012; Jing et al., 2019). Apart from that, field systems are expensive, time-consuming, and challenging to maintain.
ETo is also difficult to quantify because it is determined by the interaction of several meteorological factors, including humidity, temperature, radiation, and wind speed (Tao et al., 2018). The Penman-Monteith (FAO-56(PM)) equation was recommended by the International Commission on Irrigation and Drainage and the Food and Agriculture Organization of the United Nations as the standard protocol for estimating reference evapotranspiration and for assessing the performance of other formulas (Allen et al., 1998). Likewise, the FAO-56(PM) method is the preferred way of estimating ETo accurately if lysimeter data are unavailable (Irmak et al., 2003; Utset et al., 2004; Gavilán et al., 2006; Nandagiri & Kovoor, 2006; Trajkovic, 2007). The FAO-56(PM) estimation method uses relative humidity, air temperature, solar radiation, and wind speed. Numerous researchers have demonstrated that estimating irrigation water requirements is critical for the proper operation of a sophisticated irrigation process in various places around the world (Allen et al., 1998; Tabari & Talaee, 2011; Thepadia & Martinez, 2012; Tabari et al., 2013; Çakir et al., 2017; Gul et al., 2021). In addition, ETo estimates are frequently used to develop, supervise, plan, and manage water resources in hydrology, agricultural water management, and irrigation and drainage engineering (Gul et al., 2021; Valipour, 2014, 2015). On the other hand, only a small proportion of weather stations record a comprehensive set of the essential parameters, and where data for the key parameters are missing, ETo cannot be estimated using FAO-56(PM). Moreover, depending on the weather, location, and local conditions of the area, certain factors are more influential than others in computing ETo (Debnath et al., 2015; Gupta et al., 2020).
Several authors argue that it is vital to identify the climatic parameters to which ETo is most sensitive, so that measurements of those variables can be used to develop simple empirical ETo models (Koudahe et al., 2018; Djaman et al., 2019). Classifying and reducing the impact of climate change also involves identifying the sensitive variables (Nouri et al., 2017; Chatterjee et al., 2021).
Several studies have verified that the temperature-based Hargreaves-Samani (HS) formula can reliably estimate evapotranspiration for periods of five days or longer (Jensen et al., 1997; Droogers & Allen, 2002; Hargreaves & Allen, 2003). They recommended using the HS equation when the solar radiation, relative humidity, and wind speed data needed to solve the FAO-56(PM) equation are limited. However, the HS equation usually overestimates ETo at humid locations (Jensen et al., 1990; Itenfisu et al., 2003; Temesgen et al., 2005; Trajkovic, 2007; Azhar et al., 2014; Valipour, 2017). Likewise, various investigations have indicated that it significantly underestimates ETo in drylands and semi-arid territories with diverse ecosystems (Khoob, 2008; Benli et al., 2010; Azhar & Perera, 2011). Droogers & Allen (2002) explored recalibrating the HS equation coefficients with new values [C_H = 0.0030, E_H = 0.4, HT = 20, and C_H = 0.0025, E_H = 0.5, HT = 16.8], but the performance of the HS equation did not improve significantly. Samani (2000) recommended correlating C_H with the average temperature range. Xu & Singh (2002), however, demonstrated that Samani's modification did not affect consistency. Consequently, updating the HS equation coefficients according to the local environment is an alternative way to enhance its estimates.
In particular, Amatya et al. (1995) developed a monthly modified C_H for three weather stations in Eastern North Carolina. Analogously, a single value (C_H = 0.0029) was recommended for Rawson Lake station in northwest Ontario, Canada, by (Xu & Singh, 2002). A value of (C_H = 0.0020) was proposed for non-windy areas in the semi-arid middle Ebro River Valley region of Spain by (Martínez-Cob & Tejero-Juste, 2004). For each weather station in Andalusia, Southern Spain, a linear relationship was reported between the modified C_H and the ratio of the mean temperature to the daily temperature range (Vanderlinden et al., 2004).
Furthermore, an HS model adjusted by simple linear regression was found to overestimate the FAO-56(PM) values (Trajkovic, 2005). In semi-arid southern Spain, the HS formula was validated using data from 86 monitoring stations by comparing daily estimates to those of FAO-56(PM) (Gavilán et al., 2006). According to that study, regional calibration should be based solely on temperature and wind conditions. Additionally, since most ETo models were calibrated and validated in temperate environments, local calibration and validation of ETo models are more critical in semi-arid and arid regions than in temperate climate zones (DehghaniSanij et al., 2004; Mohawesh & Talozi, 2012). These findings motivated the local calibration of the HS equation through adjustment of its exponent. Researchers generally use well-known ETo estimation methods because they trust them to provide excellent results in diverse regions. However, the accuracy of these methods is highly sensitive to climate. There are also methods to predict air pollution that support environmental protection work and improve the efficiency of environmental monitoring (Li et al., 2018; Wang et al., 2019; Lu et al., 2020).
Numerous regions of the eastern Mediterranean, including Pakistan, are semi-arid to arid. Irrigation consumes a disproportionate share of these countries' total water budgets (Mohawesh & Talozi, 2012). In these locations, meteorological data are frequently insufficient or unavailable. As a result, it is crucial to evaluate the accuracy of techniques for calculating ETo from sparse data, including HS equations, in arid and semi-arid climates. Several earlier investigators have also evaluated the performance of ETo estimation approaches in different parts of Pakistan using radiation- and temperature-based methods (Rasul & Mahmood, 2009; Naheed & Rasul, 2010; Zahid & Rasul, 2012; Kumar et al., 2013; Azhar et al., 2014).
The main contribution of this paper is as follows: we explore the calibration of the HS equation's exponent for the weather stations of Khyber Pakhtunkhwa, Pakistan. We compare the accuracy of five widely used radiation- and temperature-based procedures (Hargreaves-Samani, modified Hargreaves, Turc, Makkink, and Priestley-Taylor) with the standard FAO-56(PM) under Pakistan's climatic conditions and identify the most appropriate method for approximating this essential agro-climatic and agro-hydrological parameter in Khyber Pakhtunkhwa. A comparative analysis using Q-Q plots, correlation plots, and histograms of the standard model, the radiation-based procedures, and the proposed calibrated temperature-based method MHargreaves is also discussed in detail. The rest of this paper is organized as follows. The next section briefly describes the selected models and the methodology used to estimate reference evapotranspiration. The third section explains the calibration procedure, the statistical indicators used for performance assessment, and the study area and data. The performance analysis of each station is presented in the fourth section. The overall discussion and conclusions are given in the fifth and sixth sections, respectively.

Selected Models for Estimation of Evapotranspiration
In this paper, we evaluated the performance of three radiation-based methods (Priestley-Taylor, Makkink, and Turc) and two temperature-based methods, Hargreaves-Samani (HS) and modified Hargreaves (MHargreaves). The FAO-56(PM) is used as the benchmark for the Hargreaves-Samani calibration. The performance of each method was assessed using statistical indicators: the Standard Error of Estimate (SEE), Coefficient of Residual Mass (CRM), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE). These estimation methods are commonly used in irrigation engineering. In the FAO-56(PM) method, ETo is computed following (Allen et al., 1998).
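For reference, the FAO-56 Penman-Monteith equation is usually written in the following standard form (Allen et al., 1998). The symbol names follow common FAO-56 usage and are restated here as an aid; they are an assumption about notation rather than a transcription of the typeset equation.

```latex
ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}
            {\Delta + \gamma\,(1 + 0.34\,u_2)}
```

Here ET_o is in mm per day, R_n is the net radiation and G the soil heat flux density (both MJ m⁻² day⁻¹), T is the mean daily air temperature at 2 m height (°C), u_2 is the wind speed at 2 m height (m s⁻¹), e_s and e_a are the saturation and actual vapour pressures (kPa), Δ is the slope of the saturation vapour pressure curve (kPa °C⁻¹), and γ is the psychrometric constant (kPa °C⁻¹).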

The Hargreaves-Samani Equation
The Hargreaves-Samani equation can be used as an alternative method, written as:
$$ET_{HS} = C_H \, R_a \, (T_{max} - T_{min})^{E_H} \left( \frac{T_{max} + T_{min}}{2} + HT \right)$$
where ET_HS = ET estimated using the HS equation above (mm per day); R_a = extraterrestrial radiation (mm per day); T_max = daily or mean monthly maximum air temperature (°C); and T_min = daily or mean monthly minimum air temperature (°C). The empirical parameters were set to (C_H = 0.0023, E_H = 0.5, and HT = 17.8) by (Hargreaves, 1994).
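The following is a minimal sketch of the Hargreaves-Samani form ET_HS = C_H · R_a · (T_max − T_min)^E_H · (T_mean + HT), with the commonly cited defaults C_H = 0.0023, E_H = 0.5, and HT = 17.8. The function and argument names are illustrative assumptions, and R_a is assumed to be expressed as equivalent evaporation in mm per day.

```python
def hargreaves_samani(ra, tmax, tmin, c_h=0.0023, e_h=0.5, ht=17.8):
    """Estimate ETo (mm/day) from temperature and extraterrestrial radiation.

    ra   : extraterrestrial radiation as equivalent evaporation (mm/day)
    tmax : maximum air temperature (deg C)
    tmin : minimum air temperature (deg C)
    c_h, e_h, ht : empirical coefficient, exponent, and temperature offset
    """
    if tmax < tmin:
        raise ValueError("tmax must not be smaller than tmin")
    t_mean = (tmax + tmin) / 2.0
    return c_h * ra * (tmax - tmin) ** e_h * (t_mean + ht)


# Illustrative inputs only (not station data from the study).
et_default = hargreaves_samani(ra=15.0, tmax=32.0, tmin=16.0)
et_recalibrated = hargreaves_samani(ra=15.0, tmax=32.0, tmin=16.0, e_h=0.16)
print(et_default, et_recalibrated)
```

Exposing the exponent as a keyword argument makes it easy to compare the original exponent with a locally calibrated one on the same inputs.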
The modified Hargreaves-Samani working formula was developed for grass conditions (Hargreaves, 1975; Hargreaves & Samani, 1985). It requires air temperature and solar radiation data.

The Priestley-Taylor (PT) model
The PT model is a simplified version of the original Penman combination equation (Penman, 1948; Priestley & Taylor, 1972). It was originally intended for use in wet locations (Priestley & Taylor, 1972; Jensen et al., 1990). In the PT equation, α = 1.26 and the other parameters are as defined in Eq. (1) above.
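For reference, the Priestley-Taylor model is commonly written in the form below. This is the widely used textbook form, given here as an assumption about the notation; λ denotes the latent heat of vaporization of water (MJ kg⁻¹) and the remaining symbols are those of the FAO-56 Penman-Monteith equation.

```latex
ET_{PT} = \alpha \, \frac{\Delta}{\Delta + \gamma} \, \frac{R_n - G}{\lambda},
\qquad \alpha = 1.26
```

The model keeps only the radiation term of the Penman combination equation and absorbs the aerodynamic term into the constant α, which is why it needs no wind speed or humidity data.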

Makkink formula
Makkink proposed an equation for the Netherlands as an alternative to Penman by comparing the Penman model with lysimetric data. The Makkink equation is now well known and accepted in Western Europe (Makkink, 1957; Hargreaves & Allen, 2003), and (Amatya et al., 1995) also applied it successfully in America. In the Makkink equation, R_s is the solar radiation [MJ m⁻² day⁻¹], γ is the psychrometric constant [kPa °C⁻¹], Δ is the slope of the saturation vapor pressure-temperature curve [kPa °C⁻¹], and λ is the latent heat of vaporization of water [MJ kg⁻¹].
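A minimal sketch of the Makkink model in its commonly cited form, ET = 0.61 · Δ/(Δ + γ) · R_s/λ − 0.12, is given below. The slope Δ is computed with the FAO-56 expression for the saturation vapour pressure curve; the default values γ = 0.066 kPa °C⁻¹ and λ = 2.45 MJ kg⁻¹, as well as the function names, are assumptions chosen for illustration.

```python
import math


def slope_svp(t_mean):
    """Slope of the saturation vapour pressure curve (kPa/deg C), FAO-56 form."""
    es = 0.6108 * math.exp(17.27 * t_mean / (t_mean + 237.3))  # kPa
    return 4098.0 * es / (t_mean + 237.3) ** 2


def makkink(rs, t_mean, gamma=0.066, lam=2.45):
    """Makkink ETo (mm/day) from solar radiation rs (MJ m-2 day-1).

    gamma : psychrometric constant (kPa/deg C), assumed near-sea-level value
    lam   : latent heat of vaporization of water (MJ/kg)
    """
    delta = slope_svp(t_mean)
    return 0.61 * delta / (delta + gamma) * rs / lam - 0.12


# Illustrative inputs only (not station data from the study).
print(makkink(rs=20.0, t_mean=20.0))
```

For stations at high elevation, γ should be recomputed from the local atmospheric pressure rather than left at the default.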

The Turc method
Turc recommended a simple equation for estimating ETo using only solar radiation, relative humidity, and mean temperature. The model was initially developed for Western Europe (Turc, 1961). Jensen et al. (1990) also reported that the Turc model is appropriate and reliable in humid conditions. The equation takes one form when RH < 50% and another when RH > 50%, where RH is the relative humidity, T_mean is the mean temperature, and R_s is the solar radiation.
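The two humidity cases of the Turc model can be sketched as follows, using the form in which it is commonly cited: ET = 0.013 · T/(T + 15) · (R_s + 50), multiplied by (1 + (50 − RH)/70) when RH is below 50%. The classical formula takes R_s in cal cm⁻² day⁻¹, so the sketch converts from MJ m⁻² day⁻¹; the conversion constant, the function name, and the treatment of RH exactly equal to 50% are assumptions.

```python
# Approximate conversion from MJ m-2 day-1 to cal cm-2 day-1 (4.1868 J per cal).
CAL_CM2_PER_MJ_M2 = 23.8846


def turc(t_mean, rs, rh):
    """Turc ETo (mm/day).

    t_mean : mean air temperature (deg C)
    rs     : solar radiation (MJ m-2 day-1)
    rh     : mean relative humidity (%)
    """
    base = 0.013 * t_mean / (t_mean + 15.0) * (CAL_CM2_PER_MJ_M2 * rs + 50.0)
    if rh < 50.0:
        # Dry-air correction applied only below 50% relative humidity.
        return base * (1.0 + (50.0 - rh) / 70.0)
    return base


# Illustrative inputs only (not station data from the study).
print(turc(t_mean=20.0, rs=20.0, rh=60.0))
print(turc(t_mean=20.0, rs=20.0, rh=36.0))
```

The correction factor equals 1 at RH = 50%, so the two branches join continuously at the threshold.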

Calibration procedure for modified Hargreaves Equation
Several well-known studies have calibrated the Hargreaves equation using the same design (Amatya et al., 1995; Gavilán et al., 2006; Trajkovic, 2007; Subburayan et al., 2011). The same approach was therefore adopted for calibrating the Hargreaves equation in the present paper: following the literature, the Hargreaves equation (3) (ET_HS) is set equal to FAO-56(PM), and the exponent value (0.5) is replaced by an unknown (d) to be determined from the estimated data. Hence, a new modified Hargreaves equation with a recalibrated exponent was obtained for Khyber Pakhtunkhwa and rewritten in a form whose terms C, X, and Y were calculated from a ten-year data set of mean monthly maximum and minimum air temperature, maximum and minimum relative humidity, wind speed, and solar radiation. The appropriate value of (d) was determined using meteorological data for 2000-2009. The exponent d = 0.16 was found to be suitable for all stations, so the modified Hargreaves equation uses the exponent 0.16 in place of 0.5.
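One way such a calibration could be carried out is sketched below. It assumes, as an illustration rather than as the procedure used in the study, that the exponent is fitted by least squares on the log-transformed equation: setting ET_HS equal to the reference ETo and taking logarithms gives ln(ET_ref / (C_H · R_a · (T_mean + HT))) = d · ln(T_max − T_min), a straight line through the origin with slope d. The function name and the synthetic inputs are hypothetical.

```python
import math


def fit_hargreaves_exponent(et_ref, ra, tmax, tmin, c_h=0.0023, ht=17.8):
    """Least-squares estimate of the exponent d in
    ET = c_h * ra * (tmax - tmin)**d * (t_mean + ht),
    fitted in log space as a line through the origin (an assumed design).
    """
    sxy = 0.0
    sxx = 0.0
    for et, r, hi, lo in zip(et_ref, ra, tmax, tmin):
        t_mean = (hi + lo) / 2.0
        x = math.log(hi - lo)                         # ln of temperature range
        y = math.log(et / (c_h * r * (t_mean + ht)))  # ln of scaled reference ETo
        sxy += x * y
        sxx += x * x
    return sxy / sxx


# Synthetic check: build reference values with a known exponent and recover it.
ra = [8.0, 10.0, 13.0, 16.0]
tmax = [18.0, 24.0, 31.0, 36.0]
tmin = [5.0, 9.0, 17.0, 22.0]
et_ref = [
    0.0023 * r * (hi - lo) ** 0.16 * ((hi + lo) / 2.0 + 17.8)
    for r, hi, lo in zip(ra, tmax, tmin)
]
d_hat = fit_hargreaves_exponent(et_ref, ra, tmax, tmin)
print(d_hat)
```

With real station data the reference values would come from FAO-56(PM), and the fitted exponent would then be checked against independent error measures such as MAE or RMSE.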

Study area and data description
Eight weather stations located in the Khyber Pakhtunkhwa region were selected for this study: Balakot, Cherat, Chitral, Kakul, Parachinar, Peshawar, Risalpur, and Saidusharif (Gul et al., 2018). The economy of Khyber Pakhtunkhwa province is based on agriculture, even though the mountainous topography is unfavorable to extensive agronomy. About one-third of the province's cultivated land is irrigated, and the major crops are wheat, corn (maize), sugarcane, and tobacco. The mountain ranges experience cold winters and moderate summers, and temperatures rise significantly to the south. Precipitation over the province is variable but averages roughly (400 mm) annually, with a considerable share occurring from January to April (Khan, 2010). The northern mountain slopes are scenic, with five rivers, attractive landscapes, exotic valleys, and dense pine forest. The mean monthly air temperature, wind speed, and relative humidity data were collected at these stations under standard grass height and well-irrigated conditions. The ten-year meteorological data (2000-2009) were obtained from the Pakistan Meteorological Department, Islamabad (Gul et al., 2020). The weather stations are described in Table 1 and displayed graphically in Figure 1.

Statistical Analysis of collected Meteorological Data
The basic statistics of the collected climatic parameters (precipitation, air temperature, wind speed, relative humidity, and solar radiation) are shown in Table 2. For Balakot, the mean annual maximum and minimum temperatures are 25.70 °C and 11.65 °C, while the maximum and minimum relative humidity are 70.65% and 66.34%, respectively. The mean annual wind speed (U_2) is 1.24 meters per second, and the mean annual solar radiation is 18.06. The mean annual extra-terrestrial radiation is 30.27 [MJ m⁻² day⁻¹]. Furthermore, the minimum mean annual precipitation, 15.33 mm, was observed at Chitral, while the maximum, (121.86) mm, was recorded at Balakot station. Variances, skewness, and kurtosis are also shown in Table 2 to convey more information about the density and distribution shape of the study parameters. The statistics for the remaining study locations are likewise shown in Table 2.
Tables 3 and 4 show the error statistics of the five reference evapotranspiration methods in Pakistan's KP region. Similarly, Figures 3 and 4 present the Pearson product-moment correlation coefficient r and the performance of the different methods for estimating reference evapotranspiration at each station, as discussed in the following subsections.

Balakot Station
The CRM indicator was used to test for over- or underestimation by each of the five selected models at Balakot station, and the results are given in Table 3. The coefficient of residual mass in Column 4 indicates whether a given ETo model over- or underestimates. It is noticeable that Priestley-Taylor, HS, and modified Hargreaves overestimated the monthly mean ETo. From the output in Figure 3, the correlation coefficient between our proposed calibrated MHargreaves and FAO-56(PM) is 91%, indicating a strong correlation and encouraging the use of modified Hargreaves in data-deficient regions for the station concerned. Similarly, from the output in Figure 3, the correlation coefficient between our proposed calibrated MHargreaves and FAO-56(PM) is 90%, which again indicates a strong correlation. It encourages the use of the modified Hargreaves equation.

Kakul Station
In Kakul, three models (Priestley-Taylor, HS, and modified Hargreaves) have negative CRM values (−0.08, −2.04, −0.28) mm per day, meaning that they overestimate ETo. The MAE values for HS and modified Hargreaves are (6.20, 0.87) mm per day, respectively, indicating the much better performance of modified Hargreaves compared with HS. Priestley-Taylor, Makkink, and modified Hargreaves showed low RMSE (0.50, 0.51, 1.02) mm per day versus HS and Turc with (6.7, 3.04) mm per day, respectively. If sufficient data are available, radiation-based methods are suitable. Kakul station is humid; therefore, the proposed modified Hargreaves is an appropriate ETo estimation model. According to the output in Figure 3, the correlation between the standard FAO-56(PM) equation and Makkink is 99%, showing a very strong correlation. The proposed calibrated MHargreaves, in turn, shows a correlation of 94% with the standard FAO-56(PM). These results support the use of the proposed calibrated MHargreaves equation in the study region.
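The error indicators quoted in the station results can be computed along the following lines. This is a sketch under stated assumptions: CRM is taken as (ΣO − ΣP)/ΣO, so that a negative value signals overestimation as in the text, and SEE is taken as the square root of the squared residuals divided by n − 1. The exact definitions used in the study are not reproduced here, and the function names and sample values are hypothetical.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def crm(obs, pred):
    """Coefficient of residual mass; negative means the model overestimates."""
    return (sum(obs) - sum(pred)) / sum(obs)


def see(obs, pred):
    """Standard error of estimate with n - 1 in the denominator (assumed form)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / (len(obs) - 1))


# Hypothetical values: obs stands for the benchmark, pred for a candidate model.
obs = [1.0, 2.0, 3.0]
pred = [2.0, 2.0, 4.0]
print(mae(obs, pred), rmse(obs, pred), crm(obs, pred), see(obs, pred))
```

Ranking the candidate models at a station then amounts to sorting them by the chosen indicator, for example by SEE in ascending order.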

OVERALL DISCUSSIONS
Various studies have been conducted in different regions of the globe to identify the optimal ETo model (Farzanpour et al., 2019; Jing et al., 2019; Khodayvandie et al., 2020; Salam et al., 2020; Goh et al., 2021; Sharafi & Ghaleni, 2021; Trajkovic & Gocic, 2021; Vishwakarma et al., 2022). Tables 3 and 4 report the error statistics for the five models. The modified Hargreaves equation showed satisfactory results for ETo compared with the standard FAO-56(PM). However, the Hargreaves-Samani equation overestimated ETo at almost all stations, with CRM values ranging from (−1.03) mm per day at Cherat to (−2.04) mm per day in Kakul. Moreover, the SEE estimates for the different stations, including radiation-based and temperature-based models, are shown graphically in Figure 4. The average monthly ETo determined over ten years in the study area using FAO-56(PM), HS, and MHargreaves can be seen in Figure 5. The results indicate that HS overestimates ETo in all locations, whereas the modified Hargreaves equation estimates are very close to the FAO-56(PM) estimates.
Additionally, the overall steps of the modelling and calibration are shown in the data flow chart in Figure 2. The graphical representation and comparative analysis of the modified Hargreaves equation presented in Figures 3 and 4 show that the results of the proposed method are much more reliable than those of HS and of radiation-based models such as Turc, Makkink, and PT. In these figures, the correlation between FAO-56(PM) and the other models is shown in the upper off-diagonal panels, the histograms and probability density curves for the selected models are given on the diagonal, and the lower off-diagonal panels show the scatter plots for the different ETo estimation models. The ranking of the equations is based on SEE values, and the best-fitting reference evapotranspiration models according to the statistical indicators are shown in Table 4.
Furthermore, the SEE estimates for the different stations, interpolated using Bayesian Kriging, are shown graphically in Figure 6. The ETo estimates from the standard FAO-56(PM) method, Hargreaves-Samani (HS), and modified Hargreaves are compared in Figure 5. Table 5 shows the SEE-based ranking of our proposed ESS Scheme. According to the results provided in the table, at Balakot, Priestley-Taylor …
Figures 3 and 4 show the histogram, correlation coefficient, and Q-Q plot of the standard model FAO-56(PM) versus the other reference evapotranspiration models during the whole study period in the entire KP region. The correlation coefficient between the benchmark method FAO-56(PM) and modified Hargreaves is high at all stations, indicating a strong correlation between MHargreaves and the standard method. Similarly, the scatter plots for the modified Hargreaves and FAO-56(PM) are provided in Figure 7. Finally, the graphical abstract of the proposed study is provided in Figure 8. These results strongly suggest that our proposed MHargreaves equation can be recommended as an alternative temperature-based method for ETo estimation in data-deficient regions like Khyber Pakhtunkhwa, Pakistan.