Accurate and widely applicable estimation of reference evapotranspiration (ET0) from a few meteorological parameters is important for the rational planning of agricultural water resources and the effective management of water in irrigated regions. Meteorological data from southern China were used to calculate ET0 with the standard Penman–Monteith formula, and path analysis was used to identify the core decision variable (hours of sunshine, N) and the limited decision variable (relative humidity, RH). Estimation models based on an artificial neural network and a wavelet neural network were established for the Wuhan and Guangzhou meteorological stations. The statistical performance of the models was positively correlated with the decision contribution rates of the input variables to ET0. The models trained for the Guangzhou station were then used to estimate ET0 for the other stations in southern China, and the estimates were compared with those of the Hargreaves–Samani (HS) and Priestley–Taylor (PT) empirical ET0 models. Error analysis indicated that the root mean square error and the mean absolute per cent error were around 0.32 mm and 5.5%, respectively, with a high coefficient of determination and Nash–Sutcliffe efficiency over 0.9, indicating that these models could be applied in more regions with high accuracy.

INTRODUCTION

Reference evapotranspiration (ET0) is a critical parameter for calculating evapotranspiration and estimating actual crop water requirements and is widely used for choosing an irrigation system, optimizing crop planting structure, and enriching field water balance theory. Determining the characteristics of water consumption for a crop simultaneously could provide a reliable basis for the deployment of farmland water management and the allocation of agricultural water resources (Chen 1995; Temesgen et al. 2005; Trajkovic & Kolakovic 2009b; Chang et al. 2010; Kisi et al. 2012). The estimation and prediction of ET0 are thus very important for developing a reasonable system of field irrigation and for improving agricultural water management.

The Penman–Monteith (PM) equation is recommended by the United Nations Food and Agriculture Organization (FAO) and has been accepted universally for calculating ET0. The FAO defines ET0 for a hypothetical crop with an assumed height of 0.12 m, a surface resistance of 70 s m−1, and an albedo of 0.23, closely resembling the reference crop canopy evapotranspiration of an extensive surface of actively growing and adequately watered green grass of uniform height (Allan et al. 1998). The PM method had higher accuracy and wider applicability than the Hargreaves–Samani and Priestley–Taylor methods, but its application was limited by the difficulty of PM calculation, the required meteorological parameters, and the incompleteness of meteorological data collected by some small weather stations (Tabari & Talaee 2011; Ngongondo et al. 2013).

New estimation models have recently been proposed with the emergence of artificial neural network (ANN) technology (Kisi & Çimen 2009; Adeloye et al. 2012; Baba et al. 2013; Shiri et al. 2015a). The back propagation (BP) neural network model, currently the most mature and popular neural network, provides a powerful fault-tolerant and nonlinear approximation capability for calculation, simulation and estimation. This model has been widely used for calculating and estimating ET0 (Kumar et al. 2002; Cui et al. 2005; Khoob 2008). The most influential meteorological parameters should be chosen as the inputs when a network model is established; recent studies suggest about four parameters (Landeras et al. 2008; Dai et al. 2009; Traore et al. 2010; Shiri et al. 2011; Huo et al. 2012). A concise and universally applicable model, however, cannot be fully achieved with this many parameters, so fewer (one or two) decisive meteorological parameters should be used to estimate ET0, providing a reliable theoretical basis for real-time estimation and application, especially in developing countries and regions that lack instruments and sensors (Shiri et al. 2014, 2015b).

Hence, the development of more efficient ET0 estimating models is now of great importance when only few climatic data are available (Shiri et al. 2013). Many mathematical methods have been used to select the determining parameters, such as regression, correlation, sensitivity and trend analyses (Beven 1979; Huo et al. 2004; Verstraeten et al. 2005; Cao et al. 2007; Nova et al. 2007; Dinpashoh et al. 2011; Espadafor et al. 2011; Zhang et al. 2012; Li et al. 2013; Talaee et al. 2014; Tan et al. 2015). Many meteorological parameters, however, are strongly correlated with ET0 and are not completely independent of each other. Regression equations or empirical formulae with fewer variables, fitted by the least squares method, can thus easily become ineffective and so are less reliable and convincing. Path analysis can separate the direct and indirect effects of the independent variables on the dependent variable, and can identify the most strongly interacting influences among all parameters more effectively than a simple correlation analysis (Tao et al. 2013; Sun et al. 2014; Ambachew et al. 2015; Zhang et al. 2016).

On the other hand, wavelet analysis provides a useful method for decomposing the available observed data in terms of both time and frequency (Daubechies 1990). Partal (2009) used the wavelet transform to decompose and reconstruct climate data for ET0 estimation. Wavelet analysis, however, focuses on determining the required components of each selected climate factor rather than choosing the core parameters from all meteorological factors, which distinguishes it from path analysis. Furthermore, models based on wavelet analysis are difficult to popularize, largely because of the heavy workload and the difficulty of choosing components consistently.

In fact, many studies have analysed the selection of meteorological parameters and the estimation accuracy during the establishment of neural network models for ET0 estimation, but most were suitable only for a limited area and were unable to obtain effective estimates when the study area expanded. An analysis of the universality of ANN-based ET0 estimation is therefore needed. Moreover, the wavelet neural network (WNN) is a newer kind of network that combines the classic ANN with wavelet analysis while retaining the flexibility and learning abilities of the neural network (Zhang & Benveniste 1992; Hsieh et al. 2011; Ong & Zainuddin 2016; Sharma et al. 2016). Several recent studies have shown that coupling wavelet analysis with a neural network can improve the efficiency of the traditional ANN model (Chauhan et al. 2009; Falamarzi et al. 2014; Wang et al. 2015). It is therefore worthwhile to establish a WNN model and compare it with the ANN model for ET0 estimation and wider application.

Daily meteorological data for 1969–2010 from various capital cities in southern China were used as the original data. The study first calculated the reference ET0 values using the PM equations, applied path analysis to the ET0 values and the meteorological parameters, analysed the strength of the interactions between the meteorological parameters, and finally selected a few core decision parameters. Meteorological stations that had the same selected parameters were merged into a group, and the station whose decision parameters were most influential was chosen as the base station of each group. The ANN and WNN models were then established with the selected meteorological parameters: the data from each base station were used to train the neural network models, and the data from the other stations were used for estimation and universal analysis. The estimation accuracy and reliability of these models were analysed and compared with those of some empirical equations in order to provide solid technical support for applying the models.

MATERIALS AND METHODS

Study area and data sources

China is divided into northern and southern regions by the Qinling Mountains and the Huaihe River. The southern region more urgently needs to improve land use and farming productivity due to its higher population density and smaller area of arable land. Meteorological data from various municipalities and capital cities in the south were selected for the study to raise the level of agricultural development in this region (Figure 1). Summer (June–August) meteorological data for 1969–2010 were obtained from the website of the China Meteorological Data Sharing Service System (http://cdc.nmic.cn/home.do). The data included daily hours of sunshine (N, h), average temperature (Tmean, °C), maximum temperature (Tmax, °C), minimum temperature (Tmin, °C), relative humidity (RH, %) and wind speed (U2, m s−1).

Figure 1

Distribution of meteorological stations.

Reference evapotranspiration

The PM equations are based on the energy balance and water vapour diffusion theories and so provide a better theoretical basis and much higher accuracy than other methods for calculating ET0 and take into account both the physiological characteristics of crops and the changes in aerodynamic parameters (Allan et al. 1998). The PM FAO-56 formula, recommended after years of research and improvement, is: 
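$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)} \tag{1}$$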
where ET0 is the reference evapotranspiration calculated by the PM method (mm·d−1), Rn is the net radiation at the crop surface (MJ·m−2·d−1), G is the soil heat-flux density (MJ·m−2·d−1), T is the mean daily air temperature at a height of 2 m (°C), U2 is the wind speed at 2 m (m·s−1), es is the saturation vapour pressure (kPa), ea is the actual vapour pressure (kPa), Δ is the slope of the vapour-pressure curve (kPa·°C−1), and γ is the psychrometric constant (kPa·°C−1).
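As a minimal sketch of how Equation (1) can be evaluated once the intermediate quantities are known, the function below implements the standard FAO-56 form with the symbols defined above. The function name, argument names and the input values are illustrative assumptions and are not taken from the study's data.

```python
def et0_penman_monteith(delta, rn, g, gamma, t_mean, u2, es, ea):
    """Daily reference evapotranspiration (mm/day), FAO-56 Penman-Monteith form.

    delta : slope of the vapour-pressure curve (kPa/degC)
    rn, g : net radiation and soil heat-flux density (MJ m-2 d-1)
    gamma : psychrometric constant (kPa/degC)
    t_mean: mean daily air temperature at 2 m (degC)
    u2    : wind speed at 2 m (m/s)
    es, ea: saturation and actual vapour pressure (kPa)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (hypothetical values, not from the study's stations).
example_et0 = et0_penman_monteith(
    delta=0.246, rn=14.33, g=0.14, gamma=0.0674,
    t_mean=30.2, u2=2.0, es=4.42, ea=2.85,
)
print(f"ET0 = {example_et0:.2f} mm/day")
```

In practice Δ, es, ea and Rn are themselves derived from the six measured variables, which is why the full PM calculation needs all of them.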
In addition, two well-known empirical models were compared against ET0 calculated by the PM method. The Priestley–Taylor method (Priestley & Taylor 1972) is a simplified and adjusted form of the PM method that calculates ET0 without RH and U2. The Hargreaves–Samani method (Hargreaves & Samani 1985) computes ET0 using only temperature data. These two equations are as follows: 
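$$ET_{0PT} = 1.26\,\frac{\Delta}{\Delta + \gamma}\,\frac{R_n - G}{\lambda} \tag{2}$$

$$ET_{0HS} = 0.0023\,\frac{R_a}{\lambda}\,(T_{mean} + 17.8)\,\sqrt{T_{max} - T_{min}} \tag{3}$$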
where ET0PT is the reference evapotranspiration estimated by the Priestley–Taylor method (mm·d−1), ET0HS is the reference evapotranspiration estimated by the Hargreaves–Samani method (mm·d−1), Ra is the daily extraterrestrial radiation (MJ·m−2·d−1), and λ is the latent heat of evaporation (MJ·kg−1).
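The two empirical methods can be sketched in a few lines. The version below assumes the commonly used Priestley–Taylor coefficient of 1.26, the Hargreaves–Samani coefficient of 0.0023, and a latent heat of λ = 2.45 MJ·kg−1; these defaults, the function names and the sample inputs are assumptions for illustration and may differ from the exact settings used in the study.

```python
import math


def et0_priestley_taylor(delta, gamma, rn, g, lam=2.45, alpha=1.26):
    """Priestley-Taylor ET0 (mm/day); needs no humidity or wind speed."""
    return alpha * delta / (delta + gamma) * (rn - g) / lam


def et0_hargreaves_samani(t_mean, t_max, t_min, ra, lam=2.45):
    """Hargreaves-Samani ET0 (mm/day) from temperatures and extraterrestrial radiation."""
    return 0.0023 * (ra / lam) * (t_mean + 17.8) * math.sqrt(t_max - t_min)


# Hypothetical inputs for a warm summer day (not from the study's data).
pt_value = et0_priestley_taylor(delta=0.2, gamma=0.066, rn=15.0, g=0.0)
hs_value = et0_hargreaves_samani(t_mean=25.0, t_max=30.0, t_min=20.0, ra=40.0)
print(f"PT: {pt_value:.2f} mm/day, HS: {hs_value:.2f} mm/day")
```

Ra depends only on latitude and day of year, so the Hargreaves–Samani method needs no measured input other than temperature.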

Path analysis theory

Path analysis was first proposed in 1921 as a mathematical and statistical method by the geneticist Sewall Wright. Nowadays, the method is broadly used in agriculture and energy demand studies to reveal direct or indirect relationships between variables such as morphological characters (Mokhtassi Bidgoli et al. 2006; Yu et al. 2012; Zhang et al. 2016). However, little information is available on the use of this technique to evaluate the factors affecting ET0. Because the meteorological variables are strongly correlated with one another, which leads to multi-collinearity, traditional trend and correlation analyses cannot quantify the interactions among the meteorological factors when filtering the suitable parameters. Path analysis is a standardized partial regression technique that partitions the correlation coefficients into direct and indirect effects, so that the direct effect and contribution of each factor to ET0 can be calculated.

In this study, the single dependent variable y is ET0 and the independent variables xi are the meteorological variables. If ri is the simple correlation coefficient between xi and y, and rij is the simple correlation coefficient between xi and xj, then the canonical equations of path analysis (Stafford & Seiler 1986; Sarawgi et al. 1997) can be written as follows: 
$$r_i = P_i + \sum_{j \neq i} r_{ij}\,P_j \tag{4}$$
where Pi is the direct path coefficient, and represents the direct effects between xi and y; rijPj is the indirect path coefficient, and shows the indirect effects of xi on y through xj. Another important indicator is the decision contribution rate (Rdci=riPi), which expresses the direct contributions of xi on y. Hence, the key decisive parameters with the largest influence on ET0 can be accurately obtained via the above indicators P and Rdc.
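The canonical equations form a linear system in the direct path coefficients, so P can be obtained by solving R·P = r, where R is the correlation matrix of the inputs and r holds their correlations with ET0. The sketch below illustrates this with NumPy on synthetic data; the function name and the demo data are hypothetical and stand in for the station records.

```python
import numpy as np


def path_analysis(X, y):
    """Return (r, P, indirect, Rdc) for predictors X (samples x variables) and response y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_vars = X.shape[1]
    corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)
    r_xx = corr[:n_vars, :n_vars]        # r_ij between predictors
    r_xy = corr[:n_vars, n_vars]         # r_i between each predictor and y
    p = np.linalg.solve(r_xx, r_xy)      # direct path coefficients P_i
    indirect = r_xx * p[np.newaxis, :]   # indirect[i, j] = r_ij * P_j
    np.fill_diagonal(indirect, 0.0)      # the diagonal would be the direct effect
    rdc = r_xy * p                       # decision contribution rate r_i * P_i
    return r_xy, p, indirect, rdc


# Synthetic, correlated predictors (illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
X_demo[:, 1] += 0.6 * X_demo[:, 0]
y_demo = 0.8 * X_demo[:, 0] - 0.3 * X_demo[:, 1] + 0.1 * rng.normal(size=200)

r, p, indirect, rdc = path_analysis(X_demo, y_demo)
print("direct effects:", np.round(p, 4))
print("total indirect effects:", np.round(indirect.sum(axis=1), 4))
print("decision contribution rates:", np.round(rdc, 4))
```

For every variable the direct effect plus the row sum of the indirect effects reproduces its simple correlation with y, which is the decomposition that the tables of path coefficients report.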

Artificial neural network

ANN technology provides an alternative for estimating nonlinear systems including ET0, and the feed-forward BP neural network is the most mature and popular ANN. Many studies have shown that an ANN with one hidden layer can approximate any continuous function to arbitrary precision, and this architecture is generally used for estimating ET0 in practical applications (Kumar et al. 2002; Landeras et al. 2008; Dai et al. 2009; Trajkovic & Kolakovic 2009a; Traore et al. 2010; Huo et al. 2012). The classical architecture of the ANN applied to estimate ET0 is composed of three layers (Figure 2): the input layer, where the meteorological data chosen by the above path analysis are introduced into the model; the hidden layer, where the network learns and processes the data; and the output layer, where estimates of the reference ET0 values calculated by the PM formulae are obtained. The mathematical explanation of this model is given by the equations below. 
formula
5
 
formula
6

where i(1−n) is the ith input layer neuron and n is the number of input layer neurons; xi is the different meteorological factors of the input layer; k(1−m) is the kth hidden layer neuron and m is the number of hidden layer neurons; yk is the input vector of the hidden layer. In addition, P is the calculation output as ET0, and the number of output layer neurons is only 1. f is the transfer functions between the adjacent layers, including the sigmoid function f1(x) and the purelin function f2(x). The upper layer nodes and the lower layer nodes are connected by the weights Wik and Vk, and the thresholds θk and λ.

Figure 2

Basic architecture of ANN for estimating ET0.

Overall, the two steps of forward propagation of a signal and the BP of the error are executed alternately in the BP neural network model using the iterative gradient-descent technique to gradually minimize the quadratic error function defined in Equation (7). Once the target error and training parameters are established, the network model can approximate a nonlinear continuous function with any arbitrary precision by continuously correcting the network weights and thresholds. 
formula
7

in which Pn is the output computed by the ANN model, On is the observed value of ET0 calculated by the PM method, and N is the number of training data sets.
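The alternation of forward propagation and error back propagation can be sketched with NumPy for a network with one hidden layer of three sigmoid neurons and a linear (purelin) output. This is an illustration under stated assumptions, not the MATLAB toolbox implementation used in the study: thresholds are added as biases, the quadratic error is averaged over the samples, training is plain full-batch gradient descent, and the data are synthetic.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_bp(X, target, n_hidden=3, lr=0.1, epochs=2000, seed=0):
    """Train a one-hidden-layer network by gradient descent on 0.5 * mean squared error."""
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W = rng.normal(scale=0.5, size=(n_inputs, n_hidden))  # input-to-hidden weights
    theta = np.zeros(n_hidden)                            # hidden thresholds (as biases)
    V = rng.normal(scale=0.5, size=n_hidden)              # hidden-to-output weights
    lam = 0.0                                             # output threshold (as bias)
    losses = []
    for _ in range(epochs):
        # Forward propagation of the signal.
        hidden = sigmoid(X @ W + theta)
        output = hidden @ V + lam                         # linear output layer
        error = output - target
        losses.append(0.5 * np.mean(error ** 2))
        # Back propagation of the error.
        grad_V = hidden.T @ error / n_samples
        grad_lam = error.mean()
        delta_hidden = np.outer(error, V) * hidden * (1.0 - hidden)
        grad_W = X.T @ delta_hidden / n_samples
        grad_theta = delta_hidden.mean(axis=0)
        # Gradient-descent update of weights and thresholds.
        W -= lr * grad_W
        theta -= lr * grad_theta
        V -= lr * grad_V
        lam -= lr * grad_lam
    return (W, theta, V, lam), losses


# Synthetic two-input problem with inputs scaled to [0, 1] (illustration only).
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(200, 2))
target_demo = 0.6 * X_demo[:, 0] - 0.3 * X_demo[:, 1] + 0.5
params, losses = train_bp(X_demo, target_demo)
print(f"quadratic error: {losses[0]:.5f} -> {losses[-1]:.5f}")
```

Scaling the inputs before training, as done here, keeps the sigmoid units away from saturation and usually makes a small learning rate sufficient.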

Wavelet neural network

WNN is an ANN model constructed based on wavelet analysis. The typical architecture of WNN is shown in Figure 3. Compared with the classical ANNs, the neurons in the hidden layer are named wavelons and the activation function is a wavelet family instead of the conventional sigmoid function (Alexandridis & Zapranis 2013). A family of wavelets can be constructed from a cluster of ‘mother wavelets’ φ(x), which consists of the different ‘daughter wavelets’ φa,b(x) (described in Equation (8)) formed by dilation (a) and translation (b) (Chauhan et al. 2009). In this study, the Gaussian wavelet function defined by Equation (9) was chosen as the activation function, then the mathematical representation of the WNN model is given by Equation (10). 
formula
8
 
formula
9
 
formula
10

The training parameters of WNN models consist of the dilation factors ak, the translation factors bk, and the weight coefficients between the wavelet neurons and the input/output layer wik, vk. These parameters were computed and adjusted during the training process, and the optimized values could be obtained by the same quadratic error function represented in Equation (7).

Figure 3

Basic architecture of WNN for estimating ET0.

Statistical indices

Five statistical indices, root mean square error (RMSE), mean absolute error (MAE), mean absolute per cent error (MAPE), Nash–Sutcliffe efficiency (NSE) and the coefficient of determination (R2), were selected to evaluate the efficiency of the alternative ANN models and these empirical ET0 equations: 
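$$\mathrm{RMSE} = \sqrt{\frac{1}{M}\sum_{i=1}^{M}(E_i - C_i)^2} \tag{11}$$

$$\mathrm{MAE} = \frac{1}{M}\sum_{i=1}^{M}\left|E_i - C_i\right| \tag{12}$$

$$\mathrm{MAPE} = \frac{100}{M}\sum_{i=1}^{M}\left|\frac{E_i - C_i}{C_i}\right| \tag{13}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{M}(E_i - C_i)^2}{\sum_{i=1}^{M}(C_i - \hat{C})^2} \tag{14}$$
and
$$R^2 = \frac{\left[\sum_{i=1}^{M}(E_i - \hat{E})(C_i - \hat{C})\right]^2}{\sum_{i=1}^{M}(E_i - \hat{E})^2\,\sum_{i=1}^{M}(C_i - \hat{C})^2} \tag{15}$$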
where M is the number of observed data points, Ei is ET0 obtained by the estimated models, Ci is ET0 calculated with the PM method, Ê is the average of the data arrays of Ei, and Ĉ is the average of the data arrays of Ci.

In addition, the linear regression equation y = ax + b was introduced to assess the performance of these estimation models, where y is the ET0 value calculated by the PM method, x is the ET0 value estimated by an empirical method or a neural network model, a is the slope and b is the intercept.
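The five indices and the regression line can be computed together. The sketch below assumes the usual definitions of RMSE, MAE, MAPE, NSE and R2, with the PM values as the reference series; the function name and the four-point sample are hypothetical.

```python
import numpy as np


def evaluate(estimated, reference):
    """Error indices of estimated ET0 against PM-calculated ET0."""
    e = np.asarray(estimated, dtype=float)
    c = np.asarray(reference, dtype=float)
    diff = e - c
    # Regression y = a*x + b with y the PM values and x the estimates.
    slope, intercept = np.polyfit(e, c, 1)
    return {
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "MAE": float(np.mean(np.abs(diff))),
        "MAPE": float(100.0 * np.mean(np.abs(diff) / c)),
        "NSE": float(1.0 - np.sum(diff ** 2) / np.sum((c - c.mean()) ** 2)),
        "R2": float(np.corrcoef(e, c)[0, 1] ** 2),
        "slope": float(slope),
        "intercept": float(intercept),
    }


# Hypothetical 5-day mean ET0 values in mm (illustration only).
scores = evaluate(estimated=[1.0, 2.0, 3.0, 4.0], reference=[1.0, 2.0, 3.0, 5.0])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

NSE and R2 can diverge: a model with a systematic bias may still have a high R2 while its NSE drops, which is why both are reported alongside the regression slope and intercept.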

RESULTS AND ANALYSES

Selection of decisive meteorological parameters

The six meteorological factors, average temperature (Tmean), maximum temperature (Tmax), minimum temperature (Tmin), relative humidity (RH), wind speed (U2) and daily hours of sunshine (N), were strongly coupled, so determining the direct influence of each parameter on ET0 was difficult. Path analysis was therefore used to detect the critical parameters. The study first calculated the correlation coefficient (r) between ET0 and each meteorological parameter, then determined the direct and indirect effects of each parameter on ET0 using the canonical equations of path analysis, and finally obtained P and Rdc. The analyses of the meteorological data from the Wuhan and Guangzhou stations are presented in Tables 1 and 2, respectively.

Table 1

Path analysis between meteorological parameters and ET0 in summer at the Wuhan station

P is the direct effect; the columns Tmean to N give the indirect effect through each other parameter, and Total is their sum.

Parameter  r        P        Tmean    Tmax     Tmin     RH       U2       N        Total    Rdc
Tmean      0.7585   0.1020   –        0.0220   0.0769   0.0914   0.0106   0.4556   0.6565   0.0773
Tmax       0.8155   0.0231   0.0971   –        0.0667   0.0951   0.0106   0.5229   0.7924   0.0188
Tmin       0.5391   0.0840   0.0934   0.0183   –        0.0657   0.0131   0.2646   0.4551   0.0453
RH         −0.7441  −0.1439  −0.0648  −0.0152  −0.0384  –        −0.0065  −0.4753  −0.6002  0.1071
U2         0.2438   0.1282   0.0084   0.0019   0.0086   0.0073   –        0.0894   0.1156   0.0313
N          0.9542   0.7360   0.0631   0.0164   0.0302   0.0929   0.0156   –        0.2182   0.7023
Table 2

Path analysis between meteorological parameters and ET0 in summer at the Guangzhou station

P is the direct effect; the columns Tmean to N give the indirect effect through each other parameter, and Total is their sum.

Parameter  r        P        Tmean    Tmax     Tmin     RH       U2       N        Total    Rdc
Tmean      0.7707   0.0088   –        0.0508   0.0539   0.1146   −0.0065  0.5491   0.7618   0.0068
Tmax       0.7899   0.0553   0.0081   –        0.0432   0.1100   −0.0147  0.5880   0.7346   0.0437
Tmin       0.4723   0.0660   0.0072   0.0361   –        0.0763   0.0019   0.2847   0.4062   0.0312
RH         −0.7039  −0.1504  −0.0067  −0.0404  −0.0335  –        0.0047   −0.4775  −0.5535  0.1058
U2         0.0178   0.0867   −0.0007  −0.0094  0.0014   −0.0081  –        −0.0522  −0.0689  0.0015
N          0.9719   0.8216   0.0059   0.0396   0.0229   0.0874   −0.0055  –        0.1502   0.7985

All meteorological parameters except U2 were significantly correlated with ET0 at the Wuhan and Guangzhou stations. The P of N was 0.7360 and 0.8216 and the Rdc of N was 0.7023 and 0.7985 for the Wuhan and Guangzhou stations, respectively, much higher than those for the other parameters. Thus, N was selected as the crucial decisive parameter affecting ET0. The P and Rdc of RH were the highest (in absolute values) among the other five meteorological parameters, and only the P of RH was negative. RH was therefore chosen as the second decision parameter and as the limited decisive variable. These two parameters had the most influence on ET0, which provided a theoretical basis for the subsequent few-parameter estimation model.
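The selection can be checked directly from the published values. The short script below copies r, P and the total indirect effect for the Wuhan station from Table 1, confirms that r equals the direct plus the total indirect effect to within rounding, recomputes Rdc as r × P, and ranks the parameters; the variable names are illustrative.

```python
# (r, P, total indirect effect) for summer at the Wuhan station, copied from Table 1.
wuhan = {
    "Tmean": (0.7585, 0.1020, 0.6565),
    "Tmax": (0.8155, 0.0231, 0.7924),
    "Tmin": (0.5391, 0.0840, 0.4551),
    "RH": (-0.7441, -0.1439, -0.6002),
    "U2": (0.2438, 0.1282, 0.1156),
    "N": (0.9542, 0.7360, 0.2182),
}

residuals = {}
rdc = {}
for name, (r, p, total_indirect) in wuhan.items():
    residuals[name] = r - (p + total_indirect)  # should vanish up to rounding
    rdc[name] = r * p                           # decision contribution rate

ranking = sorted(rdc, key=rdc.get, reverse=True)
for name in ranking:
    print(f"{name:5s} Rdc = {rdc[name]:.4f}  residual = {residuals[name]:+.4f}")
```

The two largest contribution rates belong to N and RH, in that order, which matches the choice of the core and limited decision variables.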

Establishment of neural network models for ET0 estimation

The meteorological parameters Tmean, Tmax, Tmin, RH, U2 and N must all be used to calculate ET0 by the PM equations, but N and RH were the most significant variables identified by path analysis. This study used these two decisive parameters and the calculated ET0 values as input and output factors, respectively, using the neural network toolbox in MATLAB to establish the ANN and WNN models.

The two models with single (N) or double (N and RH) parameters were established and compared using the summer meteorological data and calculated ET0 values for the Wuhan and Guangzhou stations for 1969–2010. The number of nodes in the hidden layer was set to three by trial and error, which kept the models simple and easy to extend without losing information or the regularity of the data. The study used the data for 1969–2003 to train the corresponding parameters and the data for 2004–2010 to validate the models, with network structures of 1 × 3 × 1 and 2 × 3 × 1. Irrigation in actual agricultural production requires the total amount of water consumed over a few days rather than the precise daily consumption, so the daily ET0 was replaced by the 5-day average ET0. The estimated and calculated ET0 were then obtained. The scatterplots are shown in Figure 4, and the statistical error indices for the estimates are listed in Table 3.
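The data preparation described above, 5-day averaging followed by a split into training (1969–2003) and validation (2004–2010) years, might look as follows. How an incomplete block at the end of a season was treated is not stated, so dropping it here is an assumption, as are the function names.

```python
import numpy as np


def five_day_means(daily_values, window=5):
    """Average consecutive non-overlapping blocks; an incomplete final block is dropped."""
    daily = np.asarray(daily_values, dtype=float)
    n_full = daily.size // window
    return daily[: n_full * window].reshape(n_full, window).mean(axis=1)


def split_by_year(years, train_end=2003):
    """Boolean masks for the training (<= train_end) and validation (> train_end) years."""
    years = np.asarray(years)
    return years <= train_end, years > train_end


# Twelve hypothetical daily values give two complete 5-day blocks.
demo_means = five_day_means(np.arange(12))
train_mask, valid_mask = split_by_year(np.array([2002, 2003, 2004, 2010]))
print(demo_means, train_mask, valid_mask)
```

Averaging should be applied within each summer separately so that a block never spans two different years.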

Table 3

Comparison of ET0 estimated errors using the 5-day average at the Wuhan and Guangzhou stations

Station    Input   Cumulative Rdc  RMSE (mm)        MAE (mm)         MAPE (%)         NSE
Wuhan      N       0.7023          0.4082 (0.3933)  0.3243 (0.3095)  6.9087 (6.6338)  0.8362 (0.8479)
Wuhan      N, RH   0.8094          0.3510 (0.2734)  0.2412 (0.2195)  5.4931 (5.0435)  0.9588 (0.9265)
Guangzhou  N       0.7985          0.2872 (0.2804)  0.2350 (0.2292)  5.4150 (5.2944)  0.8834 (0.8889)
Guangzhou  N, RH   0.9043          0.1550 (0.1526)  0.1183 (0.1168)  2.8949 (2.8875)  0.9660 (0.9671)

Note: The figures outside brackets are obtained based on ANN model; the figures inside brackets are all based on WNN model.

Figure 4

Scatter plots between calculated and estimated ET0 using ANN and WNN models. (a) Single-factor ANN model in Wuhan. (b) Single-factor WNN model in Wuhan. (c) Double-factor ANN model in Wuhan. (d) Double-factor WNN model in Wuhan. (e) Single-factor ANN model in Guangzhou. (f) Single-factor WNN model in Guangzhou. (g) Double-factor ANN model in Guangzhou. (h) Double-factor WNN model in Guangzhou.

The ANN models with single (N) or double (N and RH) parameters produced reasonable and effective estimates for the Wuhan and Guangzhou stations (Figure 4 and Table 3). For the single-parameter model for the Wuhan and Guangzhou stations, the RMSE was 0.41 and 0.29 mm, the MAE was 0.32 and 0.24 mm, the MAPE was 6.9 and 5.4%, and the NSE was 0.8362 and 0.8834, respectively. The introduction of RH reduced all error indices and raised the NSE, with RMSEs of 0.35 and 0.16 mm, MAEs of 0.24 and 0.12 mm, MAPEs of 5.5 and 2.9%, and NSEs of 0.9588 and 0.9660 for the Wuhan and Guangzhou stations, respectively. The estimates obtained by the WNN models were very similar to those of the ANN models, and the statistical indices of the WNN models were mostly better. This suggests that using the Gaussian wavelet function as the activation function gave higher accuracy than the basic sigmoid function when the networks were trained and validated at the same station.

In summary, the above results indicated that all models gave good estimates with acceptable accuracy for practical application, and that the double-parameter model greatly improved the estimation accuracy and reliability compared to the single-parameter model. The error statistics decreased and the estimates improved as P and Rdc increased, indicating a positive correlation between estimation accuracy and the decision contribution of the inputs and suggesting that the meteorological parameters N and RH could be used as the inputs for these models. Both the single- and double-parameter neural network models have significant potential for agricultural application, but the universal performance of these models should be studied further.

Universal analysis of the estimation models

Investigating the universality of the estimation models in multiple regions is important for improving the performance of the neural network structures and parameters. P and Rdc were calculated in the path analysis for all selected capital stations in the south (Tables 4 and 5, respectively). The hours of sunshine, N, which had the largest P, reaching 0.60–0.85, was selected as the core variable, and the relative humidity, RH, which was the only parameter with a negative P at every station and had the second largest absolute value among all parameters, was selected as the limited variable. Rdc, however, fluctuated between about 0.60 and 0.81 (Table 5) when the single parameter N was selected as the input variable, so the error oscillation may be larger when applying the models on a larger scale. Rdc increased and stayed within a range of 0.8–0.9 when RH was used as the second input variable, indicating that the double-parameter model was stable and highly credible when applied across all selected stations.

Table 4

Path coefficients between each meteorological parameter and ET0 at the capital stations in southern China

Station    Tmean    Tmax     Tmin     RH       U2       N
Guangzhou  0.0088   0.0553   0.0660   −0.1504  0.0867   0.8216
Nanning    0.0360   0.0686   0.0496   −0.0929  0.0972   0.8302
Kunming    0.0387   0.0536   0.0610   −0.1715  0.0929   0.7940
Haikou     −0.0111  0.1400   0.0507   −0.1435  0.0856   0.7986
Guiyang    0.0032   0.0659   0.1038   −0.1827  0.1310   0.7423
Chongqing  0.0547   0.0043   0.0831   −0.1620  0.1294   0.7435
Fuzhou     −0.0376  0.1019   0.1047   −0.2326  0.1127   0.6880
Changsha   0.0551   0.0282   0.0894   −0.2017  0.1070   0.6825
Hangzhou   0.0355   0.0668   0.0983   −0.2310  0.0779   0.6673
Shanghai   −0.0969  0.1715   0.1558   −0.2787  0.1077   0.6463
Nanchang   0.0668   0.0649   0.0707   −0.1659  0.1060   0.6975
Wuhan      0.1020   0.0231   0.0840   −0.1439  0.1282   0.7360
Table 5

Decision contribution rates (Rdc) between each meteorological parameter and ET0 at the capital stations in southern China

Station    Tmean    Tmax     Tmin     RH       U2       N        N+RH
Guangzhou  0.0068   0.0437   0.0312   0.1058   0.0015   0.7985   0.9044
Nanning    0.0275   0.0567   0.0154   0.0683   0.0080   0.8136   0.8820
Kunming    0.0262   0.0420   0.0082   0.1186   0.0303   0.7599   0.8785
Haikou     −0.0082  0.1037   0.0194   0.1010   0.0044   0.7703   0.8713
Guiyang    0.0024   0.0529   0.0367   0.1456   0.0373   0.7112   0.8567
Chongqing  0.0449   0.0037   0.0486   0.1381   0.0387   0.7109   0.8490
Fuzhou     −0.0296  0.0839   0.0587   0.1867   0.0302   0.6515   0.8382
Changsha   0.0463   0.0238   0.0596   0.1763   0.0276   0.6530   0.8293
Hangzhou   0.0273   0.0547   0.0520   0.1919   0.0193   0.6339   0.8258
Shanghai   −0.0646  0.1232   0.0818   0.2148   0.0265   0.5978   0.8125
Nanchang   0.0543   0.0540   0.0454   0.1398   0.0201   0.6707   0.8105
Wuhan      0.0773   0.0188   0.0453   0.1071   0.0313   0.7023   0.8094

In summary, this study established the ANN and WNN models based on two parameters (N and RH), chose Guangzhou, the station with the highest cumulative Rdc among all stations, as the benchmark station, extracted the corresponding network structure parameters from the models trained there, and applied the data for 2004–2010 from the other stations to verify the models and analyse their universality. The results from the ANN and WNN models are presented in Tables 6 and 7, respectively.
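The choice of benchmark station follows from the cumulative Rdc (N + RH) column of Table 5. As a small illustration, with variable names that are assumptions, the station with the largest value is selected for training and the rest are kept for the universal test:

```python
# Cumulative Rdc of N and RH for each capital station, copied from Table 5.
cumulative_rdc = {
    "Guangzhou": 0.9044, "Nanning": 0.8820, "Kunming": 0.8785, "Haikou": 0.8713,
    "Guiyang": 0.8567, "Chongqing": 0.8490, "Fuzhou": 0.8382, "Changsha": 0.8293,
    "Hangzhou": 0.8258, "Shanghai": 0.8125, "Nanchang": 0.8105, "Wuhan": 0.8094,
}

# The benchmark station supplies the training data.
benchmark = max(cumulative_rdc, key=cumulative_rdc.get)
# All remaining stations are used only to test the trained models.
test_stations = [name for name in cumulative_rdc if name != benchmark]

print("benchmark:", benchmark)
print("tested on:", ", ".join(test_stations))
```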

Table 6

Comparison of ET0 estimated errors based on ANN model at the capital stations in southern China

Station    Cumulative Rdc  RMSE (mm)  MAE (mm)  MAPE (%)  NSE     Linear regression equation  R2
Guangzhou  0.9044          0.1415     0.1121    2.6955    0.9660  y = 0.9854x − 0.0014        0.9717
Nanning    0.8820          0.1383     0.1023    2.4493    0.9657  y = 1.0201x − 0.0976        0.9664
Kunming    0.8785          0.1361     0.1113    3.2502    0.8182  y = 0.8632x + 0.0783        0.9531
Haikou     0.8713          0.1590     0.1239    2.6204    0.9399  y = 1.0861x − 0.2938        0.9615
Guiyang    0.8567          0.1511     0.1192    3.5607    0.8316  y = 0.9297x − 0.0161        0.9627
Chongqing  0.8490          0.1922     0.1423    3.7743    0.9683  y = 1.0624x − 0.2981        0.9734
Fuzhou     0.8382          0.2454     0.1845    3.8932    0.8801  y = 1.1862x − 0.5786        0.9568
Changsha   0.8293          0.2382     0.1844    3.9798    0.9329  y = 1.1785x − 0.7627        0.9608
Hangzhou   0.8258          0.2767     0.2076    4.7596    0.9204  y = 1.1532x − 0.5945        0.9433
Shanghai   0.8125          0.3131     0.2393    5.3582    0.8629  y = 1.1932x − 0.6010        0.9281
Nanchang   0.8105          0.2568     0.2000    4.1421    0.9264  y = 1.1557x − 0.7046        0.9459
Wuhan      0.8094          0.2583     0.2057    4.8053    0.9175  y = 1.0999x − 0.5577        0.9344
Table 7

Comparison of ET0 estimated errors based on WNN model at the capital stations in southern China

Station Cumulative Rdc RMSE (mm) MAE (mm) MAPE (%) NSE Linear regression equation R2 
Guangzhou 0.9044 0.1526 0.1168 2.8875 0.9671 y = 0.9888x − 0.0177 0.9731 
Nanning 0.8820 0.1430 0.1060 2.5419 0.9641 y = 1.0217x − 0.1087 0.9652 
Kunming 0.8785 0.3439 0.3042 8.9030 0.7007 y = 0.8749x + 0.1606 0.9539 
Haikou 0.8713 0.1979 0.1426 2.9153 0.9404 y = 1.0832x − 0.2850 0.9604 
Guiyang 0.8567 0.3245 0.2831 8.5391 0.8282 y = 0.9359x − 0.0418 0.9609 
Chongqing 0.8490 0.2192 0.1629 4.4129 0.9654 y = 1.0765x − 0.3550 0.9720 
Fuzhou 0.8382 0.4271 0.3309 6.2844 0.8691 y = 1.2109x − 0.6815 0.9543 
Changsha 0.8293 0.3890 0.2933 5.7415 0.8954 y = 1.2559x − 1.0575 0.9509 
Hangzhou 0.8258 0.3561 0.2692 5.7362 0.9061 y = 1.1898x − 0.7432 0.9380 
Shanghai 0.8125 0.4500 0.3357 6.6770 0.8515 y = 1.2088x − 0.6632 0.9214 
Nanchang 0.8105 0.3544 0.2663 5.2477 0.8971 y = 1.2107x − 0.9211 0.9330 
Wuhan 0.8094 0.3114 0.2374 5.5929 0.9046 y = 1.1399x − 0.7108 0.9232 

The statistics for the ANN model in Table 6 indicate that, as the cumulative Rdc decreased from 0.90 to 0.81 across the stations, RMSE rose gradually from about 0.14 to 0.31 mm, MAE from about 0.10 to 0.24 mm, and MAPE from about 2.4 to 5.4%. R2 remained within the excellent range of 0.928–0.973, and NSE stayed above 0.80. In addition, the fitted equations and high R2 values for all stations indicated high estimation accuracy and consistent universal performance. These results suggest that the ANN model established for the Guangzhou station from two meteorological parameters (N and RH) had strong regional universality in southern China.
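The indices reported in Table 6 can be reproduced for any pair of series with a short routine. The following is a minimal sketch that uses the standard textbook definitions of RMSE, MAE, MAPE, NSE and R2, which may differ in detail from the formulas the authors applied. The function name `error_indices` is hypothetical, and the regression is assumed to take the PM-calculated ET0 as x and the model estimate as y.

```python
import math


def error_indices(observed, estimated):
    """Return common error statistics for paired ET0 series (mm/day).

    observed  : reference values (assumed here to be PM-calculated ET0)
    estimated : model estimates for the same days
    """
    n = len(observed)
    if n == 0 or n != len(estimated):
        raise ValueError("series must be non-empty and of equal length")

    errors = [e - o for o, e in zip(observed, estimated)]
    sse = sum(d * d for d in errors)

    rmse = math.sqrt(sse / n)
    mae = sum(abs(d) for d in errors) / n
    # MAPE is undefined when an observed value is zero.
    mape = 100.0 * sum(abs(d) / abs(o) for d, o in zip(errors, observed)) / n

    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    sst = sum((o - mean_o) ** 2 for o in observed)
    see = sum((e - mean_e) ** 2 for e in estimated)
    if sst == 0 or see == 0:
        raise ValueError("series must not be constant")

    # Nash-Sutcliffe efficiency: 1 minus error variance over observed variance.
    nse = 1.0 - sse / sst

    # Least-squares line y = slope * x + intercept (x observed, y estimated).
    sxy = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    slope = sxy / sst
    intercept = mean_e - slope * mean_o
    r2 = sxy ** 2 / (sst * see)  # squared Pearson correlation

    return {"rmse": rmse, "mae": mae, "mape": mape, "nse": nse,
            "slope": slope, "intercept": intercept, "r2": r2}
```

Calling `error_indices(pm_et0, ann_et0)` on one station's validation series (both argument names are placeholders) returns the quantities that fill one row of the table. Note that NSE penalizes systematic bias whereas R2 does not, which is why a station can show a high R2 together with a lower NSE.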

Although the WNN model offers greater elasticity and plasticity, its universal estimation results in Table 7 were less satisfactory than those of the traditional ANN model. RMSE exceeded 0.30 mm at two-thirds of the stations, and MAE and MAPE exceeded 0.24 mm and 5.5%, respectively, at roughly half of them. At the Fuzhou and Shanghai stations in particular, RMSE and MAE peaked at around 0.45 and 0.34 mm. Furthermore, for the Kunming and Guiyang stations, MAPE rose to 8.9% and 8.5% and NSE dropped to 0.70 and 0.83. Given that the slope a of the fitted equation was less than 1 and the intercept b was very close to 0 at these two stations, it could be inferred that the WNN model overestimated the ET0 values there, which may be partly attributed to the higher latitude values of these two cities. Nevertheless, the fitted equations and R2 values for all stations remained good, and the other statistical indices fell within a range acceptable for practical universal application.
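The station counts quoted for the WNN model can be checked directly against Table 7. The sketch below transcribes the RMSE, MAE and MAPE columns of that table and counts the stations above each threshold; the helper name `stations_above` is hypothetical and introduced only for illustration.

```python
# (station, RMSE in mm, MAE in mm, MAPE in %) transcribed from Table 7 (WNN model)
TABLE7 = [
    ("Guangzhou", 0.1526, 0.1168, 2.8875),
    ("Nanning",   0.1430, 0.1060, 2.5419),
    ("Kunming",   0.3439, 0.3042, 8.9030),
    ("Haikou",    0.1979, 0.1426, 2.9153),
    ("Guiyang",   0.3245, 0.2831, 8.5391),
    ("Chongqing", 0.2192, 0.1629, 4.4129),
    ("Fuzhou",    0.4271, 0.3309, 6.2844),
    ("Changsha",  0.3890, 0.2933, 5.7415),
    ("Hangzhou",  0.3561, 0.2692, 5.7362),
    ("Shanghai",  0.4500, 0.3357, 6.6770),
    ("Nanchang",  0.3544, 0.2663, 5.2477),
    ("Wuhan",     0.3114, 0.2374, 5.5929),
]

RMSE, MAE, MAPE = 1, 2, 3  # column positions in each row


def stations_above(rows, column, threshold):
    """Return the names of stations whose value in `column` exceeds `threshold`."""
    return [row[0] for row in rows if row[column] > threshold]


if __name__ == "__main__":
    for label, column, threshold in (("RMSE", RMSE, 0.30),
                                     ("MAE", MAE, 0.24),
                                     ("MAPE", MAPE, 5.5)):
        names = stations_above(TABLE7, column, threshold)
        print(f"{label} > {threshold}: {len(names)} of {len(TABLE7)} stations")
```

With the values as transcribed, eight of the twelve stations exceed the RMSE threshold, and seven exceed each of the MAE and MAPE thresholds, which is slightly more than half.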

Comparison of the estimation models with the empirical equations

The validation results from the 12 selected meteorological stations, calculated by the above neural network models, were pooled and compared with some empirical models to further evaluate generalization ability. The overall performance of the ANN and WNN models and the empirical equations for 2004–2010 is shown in Figure 5, and the statistical indices for the comparison are listed in Table 8.

Table 8

Comparison of statistical indices for ET0 estimated with the different models at all capital stations in southern China

Models Inputs RMSE (mm) MAE (mm) MAPE (%) NSE R2 
Hargreaves–Samani Tmean, Tmax, Tmin 0.7927 0.6580 17.1346 0.5039 0.6004 
Priestley–Taylor Tmean, Tmax, Tmin, Rs 0.5593 0.5023 12.4166 0.7530 0.9316 
ANN model N, RH 0.3152 0.2367 5.5969 0.9215 0.9341 
WNN model N, RH 0.3215 0.2374 5.4566 0.9184 0.9353 
Figure 5

Scatter plots of ET0 estimated with the different models at all capital stations in southern China. (a) Hargreaves–Samani. (b) Priestley–Taylor. (c) ANN model. (d) WNN model.

Well-known empirical models for calculating ET0 were generally developed under specific agricultural conditions or with limited climate data, so their accuracy cannot exceed that of the PM method. The Hargreaves–Samani model, which uses only temperature data as inputs, performed worst, with RMSE, MAE, MAPE, NSE and R2 equal to 0.7927 mm, 0.6580 mm, 17.1346%, 0.5039 and 0.6004, respectively. All of these statistical indices improved noticeably when radiation was added as a further input in the Priestley–Taylor model, but the scatter plot in Figure 5 shows that this model overestimated most of the ET0 values, so calibration would be necessary. When the ANN model was used to estimate ET0 from only two meteorological parameters (N and RH) selected by path analysis, performance improved markedly compared with the empirical models, with values of 0.3152 mm, 0.2367 mm, 5.5969%, 0.9215 and 0.9341 for RMSE, MAE, MAPE, NSE and R2, respectively. For the WNN model, MAPE and R2 were slightly better than for the ANN model, but the other criteria were slightly worse. When the estimated results were pooled, no significant difference was found between the accuracy of the two models, as the per cent changes in the corresponding indices were less than 3.0%. Overall, the neural network models with fewer inputs estimated ET0 much more accurately than the empirical equations. The two meteorological parameters identified by path analysis, N and RH, proved to be the most crucial factors for ET0 estimation in southern China.
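The statement that the two neural network models differ by less than 3.0% on every index can be verified from Table 8. The sketch below computes the per cent change of each WNN index relative to the ANN value. Taking the ANN model as the baseline is an assumption, since the text does not state which model served as the reference, and `percent_change` is a hypothetical helper name.

```python
# Pooled indices for the two neural network models, transcribed from Table 8.
TABLE8 = {
    "ANN": {"RMSE": 0.3152, "MAE": 0.2367, "MAPE": 5.5969, "NSE": 0.9215, "R2": 0.9341},
    "WNN": {"RMSE": 0.3215, "MAE": 0.2374, "MAPE": 5.4566, "NSE": 0.9184, "R2": 0.9353},
}


def percent_change(reference, other):
    """Signed per cent change of `other` relative to `reference`."""
    return 100.0 * (other - reference) / reference


# Assumption: the ANN model is the baseline for the comparison.
CHANGES = {
    index: percent_change(TABLE8["ANN"][index], TABLE8["WNN"][index])
    for index in TABLE8["ANN"]
}

if __name__ == "__main__":
    for index, change in CHANGES.items():
        print(f"{index}: {change:+.2f}%")
```

The largest absolute change is in MAPE, at about 2.5%, followed by RMSE at about 2.0%; the remaining indices change by less than 0.5%, consistent with the 3.0% bound quoted above.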

DISCUSSION AND CONCLUSION

This study calculated ET0 using the PM equation from summer meteorological data for 1969–2010 at 12 capital stations in southern China and determined the decisive variables N and RH using path analysis. ANN and WNN models for estimating ET0 were then established, and their accuracy and reliability were evaluated against actual production needs. Finally, the universal performance of the neural network models was analysed and compared with some empirical equations across all stations in southern China. The following main conclusions were drawn:

  1. The path analysis identified N and RH as the two core meteorological parameters with the largest influence on ET0. N had a positive influence on ET0 and was selected as the core decisive variable, and RH had a negative influence on ET0 and was selected as the limited decisive variable.

  2. The single-parameter (N) and the double-parameter (N and RH) neural network models based on the path theory estimated ET0 accurately for the Wuhan and Guangzhou stations. The cumulative decision contribution rates to ET0 were positively correlated with the error statistical indicators, demonstrating the robustness and reliability of these estimation models.

  3. The double-parameter (N and RH) ANN and WNN models had the highest P and Rdc and the best estimation accuracy at the Guangzhou station. This local model also had higher accuracy and more consistent reliability than some empirical models when applied to other stations in southern China, confirming that this model had significant potential in agricultural applications.

In summary, the neural network estimation models with few parameters, built on path analysis theory, performed well, with high accuracy, consistent reliability, and robust universality. Path analysis thus provided a scientific basis for choosing the decisive parameters. With only two meteorological parameters (N and RH), these models could be applied directly to estimate ET0 for actual production, whether or not full meteorological data were available for a given region in southern China. Moreover, when path analysis is applied at a large scale, it is useful to group stations that share the same decisive parameters before making further universal estimates. These concise neural network models with fewer variables have higher potential and promotional value for actual production than the empirical models, not only near the large capital cities but also in smaller neighbouring areas.

ACKNOWLEDGEMENTS

This study was supported by the National Natural Science Foundation of China (51279167), the National Science & Technology Pillar Program during the 12th Five-year Plan Period (2012BAD08B01), and the Non-profit Industry Financial Program of the Ministry of Water Resources (201301016).
