Improving methods that estimate reference evapotranspiration (ET0) from few climatic inputs is crucial, because climatic records are partially or completely unavailable in many situations. This paper compares the effect of local and external training procedures on neuro-fuzzy and neural network models for estimating ET0, relying on two input combinations and assessed through k-fold testing. Different data set configurations were therefore defined based on temporal and spatial criteria, allowing a complete and suitable testing scan of the whole data set. The proposed methodology enabled the comparison, at each station, of models trained with local data series and models trained with the data series from the remaining stations. Results showed that external training based on a suitable input choice and a representative pattern collection might be a valid alternative to the more common local training.
INTRODUCTION
Accurate estimation of evapotranspiration (ET), the process of water loss to the atmosphere by the combined processes of evaporation and transpiration, is essential for the computation of crop water requirements, water resources management, water balance analysis, selection of the crop pattern of agricultural lands, modeling of crop water production functions, and determination of the water budget, especially under arid conditions, where water resources are scarce and fresh water is a limited resource. ET can be quantified directly by relatively high-cost aerodynamic and Bowen ratio (energy balance) methods, or by lysimeters based on a water balance in a controlled crop area (Allen et al. 1998). Notable research has been carried out so far on the physical laws governing the ET phenomenon on an analytical basis, leading to the evolution of some basic ET concepts (Katerji & Rana 2011).
The reference ET (ET0) concept was introduced by the Food and Agriculture Organization of the United Nations (FAO) as a methodology for computing crop evapotranspiration regardless of crop type, stage of development and management (Doorenbos & Pruitt 1977), because the interdependence of the factors affecting ET makes the study of the evaporative demand of the atmosphere difficult. According to Allen et al. (1998), ET0 is the evapotranspiration from a hypothetical crop with a height of 0.12 m, an albedo of 0.23, and a fixed surface resistance of 70 s/m. On this basis, the Penman–Monteith model has been adopted as the reference equation for estimating ET0 and for calibrating other ET0 equations (Allen et al. 1998) in the absence of experimental ET0 values.
According to Landeras et al. (2008), the adapted Penman–Monteith equation (hereafter referred to as FAO56-PM) has two important advantages: (1) it can be applied in a great variety of environments and climate scenarios without local calibration; and (2) it has been validated using lysimeters under a wide range of climatic conditions. Nevertheless, the need for a large number of meteorological variables (e.g., air temperature, relative humidity, solar radiation, and wind speed) is a major disadvantage of the FAO56-PM equation.
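For reference, the daily FAO56-PM equation (Allen et al. 1998) is

ET0 = [0.408 Δ (Rn − G) + γ (900/(T + 273)) u2 (es − ea)] / [Δ + γ (1 + 0.34 u2)]

where ET0 is the reference evapotranspiration (mm/day), Rn the net radiation at the crop surface (MJ/m2d), G the soil heat flux density (MJ/m2d), T the mean daily air temperature at 2 m height (°C), u2 the wind speed at 2 m height (m/s), es and ea the saturation and actual vapour pressures (kPa), Δ the slope of the vapour pressure curve (kPa/°C), and γ the psychrometric constant (kPa/°C).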
During recent years, artificial intelligence (AI) approaches (e.g., artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), etc.) have been widely applied to water resources engineering problems. Broad evidence has shown that AI techniques can be successfully applied in modeling water resources engineering components (e.g., ASCE 2000; Cigizoglu & Kisi 2005, 2006; Kisi 2008; Shiri & Kisi 2011a, b; Kim et al. 2012; Kisi & Shiri 2012; Landeras et al. 2012; Shiri et al. 2012). ANNs, in particular, have been widely applied for estimating evaporation as well as ET0 values (e.g., Kim et al. 2009, 2012; Kumar et al. 2011).
Cigizoglu & Kisi (2005) predicted daily streamflows with three back-propagation techniques using k-fold partitioning of the neural network training data, and showed that similar flow prediction performance could be obtained with a data period much shorter than the whole training duration. Cigizoglu & Kisi (2006) used k-fold partitioning in suspended sediment estimation and reported that similar or even superior estimation performance can be obtained with quite limited data, provided that the statistics of the training subset are close to those of the testing data.
ANFIS is a combination of an adaptive neural network and a fuzzy inference system (FIS). An adaptive neural network is a superset of all kinds of feed-forward neural networks (Jang 1993). The parameters of the FIS are determined by the ANN learning algorithms. Since the system is based on an FIS, which encodes extensive knowledge, an important property is that it remains interpretable in terms of fuzzy IF-THEN rules. ANFIS is capable of approximating any real continuous function on a compact set (Jang et al. 1997).
Kisi (2006) investigated the ability of the ANFIS technique to improve the accuracy of daily pan evaporation estimation. Kisi & Ozturk (2007) used the ANFIS computing technique for ET0 estimation. Shiri et al. (2011) compared ANFIS to ANNs for modeling daily pan evaporation values. Kisi et al. (2012) developed and validated a generalized neuro-fuzzy-based model for estimating daily pan evaporation values using weather data. Pour Ali Baba et al. (2013) applied an ANFIS model for estimating ET0 using available and estimated climatic variables. Kim et al. (2013) applied ANN and ANFIS techniques for modeling daily pan evaporation values using different lag-time patterns to analyze the effect of the data time series on modeling accuracy.
The assessment of AI models for ET0 estimation is usually based on a single assignment of the data sets required for the application of the training algorithm and for testing. As stated by Martí et al. (2011a), this evaluation of model performance, based on a single and limited test pattern collection, might be misleading or only partially valid. In order to perform a suitable assessment of an AI model, leave-one-out-based data set configurations (Stone 1974; Shao 1993; Mehrotra & Sharma 2005; Hrachowitz et al. 2010) offer a good solution. With these configurations, a complete testing scan of the data set can be fulfilled based on a previously defined minimum test set size or a maximum number of train–test stages. The most rigorous approach would be to leave a single pattern for testing (leave-one-out). However, this would involve very high computational costs, so a larger test size is usually considered (k-fold testing). Through the k-fold test (where k refers to the number of necessary train–test stages) the data set is scanned according to the established test size or, conversely, according to the defined number of train–test stages. In addition to the single data set assignment, most AI approaches dealing with ET0 estimation consider a local calibration of the models, i.e., models are trained and tested using data from the same stations. Only a few studies have tackled the external performance of an AI model, i.e., when the test patterns belong to a station not considered for training (Kisi 2007; Martí & Gasque 2010; Martí et al. 2010, 2011a, b; Shiri et al. 2011, 2013a; Kisi et al. 2012; Pour Ali Baba et al. 2013). However, these studies considered neither leave-one-out procedures nor k-fold testing.
Recently, Shiri et al. (2013b) compared local and external training procedures of gene expression programming (GEP) models for estimating pan evaporation values at six stations in the USA using various input configurations based on physical/empirical pan evaporation estimation models. Results suggested that external training might be a valid alternative to local training if the models were fed with a suitable combination of inputs. This work aims to apply that data management scenario using ANFIS and ANN models for estimating ET0 in a different climatic context, Iran. Therefore, two input configurations were defined and the data set was split into several training and testing configurations according to temporal and spatial criteria.
MATERIALS AND METHODS
Data set and input combinations
In the present paper, daily weather data from five weather stations in Iran were used for modeling ET0. The data sample consisted of daily maximum and minimum air temperatures (Tmax and Tmin, respectively), relative humidity (RH), wind speed (SW) at 2 m above the land surface, and solar radiation (RS), covering a period of 6 years (from January 2003 to December 2008). Complete information on the studied weather stations, as well as the average values of the considered meteorological parameters, is given in Table 1. Table 2 sums up the temporal variations of the ET0 values in the studied stations. Both tables show the spatial and temporal variations of the ET0 values (in terms of global averages and standard deviations). Due to the absence of experimental measurements, ET0 values calculated with the FAO56-PM equation were used as the targets for the ANFIS/ANN implementation and evaluation, which is an accepted and very common practice in this situation, in agreement with the FAO recommendation (Allen et al. 1998). The input combinations used to feed the ANFIS/ANN models in the present paper are:
(1) Tmax, Tmin, Tmean, Ra (ANFIS1, ANN1)
(2) Tmean, RS, RH (ANFIS2, ANN2)
Table 1. Location and average values of the meteorological parameters of the studied weather stations

| Station | φ (°N) | τ (°E) | z (m) | Tmean (°C) | ΔT (°C) | RH (%) | SW (m/s) | RS (MJ/m2d) | ET0 (mm/day) |
|---|---|---|---|---|---|---|---|---|---|
| Bojnurd | 37.28 | 57.16 | 1,112 | 13.5 | 12.9 | 87.9 | 2.4 | 16.7 | 3.1 |
| Quazvin | 36.15 | 50.3 | 1,279.2 | 14.5 | 14.4 | 78.3 | 1.9 | 18.3 | 3.8 |
| Shiraz | 29.32 | 52.36 | 1,484 | 18.4 | 16 | 60.1 | 2.2 | 20.3 | 4.1 |
| Tehran | 35.41 | 51.19 | 1,190.8 | 18.6 | 10 | 73.8 | 3.2 | 15.3 | 3.4 |
| Zanjan | 36.41 | 48.29 | 1,665 | 11.5 | 14 | 63.1 | 3.2 | 16.6 | 3.1 |
| Average | | | | 15.3 | 13.46 | 72.64 | 2.58 | 17.44 | 3.5 |
| Standard deviation | | | | 3.11 | 2.23 | 11.34 | 0.59 | 1.92 | 0.44 |
φ: latitude; τ: longitude; z: altitude with respect to sea level; Tmean: daily mean air temperature; ΔT: difference between maximum and minimum air temperature; RH: relative humidity; SW: wind speed at 2 m height above ground surface; RS: solar radiation; ET0: daily FAO56-PM evapotranspiration.
Table 2. Annual ET0 (mm/year) in the studied stations

| Year | Bojnurd | Quazvin | Shiraz | Tehran | Zanjan |
|---|---|---|---|---|---|
| 2003 | 1,013.13 | 1,338.25 | 1,499.43 | 1,204.93 | 1,441.52 |
| 2004 | 1,134.51 | 1,341.50 | 1,526.98 | 1,220.14 | 1,437.47 |
| 2005 | 1,008.16 | 1,357.14 | 1,473.85 | 1,216.15 | 1,438.52 |
| 2006 | 1,181.49 | 1,409.29 | 1,487.29 | 1,271.38 | 1,418.93 |
| 2007 | 1,120.45 | 1,331.19 | 1,466.74 | 1,231.44 | 1,351.84 |
| 2008 | 1,130.24 | 1,450.56 | 1,535.58 | 1,302.59 | 1,460.82 |
| Average | 1,097.99 | 1,371.32 | 1,498.31 | 1,241.10 | 1,424.85 |
| Standard deviation | 70.88 | 47.98 | 28.03 | 37.84 | 38.16 |
The results are presented for two periods: (a) the complete study period (including the annual results of the applied models); and (b) the warmest period (May to August). Period (b) was selected because the performance of the models during the warmest period of the year is crucial from the irrigation and water allocation points of view.
Adaptive neuro-fuzzy inference system (ANFIS)
There are two main approaches for FISs, namely the approach of Mamdani (Mamdani & Assilian 1975) and the approach of Sugeno (Takagi & Sugeno 1985). The neuro-fuzzy model used in this study implements Sugeno's fuzzy approach to obtain the values of the output variable from those of the input variables. In the implementation of fuzzy logic, several types of membership functions (MFs) (the curves that define how each pattern in the input variables is mapped to a degree of membership between 0 and 1) can be used. However, recent studies have shown that the type of MF does not affect the results fundamentally (Vernieuwe et al. 2005). In the present study, triangular MFs were used, as they are common in practical applications (Russel & Campbell 1996). The number of MFs was determined iteratively; based on trial and error, two or three triangular MFs per input were sufficient for establishing the models. A large number of MFs per input variable should be avoided to save time and computational effort (Keskin et al. 2004). The hybrid optimization method (a combination of least squares and back-propagation gradient descent) was used to train the membership function parameters on the calibration (training) data. Grid partitioning was applied to identify the Sugeno FIS structure mapping the nonlinear relationship between the input and output variables; this method partitions each antecedent variable independently by defining membership functions over its full range. Here, the input variables of the ANFIS models are the weather data and the output corresponds to the ET0 values. The ANFIS approaches were implemented using MATLAB. Figure 1 illustrates a schematic representation of the ANFIS model for ET0 estimation using weather data.
As a simple example, consider a FIS with two inputs x1 and x2 and one output y. Here, x1 and x2 might correspond, for instance, to the mean air temperature Tmean and the solar radiation RS, while the output y would represent the reference evapotranspiration (ET0). Suppose that the rule base contains two fuzzy IF-THEN rules.
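In the standard first-order Sugeno formulation on which ANFIS is based (Jang 1993), such a pair of rules takes the form

Rule 1: IF x1 is A1 and x2 is B1, THEN f1 = p1 x1 + q1 x2 + r1
Rule 2: IF x1 is A2 and x2 is B2, THEN f2 = p2 x1 + q2 x2 + r2

and the overall output is the weighted average of the rule consequents,

y = (w1 f1 + w2 f2)/(w1 + w2),

where A1, A2, B1 and B2 are fuzzy sets (defined here by triangular MFs), w1 and w2 are the firing strengths of the rules (obtained from the membership degrees of x1 and x2), and the consequent parameters p, q and r are adjusted by the least-squares part of the hybrid training algorithm.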
Artificial neural networks
ANNs are basically parallel information-processing systems. The internal architecture of an ANN resembles the structure of a biological brain, with a number of layers of fully interconnected nodes or neurons. Each neuron is connected to other neurons by means of direct communication links, each with an associated weight. The network usually has two or more layers of neurons in order to process nonlinear signals. The input layer admits the incoming information, which is processed by the hidden layer(s), and the output layer presents the network result. During the learning process, the weights of the interconnections and the neuron biases are adjusted iteratively to minimize the errors. Figure 2 illustrates a schematic ANN structure. Multilayer feed-forward networks with radial basis functions were used for modeling ET0. The basic details and concepts of the working of an ANN can be found in Bishop (1995) or Haykin (1999).
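The paper does not specify the exact network configuration. As an illustration only, a minimal radial-basis-function network of the kind described above could be built as follows, assuming Gaussian basis functions, randomly selected fixed centres and a least-squares output layer (the function names and parameters are illustrative, not the authors' MATLAB implementation):

```python
import numpy as np

def _rbf_design_matrix(X, centres, sigma):
    """Gaussian activation of every pattern w.r.t. every centre, plus a bias column."""
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    H = np.exp(-(d ** 2) / (2.0 * sigma ** 2))
    return np.hstack([H, np.ones((len(X), 1))])

def train_rbf(X, y, n_centres=20, sigma=1.0, seed=0):
    """Fit an RBF network: fixed Gaussian hidden layer + linear output layer."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_centres, replace=False)]
    H = _rbf_design_matrix(X, centres, sigma)
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)   # least-squares output weights
    return centres, sigma, weights

def predict_rbf(X, centres, sigma, weights):
    return _rbf_design_matrix(X, centres, sigma) @ weights
```

In this sketch, X would hold the selected climatic inputs (e.g., Tmean, RS, RH) and y the FAO56-PM ET0 targets.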
K-fold testing
To perform a suitable assessment of model performance, the data set was scanned in several successive train–test stages, ensuring that all the patterns were tested. The assignment of the required training and test set configurations was based on the k-fold testing approach, defining in advance a minimum acceptable test set size. Additionally, the test set was defined according to two different criteria: a spatial/external (S) criterion and a temporal/local (T) criterion. The S-test was performed as follows: in each stage, the data set of one station was reserved for testing, whereas the data sets of the remaining four stations were used for the application of the training algorithm. Thus, the S-testing involved five train–test stages (five-fold testing), one per test station. In this case, 7,702 patterns were available for training, while 2,192 patterns were used for testing. On the other hand, a T-test was performed per station: defining a minimum test size of 1 year, in each stage of the T-test the annual data of one station were reserved for testing, while the other 5 years of that station were used for training. Accordingly, six train–test stages (six-fold test), one per year, were necessary to fulfill the temporal leave-one-out per station. In each stage, 1,825 patterns were available for training, while 365 patterns were used for testing. Thus, 30 (5 × 6) train–test stages were required to perform the T-test approach.
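As an illustration of this data management scenario, the following sketch generates the spatial (S) and temporal (T) train–test splits described above, assuming the daily records are held in a pandas DataFrame with hypothetical 'station' and 'year' columns (the column and function names are illustrative):

```python
import pandas as pd

def spatial_folds(df: pd.DataFrame):
    """S-test: one fold per station; test on that station, train on the other four."""
    for station in df["station"].unique():
        test = df[df["station"] == station]
        train = df[df["station"] != station]
        yield station, train, test

def temporal_folds(df: pd.DataFrame):
    """T-test: for each station, one fold per year; test on that year,
    train on the station's remaining years."""
    for station in df["station"].unique():
        local = df[df["station"] == station]
        for year in sorted(local["year"].unique()):
            test = local[local["year"] == year]
            train = local[local["year"] != year]
            yield station, year, train, test
```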
Statistical indices
Four statistical evaluation parameters were used to assess the models' performances: the mean absolute error (MAE), the scatter index (SI), the Nash–Sutcliffe efficiency coefficient (NS), and the coefficient of determination (r2).
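As a sketch, assuming the usual definitions of these indices (in particular, SI taken as the root mean square error normalized by the mean of the FAO56-PM targets), they can be computed as follows:

```python
import numpy as np

def performance_indices(obs, est):
    """MAE, SI, NS and r2 between FAO56-PM targets (obs) and model estimates (est)."""
    obs, est = np.asarray(obs, dtype=float), np.asarray(est, dtype=float)
    mae = np.mean(np.abs(est - obs))                      # mean absolute error (mm/day)
    rmse = np.sqrt(np.mean((est - obs) ** 2))
    si = rmse / obs.mean()                                # scatter index (assumed RMSE / mean observed)
    ns = 1.0 - np.sum((est - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)  # Nash-Sutcliffe efficiency
    r2 = np.corrcoef(obs, est)[0, 1] ** 2                 # coefficient of determination
    return {"MAE": mae, "SI": si, "NS": ns, "r2": r2}
```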
RESULTS AND DISCUSSION
Table 3 presents the global average statistical parameters (for the annual and seasonal periods) obtained with the two neuro-fuzzy and the two neural network approaches using local (T) and external (S) training, respectively. For instance, the parameters of the first row correspond to the average of the six-fold temporal test (one per year) over the five stations (i.e., 30 models). Similarly, the second row presents the parameters of the five-fold spatial test (one per station), i.e., the predicted vectors are put together and the statistical indices are calculated for the global estimation vector. The first row of the second subsection (average seasonal results) corresponds to the average of the six-fold temporal test (one per year) over the five stations, during the warmest period of each year.
Table 3. Global average performance statistics of the ANFIS and ANN models with local and external training (annual and seasonal periods)

| Model | Inputs | Training | MAE (mm/day) | SI | NS | r2 |
|---|---|---|---|---|---|---|
| Average annual results | | | | | | |
| Input combination (i) | | | | | | |
| ANFIS1 | Tmean, Tmax, Tmin, Ra | Local | 0.35 | 0.12 | 0.945 | 0.947 |
| | | External | 0.47 | 0.18 | 0.920 | 0.945 |
| ANN1 | Tmean, Tmax, Tmin, Ra | Local | 0.35 | 0.13 | 0.942 | 0.954 |
| | | External | 0.45 | 0.17 | 0.917 | 0.945 |
| Input combination (ii) | | | | | | |
| ANFIS2 | Tmean, RS, RH | Local | 0.30 | 0.10 | 0.959 | 0.957 |
| | | External | 0.39 | 0.15 | 0.940 | 0.950 |
| ANN2 | Tmean, RS, RH | Local | 0.32 | 0.11 | 0.958 | 0.954 |
| | | External | 0.39 | 0.14 | 0.831 | 0.951 |
| Average seasonal results | | | | | | |
| Input combination (i) | | | | | | |
| ANFIS1 | Tmean, Tmax, Tmin, Ra | Local | 0.43 | 0.09 | 0.943 | 0.743 |
| | | External | 0.66 | 0.14 | 0.892 | 0.710 |
| ANN1 | Tmean, Tmax, Tmin, Ra | Local | 0.45 | 0.07 | 0.956 | 0.744 |
| | | External | 0.55 | 0.12 | 0.915 | 0.692 |
| Input combination (ii) | | | | | | |
| ANFIS2 | Tmean, RS, RH | Local | 0.37 | 0.08 | 0.958 | 0.748 |
| | | External | 0.57 | 0.12 | 0.911 | 0.745 |
| ANN2 | Tmean, RS, RH | Local | 0.38 | 0.06 | 0.960 | 0.736 |
| | | External | 0.41 | 0.10 | 0.948 | 0.745 |
As could be expected, the most accurate estimations correspond to the locally trained ANFIS2 and ANN2, followed by those of the locally trained ANFIS1 and ANN1, the externally trained ANFIS2 and ANN2, and the externally trained ANFIS1 and ANN1, respectively. Thus, according to the SI values, the consideration of RH and RS improves the global average accuracy by 0.02 in the locally trained models and by 0.03 in the externally trained models. The combined effect of RH and RS on ET0 is analyzed per station below. The accuracy decrease, in SI terms, derived from considering external training instead of local training is 0.06 for ANFIS1, 0.04 for ANN1, 0.05 for ANFIS2, and 0.03 for ANN2, respectively. Thus, the ANFIS2 and ANN2 models (RS–RH-based models) might allow for a higher spatial generalizability, due to a more suitable input–output mapping. However, the most interesting results emerge when comparing the external ANFIS2/ANN2 and the local ANFIS1/ANN1 estimations (SI of 0.15 vs. 0.12 for ANFIS and 0.14 vs. 0.14 for ANN, respectively). This result reveals that it might be preferable to train external ANFIS2/ANN2 models rather than local ANFIS1/ANN1 models, because the accuracy of the prediction might be only slightly worse, if at all. Moreover, externally trained models remove the need for data availability at the test station to train a local model, which might be a decisive advantage. The accuracy differences between locally and externally trained models seem to decrease with a suitable input selection allowing for an optimum input–output mapping. The same conclusions can be drawn on the basis of the other performance parameters.
The results of the models during the warmest period (May–August) present a similar trend: the locally trained ANFIS2/ANN2 models are the most accurate, followed by the locally trained ANFIS1/ANN1 models, while the externally trained ANFIS2/ANN2 and ANFIS1/ANN1 models are ranked third and fourth (based on their accuracy), respectively. Similarly to the global average annual results, the SI difference between the externally trained ANFIS2/ANN2 models and the locally trained ANFIS1/ANN1 models is 0.03, and the same conclusions might be drawn as for the annual results.
Figures 3 and 4 present the performance parameters of the ANFIS and ANN models split up per test station for the total and warmest periods, respectively. In each station, the results of the local training (T-parameters) correspond to the global six-fold temporal testing, whereas the S-parameters correspond to each individual stage of the five-fold spatial testing. This analysis allows evaluation of whether locally calibrated models are always more accurate than externally calibrated ones or whether this depends on the specific climatic conditions of each station. Regarding the temperature-based models (ANFIS1/ANN1), local calibration is always more accurate, except in stations 2 and 4, where both local and external approaches present similar accuracy. In these stations, despite the local training, there might be considerable fluctuations in the variables' ranges throughout the studied years; or, conversely, the input–output mapping encountered for the external training stations is also suitable for the testing station patterns. In stations 1, 3, and 5 the testing patterns might be too different from the training patterns in the externally/spatially calibrated models, and the encountered input–output mapping might not generalize to those testing stations for that input combination. The SI differences between the T- and S-approaches in these stations are 0.08, 0.08, and 0.13 (for ANFIS1) and 0.04, 0.03, and 0.09 (for ANN1), respectively. Regarding the ANFIS2/ANN2 models, the performance parameters present a qualitatively similar trend, i.e., the T-approaches are more accurate. However, the performance of the S-models is more accurate and less fluctuating than for ANFIS1/ANN1 (average SI of 0.125 vs. 0.16 for ANFIS, and 0.12 vs. 0.14 for ANN, respectively), excluding station 5, which presents, comparatively, a very high error (SI of 0.23 vs. 0.28, and 0.21 vs. 0.28 for the ANFIS and ANN models, respectively). According to these results, the consideration of RH and RS in addition to temperature improves the generalizability in stations 1, 3, 4, and 5, due to a more suitable input–output mapping. Further, station 5 might present climatic patterns or variables' ranges different from the other stations because of its clearly higher altitude, and therefore sufficiently accurate estimations are not achieved there. The risk that the input–output mapping defined from the training patterns is not valid for the testing patterns increases as the number (and/or the significance) of the considered inputs decreases. Nevertheless, the consideration of a wider and more representative training set might be enough to deal with this limitation. Regarding the local approaches, the consideration of RH and RS in addition to Tmean only involves an improvement in stations 1 and 2 (SI lower by 0.04 and 0.05 for ANFIS, and by 0.06 and 0.04 for ANN, respectively). Thus, the local performance of the models in stations 3, 4, and 5 is unaffected by RH and RS, or, conversely, the temperature-based models already allow for a suitable mapping between the training and testing patterns.
Comparing the performance of the T-ANFIS1/T-ANN1 and S-ANFIS2/S-ANN2 estimations per station, Figures 3 and 4 show that the locally trained models are more accurate in stations 3 (by 0.03 SI), 4 (by 0.01 SI), and 5 (by 0.08 SI), whereas the externally trained models are more accurate in station 2 (by 0.01 SI). In station 1 (Bojnurd), both approaches present the same SI. Leaving aside station 5 (Zanjan), where the inaccuracy of the estimation might be addressed with a more representative training set, these results suggest that it might be preferable to externally train a model relying on a suitable input combination than to train a local model: the estimation accuracy might be similar or only slightly worse, and local data series in the test stations would not be required for training. Finally, in stations where RH and RS might be more significant for the estimation of ET0, the accuracy decrease of the S-approaches in comparison to the T-approaches might be even lower.
Tables 4–7 present the performance parameters of the locally trained ANFIS and ANN models split up per station and test year. Considering the SI values of Bojnurd in these tables, a high variability can be stated throughout the considered period: ANFIS1 performance ranges between 0.05 (2007 and 2008) and 0.24 (2003), ANFIS2 between 0.03 (2008) and 0.17 (2003), ANN1 between 0.05 (2008) and 0.32 (2004), and ANN2 between 0.05 (2008) and 0.15 (2003). Similarly, the SI performance of ANFIS1 in Quazvin ranges between 0.12 (2005) and 0.2 (2003). This variability can be linked to the relationships/differences between the training and testing patterns. The key point in this regard is that the consideration of a single data set assignment for testing and for the application of the training algorithm would not have allowed a suitable assessment of the model performance, leading to only partially valid conclusions. Therefore, the application of this single data set assignment procedure, which is a very common practice, should be questioned. A leave-one-out procedure, or at least k-fold testing if the computational cost of the former is not affordable, can be a good choice to address this limitation. Regarding the performance in Shiraz, Tehran, and Zanjan, a lower variability can be stated throughout the considered period. As mentioned, this might be caused by a lower climatic variability within the considered years.
Table 4. Performance parameters of the locally trained ANFIS1 model in each station and test year

| Year | Bojnurd | Quazvin | Shiraz | Tehran | Zanjan |
|---|---|---|---|---|---|
| MAE (mm/day) | | | | | |
| 2003 | 0.47 | 0.39 | 0.40 | 0.31 | 0.50 |
| 2004 | 0.39 | 0.40 | 0.33 | 0.30 | 0.47 |
| 2005 | 0.27 | 0.33 | 0.28 | 0.31 | 0.45 |
| 2006 | 0.48 | 0.36 | 0.25 | 0.29 | 0.44 |
| 2007 | 0.13 | 0.33 | 0.27 | 0.26 | 0.49 |
| 2008 | 0.13 | 0.37 | 0.30 | 0.29 | 0.46 |
| SI | | | | | |
| 2003 | 0.24 | 0.20 | 0.10 | 0.12 | 0.16 |
| 2004 | 0.19 | 0.16 | 0.10 | 0.12 | 0.15 |
| 2005 | 0.11 | 0.12 | 0.08 | 0.12 | 0.15 |
| 2006 | 0.22 | 0.13 | 0.08 | 0.11 | 0.14 |
| 2007 | 0.05 | 0.13 | 0.08 | 0.10 | 0.16 |
| 2008 | 0.05 | 0.13 | 0.09 | 0.11 | 0.15 |
| NS | | | | | |
| 2003 | 0.869 | 0.894 | 0.943 | 0.956 | 0.921 |
| 2004 | 0.907 | 0.938 | 0.951 | 0.952 | 0.936 |
| 2005 | 0.967 | 0.960 | 0.968 | 0.954 | 0.934 |
| 2006 | 0.864 | 0.950 | 0.972 | 0.960 | 0.945 |
| 2007 | 0.990 | 0.957 | 0.969 | 0.964 | 0.924 |
| 2008 | 0.991 | 0.956 | 0.965 | 0.959 | 0.935 |
| r2 | | | | | |
| 2003 | 0.869 | 0.898 | 0.946 | 0.956 | 0.924 |
| 2004 | 0.916 | 0.938 | 0.953 | 0.951 | 0.938 |
| 2005 | 0.990 | 0.962 | 0.970 | 0.956 | 0.934 |
| 2006 | 0.870 | 0.961 | 0.972 | 0.960 | 0.946 |
| 2007 | 0.990 | 0.959 | 0.970 | 0.964 | 0.928 |
| 2008 | 0.992 | 0.960 | 0.965 | 0.960 | 0.935 |
Table 5. Performance parameters of the locally trained ANFIS2 model in each station and test year

| Year | Bojnurd | Quazvin | Shiraz | Tehran | Zanjan |
|---|---|---|---|---|---|
| MAE (mm/day) | | | | | |
| 2003 | 0.32 | 0.26 | 0.33 | 0.31 | 0.48 |
| 2004 | 0.28 | 0.24 | 0.32 | 0.31 | 0.46 |
| 2005 | 0.20 | 0.22 | 0.25 | 0.30 | 0.45 |
| 2006 | 0.32 | 0.24 | 0.26 | 0.29 | 0.42 |
| 2007 | 0.10 | 0.23 | 0.26 | 0.28 | 0.47 |
| 2008 | 0.10 | 0.25 | 0.30 | 0.27 | 0.46 |
| SI | | | | | |
| 2003 | 0.17 | 0.10 | 0.10 | 0.12 | 0.15 |
| 2004 | 0.13 | 0.09 | 0.10 | 0.12 | 0.15 |
| 2005 | 0.11 | 0.08 | 0.07 | 0.12 | 0.15 |
| 2006 | 0.15 | 0.10 | 0.08 | 0.11 | 0.13 |
| 2007 | 0.04 | 0.09 | 0.08 | 0.10 | 0.16 |
| 2008 | 0.03 | 0.09 | 0.09 | 0.10 | 0.15 |
| NS | | | | | |
| 2003 | 0.935 | 0.972 | 0.950 | 0.957 | 0.930 |
| 2004 | 0.957 | 0.976 | 0.952 | 0.950 | 0.936 |
| 2005 | 0.970 | 0.979 | 0.968 | 0.956 | 0.932 |
| 2006 | 0.940 | 0.975 | 0.973 | 0.960 | 0.949 |
| 2007 | 0.994 | 0.976 | 0.969 | 0.963 | 0.925 |
| 2008 | 0.994 | 0.978 | 0.967 | 0.965 | 0.936 |
| r2 | | | | | |
| 2003 | 0.935 | 0.972 | 0.909 | 0.956 | 0.933 |
| 2004 | 0.958 | 0.976 | 0.944 | 0.950 | 0.938 |
| 2005 | 0.956 | 0.979 | 0.971 | 0.956 | 0.933 |
| 2006 | 0.942 | 0.976 | 0.973 | 0.960 | 0.950 |
| 2007 | 0.994 | 0.977 | 0.969 | 0.963 | 0.929 |
| 2008 | 0.994 | 0.978 | 0.965 | 0.966 | 0.936 |
Table 6. Performance parameters of the locally trained ANN1 model in each station and test year

| Year | Bojnurd | Quazvin | Shiraz | Tehran | Zanjan |
|---|---|---|---|---|---|
| MAE (mm/day) | | | | | |
| 2003 | 0.46 | 0.34 | 0.31 | 0.29 | 0.48 |
| 2004 | 0.71 | 0.36 | 0.31 | 0.39 | 0.50 |
| 2005 | 0.27 | 0.31 | 0.27 | 0.29 | 0.42 |
| 2006 | 0.48 | 0.34 | 0.24 | 0.28 | 0.41 |
| 2007 | 0.13 | 0.33 | 0.27 | 0.25 | 0.47 |
| 2008 | 0.13 | 0.35 | 0.35 | 0.34 | 0.62 |
| SI | | | | | |
| 2003 | 0.24 | 0.13 | 0.09 | 0.11 | 0.15 |
| 2004 | 0.32 | 0.14 | 0.10 | 0.15 | 0.16 |
| 2005 | 0.12 | 0.12 | 0.08 | 0.11 | 0.14 |
| 2006 | 0.22 | 0.13 | 0.07 | 0.11 | 0.14 |
| 2007 | 0.06 | 0.12 | 0.08 | 0.10 | 0.16 |
| 2008 | 0.05 | 0.12 | 0.10 | 0.13 | 0.20 |
| NS | | | | | |
| 2003 | 0.877 | 0.960 | 0.968 | 0.961 | 0.934 |
| 2004 | 0.749 | 0.949 | 0.957 | 0.918 | 0.929 |
| 2005 | 0.967 | 0.963 | 0.970 | 0.961 | 0.942 |
| 2006 | 0.864 | 0.959 | 0.976 | 0.962 | 0.952 |
| 2007 | 0.990 | 0.961 | 0.972 | 0.967 | 0.933 |
| 2008 | 0.992 | 0.959 | 0.953 | 0.949 | 0.888 |
| r2 | | | | | |
| 2003 | 0.877 | 0.958 | 0.961 | 0.961 | 0.935 |
| 2004 | 0.747 | 0.949 | 0.959 | 0.919 | 0.933 |
| 2005 | 0.991 | 0.964 | 0.971 | 0.962 | 0.942 |
| 2006 | 0.870 | 0.960 | 0.976 | 0.962 | 0.953 |
| 2007 | 0.991 | 0.960 | 0.972 | 0.967 | 0.933 |
| 2008 | 0.992 | 0.960 | 0.953 | 0.948 | 0.889 |
Table 7. Performance parameters of the locally trained ANN2 model in each station and test year

| Year | Bojnurd | Quazvin | Shiraz | Tehran | Zanjan |
|---|---|---|---|---|---|
| MAE (mm/day) | | | | | |
| 2003 | 0.29 | 0.27 | 0.06 | 0.31 | 0.47 |
| 2004 | 0.28 | 0.35 | 0.34 | 0.40 | 0.65 |
| 2005 | 0.30 | 0.22 | 0.29 | 0.31 | 0.43 |
| 2006 | 0.28 | 0.25 | 0.26 | 0.29 | 0.42 |
| 2007 | 0.23 | 0.23 | 0.26 | 0.27 | 0.50 |
| 2008 | 0.11 | 0.24 | 0.38 | 0.36 | 0.61 |
| SI | | | | | |
| 2003 | 0.15 | 0.10 | 0.02 | 0.12 | 0.15 |
| 2004 | 0.13 | 0.12 | 0.11 | 0.16 | 0.22 |
| 2005 | 0.15 | 0.08 | 0.09 | 0.12 | 0.14 |
| 2006 | 0.13 | 0.11 | 0.08 | 0.11 | 0.13 |
| 2007 | 0.09 | 0.08 | 0.08 | 0.10 | 0.18 |
| 2008 | 0.04 | 0.08 | 0.12 | 0.13 | 0.20 |
| NS | | | | | |
| 2003 | 0.950 | 0.972 | 0.997 | 0.958 | 0.936 |
| 2004 | 0.960 | 0.960 | 0.944 | 0.915 | 0.855 |
| 2005 | 0.947 | 0.979 | 0.965 | 0.956 | 0.940 |
| 2006 | 0.954 | 0.971 | 0.973 | 0.967 | 0.951 |
| 2007 | 0.976 | 0.976 | 0.972 | 0.965 | 0.915 |
| 2008 | 0.994 | 0.981 | 0.941 | 0.944 | 0.890 |
| r2 | | | | | |
| 2003 | 0.952 | 0.972 | 0.997 | 0.959 | 0.937 |
| 2004 | 0.961 | 0.960 | 0.945 | 0.915 | 0.856 |
| 2005 | 0.951 | 0.979 | 0.965 | 0.956 | 0.941 |
| 2006 | 0.955 | 0.975 | 0.973 | 0.961 | 0.953 |
| 2007 | 0.979 | 0.977 | 0.973 | 0.965 | 0.918 |
| 2008 | 0.994 | 0.981 | 0.941 | 0.945 | 0.891 |
The average maximum SI difference is smaller for ANFIS2 than for ANFIS1 (0.048 vs. 0.066, respectively). Thus, the consideration of RH and RS in addition to Tmean might allow for a better input–output mapping and, as a result, a lower variability among the annual performances. However, regarding the performance of both input combinations in Shiraz, Tehran, and Zanjan, RH and RS might not be so decisive in these stations when performing local training: both input combinations provide the same accuracy except in 2005 (Shiraz), 2008 (Tehran), and 2003 and 2006 (Zanjan), where ANFIS2 is slightly more accurate. These results confirm the conclusions of Shiri et al. (2013b) on modeling pan evaporation using GEP. Hence, based on a suitable input choice, external training might be a valid alternative to local training. Although this procedure might provide slightly less accurate estimations, it presents the decisive advantage of not requiring data series in the test stations to train a local model; only climatic measurements at the testing points would be required.
Figures 5 and 6 display the scatterplots of the externally trained ANFIS and ANN models. Comparison of the fit line equations and r2 values in the scatterplots indicates that the ANFIS2/ANN2 models perform better than the ANFIS1/ANN1 models for the Bojnurd, Quazvin, Tehran, and Zanjan stations. For the Shiraz station, however, the externally trained temperature-based ANFIS1/ANN1 models, comprising only the inputs Tmax, Tmin, Tmean, and Ra, seem to be more accurate than the ANFIS2/ANN2 models. The different climatic characteristics (high ΔT and ET0) of this station may be the reason for this. The underestimations of the ANFIS and ANN models are clearly seen for the Zanjan station in Figures 5 and 6. The reason may be that this station has higher ET0 values than the other stations, so the externally trained models encounter extrapolation difficulties for the testing patterns. However, adding the RH and RS inputs to the ANFIS and ANN models improves the models' accuracy and decreases the underestimations.
The time series plot of the ANFIS and ANN models for the warmest period of the year 2003 at the Bojnurd station is given in Figure 7, where the overestimations of the externally trained temperature-based models can be clearly seen, especially for the ANN1 model. The locally trained ANN1 and ANFIS1 models significantly underestimate ET0 in July. It is clear from Figure 7 that the externally trained RS–RH-based ANN2 and ANFIS2 models are more accurate than the locally trained temperature-based ANN1 and ANFIS1 models. As stated before, it might be preferable to externally train a model relying on a suitable input combination than to train a local model.
The annual variability observed in the models' performance confirms the need to apply data set scanning procedures (leave-one-out, k-fold testing) for a proper evaluation, as suggested by Martí et al. (2011a, b). Further research might tackle the effect of each climatic input on the target variable, and the associated improvement in the generalizability of the models, considering a larger number of stations and input combinations based on a data set scanning assessment.
CONCLUSIONS
This paper reports the application of neuro-fuzzy and neural network approaches for estimating reference evapotranspiration considering a k-fold testing assessment. Different data set configurations were defined based on temporal and spatial criteria, allowing for a complete testing scan of the data set. The proposed methodology enabled the comparison, at each station, of models trained with local data series and models trained with the data series from the remaining stations, both fed by two different input combinations. Results show that external training (i.e., using data series not belonging to the test station), based on a suitable input choice and a representative pattern collection, might be a valid alternative to the more common local training. Although a slight accuracy decrease might be expected in comparison, this procedure presents the decisive advantage of not requiring data series at the test station for the application of the training algorithm; only climatic measurements of the testing inputs would be required. Results also show that AI applications based on a single assignment of the training and test sets might not be advisable for a proper evaluation of the model performance, as they might lead to only partially valid conclusions, i.e., valid only for the specific training and test sets defined, which might cover only part of the complete pattern range of the considered data set.
ACKNOWLEDGEMENT
This work has been prepared as a part of Jalal Shiri's PhD thesis.