Abstract

Accurately modeling pan evaporation is important in water resources planning and management and in environmental engineering. This study compares the accuracy of two new data-driven methods, multi-gene genetic programming (MGGP) and the dynamic evolving neural-fuzzy inference system (DENFIS), in modeling monthly pan evaporation. Climatic data, namely minimum temperature, maximum temperature, solar radiation, relative humidity, wind speed, and pan evaporation, obtained from the Antakya and Antalya stations in the Mediterranean Region of Turkey, were used in the study. The MGGP and DENFIS methods were also compared with genetic programming (GP) and a calibrated version of the Hargreaves–Samani (CHS) empirical method. At Antakya station, GP was slightly more accurate than the MGGP and DENFIS models, and all the data-driven models were superior to the CHS, whereas DENFIS performed better than the other models at Antalya station. The effect of a periodicity input on the models' accuracy was also investigated, and adding periodicity was found to significantly increase the accuracy of the MGGP and DENFIS models.

INTRODUCTION

Water is one of the most crucial substances for all living things. Although it is the most abundant substance on Earth, its use and conservation have become very important, which has given sustainable water resource management a central role. Several fields of science, including hydrology, meteorology, oceanography, and limnology, have therefore been developed and continue to develop. Among them, hydrology deals with all forms of water that are of great importance for humans and their environment. Hydrological modeling has accordingly been developed to cover solar radiation modeling, streamflow forecasting, rainfall–runoff modeling, and evapotranspiration estimation.

The hydrological cycle is the common point of all these branches of sciences. Evaporation as a basic component of the hydrological cycle is of vital importance in water resources planning and management and also in environmental work. The accurate calculation of evaporation is very important, especially in the arid and semi-arid areas (Malik & Kumar 2015). Direct and indirect approaches are generally used for computing and estimating evaporation. Pan evaporation is one of the direct approaches for evaporation calculation (Goyal & Ojha 2011). Indirect approaches are needed for prediction of pan evaporation because installing evaporation pans and measuring are highly expensive. By utilizing indirect approaches, pan evaporation can be predicted based on some climatic variables, such as relative humidity (RH), temperature, wind speed, and solar radiation (Kisi et al. 2016).

There are numerous practical issues (e.g., post-mining voids or deep lakes, farm dams or shallow lakes, studies investigating the water balance of a catchment, modeling rainfall–runoff, for areas of small irrigation or for the crops irrigated within a large irrigation district) that require the estimates of daily or monthly actual or potential evaporations. All these issues indicate that the daily or monthly evaporation calculations from climatic data or Class-A pan measurements are necessary in most of the practical issues (McMahon et al. 2013, 2016).

Over the last several decades, many empirical and physical models have been utilized (Mustacchi et al. 1979; Yu et al. 2006; Han et al. 2007). Since the underlying processes are nonlinear, complex, temporally and spatially varying, incidental, and unsteady, it is very difficult to model all the physical processes. Artificial intelligence (AI)-based techniques were later developed as strong alternative modeling techniques. These AI models have been shown to perform better than traditional modeling techniques, especially when dealing with large, noisy data sets.

In the last decades, soft computational approaches have been successfully utilized for prediction of pan evaporation (Sudheer et al. 2002; Kisi 2006, 2015a; Kim & Kim 2008; Shiri et al. 2011; Sanikhani et al. 2012; Lin et al. 2013; Malik & Kumar 2015; Keshtegar et al. 2016; Kisi et al. 2016). Sudheer et al. (2002) predicted daily pan evaporation of Dowleswaram in Andhra Pradesh, India by using artificial neural networks (ANNs) and showed that the ANNs could be used in modeling the evaporation process from the available weather variables. Kisi (2006) investigated the ability of neuro-fuzzy system in estimating pan evaporation of two automated weather stations, Arcata-Eureka and Daggett stations, California, operated by the US Environmental Protection Agency and compared it with ANN. He found the neuro-fuzzy system to be better than the ANN in modeling daily pan evaporation. Kim & Kim (2008) developed a generalized regression neural networks model with genetic algorithm for prediction of pan evaporation and the alfalfa reference evapotranspiration, in the Republic of Korea and obtained promising results. Shiri et al. (2011) used an adaptive neuro-fuzzy inference system (ANFIS) and ANN in prediction of daily pan evaporations of three weather stations, Illinois, USA based on air temperature, wind speed, solar radiation, RH, total rainfall, and surface soil temperature inputs. Sanikhani et al. (2012) applied two ANFIS systems in predicting daily pan evaporations of San Francisco and San Diego, in California, USA and compared them with the ANN method. ANFIS provided better estimates when compared to ANN. Lin et al. (2013) compared the support vector machine (SVM) and ANN in prediction of daily pan evaporation. According to their results, the SVM approach was more appropriate than the ANN with respect to accuracy and efficiency. 
Malik & Kumar (2015) predicted daily pan evaporation of Pantnagar, located in the foothills of the Himalayas in Uttarakhand state of India using ANN and co-active ANFIS and found ANN to be better than the latter. Kisi (2015a) used least square support vector machine (LSSVM), multivariate adaptive regression splines, and M5 model tree in prediction of pan evaporation and LSSVM was found to perform better than the others using local climatic data. Kisi et al. (2016) predicted pan evaporation of Ankara and Polatli stations, Turkey, using classification and regression tree, chi-squared automatic interaction detector, and ANNs and comparison of the methods indicated that the ANNs performed better than the others. Keshtegar et al. (2016) used conjugate gradient optimization method for daily pan evaporation prediction and compared it with ANFIS and model tree approaches. The conjugate gradient-based model performed better than the other models. Most of the approaches used in previous studies are based on black box methods in which their formulation is not explicit and cannot be easily used by practical applications. Therefore, in the present study, a multi-gene genetic programming (MGGP)-based model, which had explicit formulation and could be easily used in practice, was developed for prediction of pan evaporation.

There are limited studies in the literature related to the application of the dynamic evolving neural-fuzzy inference system (DENFIS) in water resources and hydrology (Heddam 2014; Heddam & Dechemi 2015; Kwin et al. 2016). DENFIS was applied for modeling dissolved oxygen concentration by Heddam (2014). The study demonstrated the superiority of the DENFIS over the multiple linear regression and ANN models. In a study by Heddam & Dechemi (2015), coagulant dosage in a water treatment plant in Algeria was modeled using DENFIS and satisfactory results were obtained. In a study by Kwin et al. (2016), DENFIS was used for rainfall–runoff modeling and compared with the Hydrologic Engineering Center-Hydrologic Modeling System (HEC-HMS) and an autoregressive model with exogenous inputs (ARX). DENFIS estimates were found to be comparable to those of HEC-HMS and superior to those of the ARX model. To the best of our knowledge, there is no published study in the literature related to the application of DENFIS in modeling the pan evaporation process.

The main aim of this study is: (1) to investigate the accuracy of MGGP and DENFIS approaches in prediction of pan evaporation; (2) to compare their accuracy with genetic programming (GP) and calibrated Hargreaves–Samani (HS) equation; (3) to investigate the effect of periodicity input to the models' accuracy.

METHODS

Genetic programming

GP is an evolutionary computational technique used for solving different kinds of engineering problems (Kaydani et al. 2014). It is inspired by natural selection and is an extension of genetic algorithms. It was first introduced by Cramer (1985) and later developed by Koza (1992). Candidate solutions are represented as tree structures (Figure 1). Figure 1 illustrates an example GP binary tree for the function (x/3 + (−y)). The tree has two types of nodes: 'functional' nodes and 'terminals'. Functional nodes hold arithmetical, logical, and other functions, whereas terminals hold variables and constants (Durasevic et al. 2016).

Figure 1

An example of GP binary tree of function (x/3 + (−y)).

GP algorithms have three genetic operations: reproduction, crossover, and mutation. The algorithm evaluates the individuals and selects individuals for the genetic operations, so that the individuals of the initial population produce a new set of individuals in a new generation. In reproduction, selected individuals are carried over into the new generation, which replaces the existing one. In crossover, a branch of the tree is randomly selected in each of two individuals and the branches are swapped. In mutation, a terminal or functional node is selected from a tree and altered. The goal is to obtain the individual with the best fitness value in the population, which is the result of GP.
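The tree representation described above can be illustrated with a minimal Python sketch. It encodes the function of Figure 1, x/3 + (−y), as nested tuples and evaluates it recursively. The tuple encoding and the names `FUNCTIONS`, `evaluate`, and `count_nodes` are illustrative assumptions, not part of the GP software used in the study.

```python
import operator

# Functional nodes: name -> Python callable (hypothetical, minimal function set).
FUNCTIONS = {
    "add": operator.add,
    "div": operator.truediv,
    "neg": operator.neg,
}

def evaluate(node, variables):
    """Recursively evaluate an expression tree for given variable values."""
    if isinstance(node, tuple):              # functional node: (name, child, ...)
        name, *children = node
        args = [evaluate(child, variables) for child in children]
        return FUNCTIONS[name](*args)
    if isinstance(node, str):                # terminal: variable
        return variables[node]
    return node                              # terminal: constant

def count_nodes(node):
    """Number of nodes in the tree, a simple measure of model complexity."""
    if isinstance(node, tuple):
        return 1 + sum(count_nodes(child) for child in node[1:])
    return 1

# The tree of Figure 1: x/3 + (-y)
tree = ("add", ("div", "x", 3), ("neg", "y"))
value = evaluate(tree, {"x": 6.0, "y": 1.0})
```

Crossover and mutation then amount to swapping or replacing sub-tuples of such trees.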

Multi-gene genetic programming

In this method, the initial population achievement is measured on the training data based on fitness function minimized to get a better solution. Typically, the fitness function used is special root mean squared error (SRMSE), given by
$$\mathrm{SRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(G_i - A_i\right)^2} \tag{1}$$
where $G_i$ is the predicted value of the ith data sample by MGGP, $A_i$ is the actual value of the ith data sample, and $N$ is the number of training samples (Garg et al. 2014).

MGGP is derived from GP and is used for developing empirical mathematical models. An MGGP model is a weighted linear combination of a set of GP trees (genes), in which the weights are regression coefficients estimated by the least squares method. In the example shown in Figure 2, the model predicts the output value from two input variables, x1 and x2, using the coefficients S0, S1, S2, and S3. In this figure, Gene 1, Gene 2, and Gene 3 express the functions cos(x1)/x2 + (−5), x2 + x1*2/(−5), and (x1−5)*(x2 + 3), respectively. The maximum number of genes (Gmax) and the maximum tree depth (dmax) are specified by the user to control the complexity of the MGGP model. The Gmax and dmax parameters influence the size and the number of models to be searched in the global space. Thus, there are ideal values of Gmax and dmax which generate a comparatively compact model (Searson et al. 2010).
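As a minimal sketch of how the gene weights could be estimated, the Python code below evaluates the three example genes of Figure 2, fits S0 to S3 by ordinary least squares through the normal equations, and scores the fit with a root mean squared error. The data are synthetic and the helper names (`genes`, `solve_linear`, `fit_gene_weights`, `srmse`) are hypothetical; the MGGP software used in the study may solve the least squares problem differently.

```python
import math

def genes(x1, x2):
    """The three example genes of Figure 2, evaluated for one sample."""
    return [
        math.cos(x1) / x2 + (-5),
        x2 + x1 * 2 / (-5),
        (x1 - 5) * (x2 + 3),
    ]

def solve_linear(a, b):
    """Solve a @ s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    s = [0.0] * n
    for i in reversed(range(n)):
        s[i] = (m[i][n] - sum(m[i][j] * s[j] for j in range(i + 1, n))) / m[i][i]
    return s

def fit_gene_weights(samples, targets):
    """Least squares estimate of [S0, S1, S2, S3] via the normal equations."""
    rows = [[1.0] + genes(x1, x2) for x1, x2 in samples]   # 1.0 is the bias column
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear(ata, aty)

def predict(weights, x1, x2):
    """Model output: S0 + S1*Gene1 + S2*Gene2 + S3*Gene3."""
    return weights[0] + sum(w * g for w, g in zip(weights[1:], genes(x1, x2)))

def srmse(predicted, actual):
    """Root mean squared error between predictions and actual values."""
    return math.sqrt(sum((g - a) ** 2 for g, a in zip(predicted, actual)) / len(actual))

# Synthetic demonstration data (not from the study).
samples = [(x1, x2) for x1 in (1.0, 2.0, 3.0, 4.0, 6.0) for x2 in (1.0, 2.0, 4.0)]
true_weights = [2.0, 0.5, -1.0, 0.25]
targets = [predict(true_weights, x1, x2) for x1, x2 in samples]
weights = fit_gene_weights(samples, targets)
fit_error = srmse([predict(weights, x1, x2) for x1, x2 in samples], targets)
```

Because the synthetic targets are generated without noise, the fitted weights should recover `true_weights` up to rounding error.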

Figure 2

Example of MGGP at least square method.

The initial population in MGGP is built by creating individuals that consist of GP trees with randomly generated genes. During MGGP execution, genes are added or deleted by a two-point high-level crossover operator, which allows genes to be exchanged between individuals.

For subtree crossover, a gene is randomly selected from each parent. Then, the standard subtree crossover operator is applied and the resulting trees replace the parent trees. The probability of the crossover operator is defined by the user (Gandomi & Alavi 2012). The MGGP algorithm follows these steps:

  • Step 1: Define the problem.

  • Step 2: Set initial parameters (population size, maximum number of genes, generations, etc.)

  • Step 3: Build a model using least square method.

  • Step 4: Evaluate performance of models (based on SRMSE).

  • Step 5: Apply genetic operators and construct new population.

  • Step 6: Evaluate performance of individuals in new population.

  • Step 7: Stop if the termination criteria are satisfied. Otherwise, go to Step 5.

Detailed information about MGGP can be obtained from previous studies (Searson 2009; Searson et al. 2010).

Dynamic evolving neural-fuzzy inference system

DENFIS is a new type of fuzzy inference system proposed by Song & Kasabov (2000) for adaptive offline and online learning. DENFIS uses and improves dynamic features of evolving fuzzy neural network (EFuNN) so that DENFIS becomes suitable for online adaptive systems.

DENFIS model uses evolving clustering method (ECM) and evolving clustering method with constrained minimization (ECMc) for partitioning the input space. In DENFIS, the rules are created and updated in this partitioning. ECM is an online evolving fast clustering method that is based on maximum distance. By using the maximum distance between a point and a cluster center ECM estimates the number of clusters dynamically.
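The maximum-distance idea behind ECM can be sketched in a few lines of Python. This is a simplified one-pass version written for illustration, assuming Euclidean distance and a user-chosen distance threshold `dthr`; it is not the implementation used in the study, and the function name `ecm` is hypothetical.

```python
import math

def ecm(points, dthr):
    """Simplified one-pass evolving clustering; returns (centers, radii)."""
    centers, radii = [], []
    for x in points:
        if not centers:                       # the first sample starts the first cluster
            centers.append(list(x))
            radii.append(0.0)
            continue
        dists = [math.dist(x, c) for c in centers]
        # Case 1: the sample already lies inside an existing cluster; nothing changes.
        if any(d <= r for d, r in zip(dists, radii)):
            continue
        # Case 2: choose the cluster that minimises distance + radius.
        sums = [d + r for d, r in zip(dists, radii)]
        a = min(range(len(centers)), key=sums.__getitem__)
        if sums[a] > 2.0 * dthr:              # too far from every cluster: create a new one
            centers.append(list(x))
            radii.append(0.0)
        else:                                 # enlarge cluster a and shift its center
            new_radius = sums[a] / 2.0
            # Move the center toward x so that x lies exactly on the new boundary.
            shift = (dists[a] - new_radius) / dists[a]
            centers[a] = [c + shift * (xi - c) for c, xi in zip(centers[a], x)]
            radii[a] = new_radius
    return centers, radii

# Synthetic demonstration: two nearby samples and one distant sample.
demo_centers, demo_radii = ecm([(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)], dthr=0.1)
```

In this sketch no cluster radius can exceed `dthr`, so a smaller threshold yields more clusters and, in DENFIS, more fuzzy rules.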

After clustering, Takagi–Sugeno fuzzy inference and rules are used in the DENFIS model. DENFIS uses a dynamically formed fuzzy inference system by using m highly activated fuzzy rules for calculating the output that depends on the position of the input vector in the input space. Not only for each new input vector are the fuzzy rules dynamically chosen from the current fuzzy rule set but also during the learning process fuzzy rules are dynamically created and updated.

In general, presenting a new data vector to the system may result in updating the existing rules, and if a new cluster is formed, a new fuzzy rule is created. The generated rules are optimized by the back-propagation method. The final output for each prediction is derived from the most important rules, which are chosen dynamically.

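For an input vector $x^0 = [x_1^0, x_2^0, \ldots, x_n^0]$, the result of inference, $y^0$, is calculated as the weighted average of each rule's output (Kasabov & Song 2002):
$$y^0 = \frac{\sum_{j=1}^{m} \omega_j\, f_j\left(x_1^0, x_2^0, \ldots, x_n^0\right)}{\sum_{j=1}^{m} \omega_j} \tag{2}$$
where $\omega_j = \prod_{i=1}^{n} R_{ji}\left(x_i^0\right)$; i = 1, 2, … , n, j = 1, 2, … , m.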
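The weighted-average inference can be illustrated with a short Python sketch of a first-order Takagi–Sugeno system. Triangular membership functions, the product of membership degrees as the rule weight, and all rule parameters below are assumptions made for illustration only; they are not the rules obtained in the study.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def ts_inference(x, rules):
    """Weighted average of the outputs of the activated rules.

    Each rule is (memberships, coeffs): one (a, b, c) triple per input and the
    coefficients [b0, b1, ...] of a linear consequent b0 + b1*x1 + ...
    """
    numerator = denominator = 0.0
    for memberships, coeffs in rules:
        weight = 1.0
        for xi, (a, b, c) in zip(x, memberships):
            weight *= triangular(xi, a, b, c)          # rule activation degree
        output = coeffs[0] + sum(bi * xi for bi, xi in zip(coeffs[1:], x))
        numerator += weight * output
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no rule is activated by this input")
    return numerator / denominator

# Two hypothetical rules on a single input.
rules = [
    ([(0.0, 1.0, 2.0)], [0.0, 1.0]),    # IF x is around 1 THEN y = x
    ([(1.0, 2.0, 3.0)], [10.0, 0.0]),   # IF x is around 2 THEN y = 10
]
y0 = ts_inference([1.5], rules)
```

At x = 1.5 both rules are activated to the same degree, so the output is the plain average of the two rule outputs.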

APPLICATION AND RESULTS

The monthly maximum temperature (Tmax), minimum temperature (Tmin), solar radiation (Rs), wind speed (Ws), RH, and pan evaporation data from the Antakya (latitude 36.33 °N, longitude 36.30 °E, altitude 100 m) and Antalya (latitude 36.89 °N, longitude 30.68 °E, altitude 47 m) stations, located in the Mediterranean Region of Turkey, were used in the present study. The data were obtained from the Turkish State Meteorological Service (TSMO). More information about data measurements is available from the TSMO website (https://www.mgm.gov.tr/eng/forecast-5days.aspx?g=0). The climate of this region has cool, rainy winters and moderately dry, hot summers. Yearly rainfall ranges from 580 to 1,300 mm. For the Antakya and Antalya stations, 203 (from 1983 to 2010) and 362 (from 1967 to 2006) monthly records were available, respectively. It should be noted that an evaluation of data uncertainty and its effect on the results (Beven et al. 2008; Wang et al. 2015) was not considered in this study. For each station, the first 80% of the data was used for training and the remaining 20% was used for testing the obtained models. The brief statistical properties of the data sets are reported in Table 1. In this table, xmean, xmin, xmax, Sx, and Csx indicate the mean, minimum, maximum, standard deviation, and skewness, respectively. It is clear from the table that the data of the two stations have different distributions. Pan evaporation ranges are 0.9–9.8 mm and 1.3–12.4 mm for the Antakya and Antalya stations, respectively, and the data of the latter station show a more skewed distribution (Csx = 0.54).
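A chronological 80/20 split of this kind can be sketched as follows. The rounding rule and the function name `chronological_split` are assumptions; the exact record counts used by the authors are not stated.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into leading training and trailing test parts."""
    n_train = int(round(train_fraction * len(series)))
    return series[:n_train], series[n_train:]

# Placeholder record indices standing in for the 203 Antakya monthly records.
records = list(range(203))
train, test = chronological_split(records)
```

No shuffling is applied, so the test period always follows the training period in time.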

Table 1

The brief statistical properties of the used data sets

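| Station | Variable | xmean | xmin | xmax | Sx | Csx |
|---|---|---|---|---|---|---|
| Antakya | Rs (Langley) | 107 | 30 | 179 | 39.8 | −0.25 |
| Antakya | Tmax (°C) | 24.1 | 10 | 32.2 | 5.67 | −0.80 |
| Antakya | Tmin (°C) | 16.2 | 0.6 | 28.3 | 7.55 | −0.27 |
| Antakya | RH (%) | 69.1 | 30.8 | 80.2 | 5.03 | −0.49 |
| Antakya | Ws (m/s) | 3.45 | 1.3 | 7.1 | 1.26 | 0.67 |
| Antakya | Pan evaporation (mm) | 4.59 | 0.9 | 9.8 | 2.15 | −0.03 |
| Antalya | Rs (Langley) | 119 | 33.3 | 215 | 47.3 | −0.12 |
| Antalya | Tmax (°C) | 23.0 | 11.1 | 37.2 | 6.90 | 0.12 |
| Antalya | Tmin (°C) | 14.1 | 0.6 | 27.4 | 7.31 | 0.16 |
| Antalya | RH (%) | 63.3 | 42.2 | 80.4 | 6.98 | −0.13 |
| Antalya | Ws (m/s) | 2.92 | 1.7 | 5.8 | 0.65 | 1.00 |
| Antalya | Pan evaporation (mm) | 5.31 | 1.3 | 12.4 | 2.61 | 0.54 |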
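
Root mean square error (RMSE), mean absolute error (MAE), Nash–Sutcliffe efficiency (NSE) coefficient, and determination coefficient (R2) were used for evaluation of the applied models. The RMSE, MAE, and NSE can be expressed as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(PE_{O,i} - PE_{M,i}\right)^2} \tag{3}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|PE_{O,i} - PE_{M,i}\right| \tag{4}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(PE_{O,i} - PE_{M,i}\right)^2}{\sum_{i=1}^{n}\left(PE_{O,i} - \overline{PE}_{O}\right)^2} \tag{5}$$
where $PE_{O,i}$ and $PE_{M,i}$ are the observed and estimated pan evaporations, $n$ is the number of time steps, and $\overline{PE}_{O}$ is the mean of the observed pan evaporation.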
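For reference, the three error statistics can be computed with a few lines of standard-library Python. This is a minimal sketch using their standard definitions; the function names and the sample values are illustrative and are not data from the study.

```python
import math

def rmse(observed, estimated):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)

def mae(observed, estimated):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)

def nse(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator

# Illustrative values only.
obs = [1.0, 2.0, 3.0, 4.0]
est = [1.0, 2.0, 3.0, 5.0]
scores = (rmse(obs, est), mae(obs, est), nse(obs, est))
```

An NSE of 1 indicates a perfect fit, while a negative NSE means the model is worse than simply predicting the observed mean.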
The MGGP and DENFIS were applied for estimating pan evaporations and compared with GP and calibrated version of the HS empirical equation. The Hargreaves & Samani (1985) equation is given as:
$$ET_0 = 0.0023\, R_a \left(\frac{T_{max} + T_{min}}{2} + 17.8\right)\sqrt{T_{max} - T_{min}} \tag{6}$$
where ET0 = reference evapotranspiration (mm day−1); Tmax and Tmin = maximum and minimum temperature (°C) and Ra = extraterrestrial radiation (mm day−1). In this study, Ra was calculated as described by Allen et al. (1998). Calibration of the HS equation was made for estimating pan evaporation by using the data used for the training of AI models. Rahimikhoob (2009) also previously used this equation for pan evaporation modeling. The optimal a, b, and c parameters given in Equation (7) were calculated by using genetic algorithm. Mutation rate, population size, and error were set to 0.075, 100, and 0.0001, respectively.
formula
(7)
Thus, the calibrated HS (CHS) estimates were compared with the AI methods.
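A minimal Python sketch of the Hargreaves–Samani calculation is given below. The default constants are those of the standard equation; treating them as the three calibration parameters `a`, `b`, and `c` is an assumption about how the calibrated form is parameterized, and the input values are illustrative, not station data.

```python
def hargreaves_samani(tmax, tmin, ra, a=0.0023, b=17.8, c=0.5):
    """Hargreaves-Samani estimate in mm/day.

    tmax, tmin: maximum and minimum air temperature (deg C)
    ra: extraterrestrial radiation expressed as equivalent evaporation (mm/day)
    a, b, c: equation constants; the defaults are the standard HS values, and
             passing other values mimics a calibrated version (assumption).
    """
    tmean = (tmax + tmin) / 2.0
    return a * ra * (tmean + b) * (tmax - tmin) ** c

# Illustrative inputs only.
et0 = hargreaves_samani(tmax=30.0, tmin=20.0, ra=16.0)
```

Calibration then consists of searching for the values of `a`, `b`, and `c` that minimize the error between the estimates and the observed pan evaporation on the training data.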

Similar parameters were set for the GP and MGGP methods; they are given in Table 2. Figure 3 illustrates the variation of the mean and best fitness with the generation number and the statistical characteristics of the evolved MGGP model (training data) for Antakya station. As clearly seen from Figure 3(a), the weight values of the bias term and Gene 2 are higher than those of the other genes. Figure 3(b) gives the significance degree of each gene with respect to p values. Low p values in this figure indicate that the genes contribute strongly to explaining the variation in pan evaporation. Figure 4 shows the population of evolved models in terms of their number of nodes, which indicates complexity, together with their fitness values. From this figure, the best model in the population with respect to complexity (fewer nodes) and fitness can be determined. A big black circle in the figure indicates the best model in the population. A light grey circle shows the models which are not strongly dominated by other models in the population in terms of model complexity and fitness. The obtained MGGP equation is given in Table 3. For the DENFIS models, selection of the distance threshold value (Dthr) is very important (Heddam & Dechemi 2015). In this study, various Dthr values were tried to find the optimal one. The optimal Dthr value obtained for the DENFIS model is 0.02 for Antakya station.

Table 2

Parameters of the GP and MGGP models

| Parameter | Settings |
|---|---|
| Function set | +, −, ×, /, √, exp, ln |
| Population size | 100 |
| Number of generations | 100 |
| Maximum number of genes allowed in an individual | |
| Maximum tree depth | |
| Tournament size | 12 |
| Elitism | 0.01% of population |
| Probability of GP tree mutation | 0.1 |
| Probability of GP tree crossover | 0.85 |

Figure 3

(a) Variation of the best and mean fitness with the number of generations and (b) statistical properties of the evolved MGGP model (on training data) – Antakya station.

Figure 4

Population of the MGGP models in terms of their complexity and fitness – Antakya station.

Table 3

The equations of the optimal MGGP models

Antakya 
 
Antalya 
 

and indicate the maximum temperature, minimum temperature, solar radiation, wind speed, and relative humidity, respectively. iflte(A,B,C,D) means if A ≤ B then C else D on an element by element basis.

Training and test results of the optimal MGGP, GP, DENFIS, and CHS models are compared in Table 4. In the test period, the GP model performs slightly better than the MGGP and DENFIS models with respect to RMSE, MAE, and NSE, whereas the MGGP has the highest R2. Training results show that the MGGP model approximates pan evaporation better than the GP model. In the test stage, the MGGP, GP, and DENFIS estimates are close to one another. All the AI models have better accuracy than the CHS empirical equation. Figure 5 compares the scatterplots of the applied models in estimating pan evaporation. It is apparent that the MGGP provides less scattered estimates, with a slope and bias closer to 1 and 0, respectively, than those of the GP and DENFIS.

Table 4

Comparison of statistical errors for MGGP, GP, DENFIS, and CHS in estimation of pan evaporation – Antakya station

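| Model | Training RMSE | Training MAE | Training NSE | Training R2 | Test RMSE | Test MAE | Test NSE | Test R2 |
|---|---|---|---|---|---|---|---|---|
| MGGP | 0.712 | 0.514 | 0.898 | 0.910 | 0.674 | 0.533 | 0.837 | 0.948 |
| GP | 0.790 | 0.594 | 0.875 | 0.875 | 0.644 | 0.495 | 0.851 | 0.903 |
| DENFIS | 0.874 | 0.560 | 0.863 | 0.864 | 0.663 | 0.510 | 0.842 | 0.919 |
| CHS | 3.107 | 2.490 | −0.63 | 0.002 | 2.994 | 2.469 | −0.65 | 0.493 |
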
Figure 5

Scatterplots of observed and predicted pan evaporations by GP, MGGP, DENFIS, and CHS models – Antakya station.

The variation of the fitness with the number of generations and the statistical properties of the MGGP model in training are illustrated in Figure 6(a) for Antalya station. From the figure, it is apparent that Gene 3 has the highest weight, followed by the bias and Gene 7. The other genes have considerably lower weights. The significance degree of each gene is provided in Figure 6(b). As for the Antakya station, the genes contribute significantly to explaining the variation in pan evaporation, as shown by their very low p values. The populations of the evolved models are shown in Figure 7. As can be seen, the best model with respect to complexity and population has 46 nodes and is shown by a big black circle. Table 3 provides the equations of the optimal MGGP models for both stations.

Figure 6

(a) Variation of the best and mean fitness with the number of generations and (b) statistical properties of the evolved MGGP model (on training data) – Antalya station.

Figure 7

Population of the MGGP models in terms of their complexity and fitness – Antalya station.

Table 5 compares the MGGP, GP, DENFIS, and CHS models in estimating pan evaporation of Antalya station. From the results, it is clear that the DENFIS has the best accuracy in both training and test periods. CHS is also better than the GP in the test period with respect to RMSE and NSE statistics. Figure 8 demonstrates scatterplots for the test results of the optimal models. It is apparent from the graphs that the DENFIS performs better than the other models with respect to data dispersion.

Table 5

Comparison of statistical errors for MGGP, GP, DENFIS, and CHS in estimation of pan evaporation – Antalya station

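| Model | Training RMSE | Training MAE | Training NSE | Training R2 | Test RMSE | Test MAE | Test NSE | Test R2 |
|---|---|---|---|---|---|---|---|---|
| MGGP | 1.311 | 1.016 | 0.736 | 0.931 | 2.243 | 1.869 | 0.347 | 0.929 |
| GP | 0.727 | 0.564 | 0.919 | 0.919 | 1.374 | 1.116 | 0.755 | 0.945 |
| DENFIS | 0.707 | 0.545 | 0.923 | 0.925 | 1.163 | 0.944 | 0.824 | 0.943 |
| CHS | 1.000 | 5.014 | 0.846 | 0.864 | 1.320 | 4.767 | 0.775 | 0.925 |
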
Figure 8

Scatterplots of observed and predicted pan evaporations by GP, MGGP, DENFIS, and CHS models – Antalya station.

Sanikhani & Kisi (2012) and Kisi (2015b) examined the effect of periodicity in forecasting monthly streamflows by including a periodicity component (month number) as input to the AI-based models and reported that it significantly increased the models' accuracy. Therefore, the effect of periodicity on the models' accuracy was also investigated here by adding the month number of the year as input to the models for each data set. The same parameters were set for the MGGP and GP methods, as provided in Table 2. The comparison results of the periodic models are provided in Table 6 for Antakya station. It is apparent from the table that the periodic DENFIS model has better performance than the periodic MGGP and GP models. MGGP also performs better than the GP in estimation of pan evaporation. Comparison with Table 4 clearly indicates that adding the periodicity input to the models increases their performance in estimating pan evaporation. The periodic DENFIS model improves the RMSE, MAE, and NSE of the DENFIS by 24, 22, and 8%, respectively. The results are graphically compared in Figure 9. It can be seen from the fit line equation that the DENFIS estimates are closer to the exact line (y = x). Comparison with Figure 5 indicates that the periodicity input increases the accuracy of the DENFIS and GP methods while the MGGP accuracy decreases in estimating monthly pan evaporation. Table 7 compares the training and test results of the periodic models for Antalya station. Unlike at Antakya, here the periodic MGGP provides better accuracy than the GP and DENFIS models with respect to various statistics. Comparison with Table 5 shows that including the periodicity component as input considerably increases the accuracy of DENFIS and MGGP. The periodic MGGP model improves the RMSE, MAE, and NSE of the MGGP by 51, 53, and 144%, respectively. The periodic models' estimates are shown in Figure 10 for Antalya station.
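The periodicity input and the quoted percentage improvements can be illustrated with a short Python sketch. The helper names are hypothetical, and `month_numbers` assumes consecutive monthly records without gaps, which may not hold for the station data. The DENFIS test statistics for Antakya station are taken from Tables 4 and 6.

```python
def month_numbers(n_records, start_month=1):
    """Month-of-year index (1..12) for consecutive monthly records."""
    return [(start_month - 1 + i) % 12 + 1 for i in range(n_records)]

def add_periodicity(rows, start_month=1):
    """Append the month number as an extra input to each row of climatic inputs."""
    months = month_numbers(len(rows), start_month)
    return [list(row) + [m] for row, m in zip(rows, months)]

def improvement_percent(before, after, higher_is_better=False):
    """Relative improvement (%) of a statistic after adding the periodicity input."""
    change = (after - before) if higher_is_better else (before - after)
    return 100.0 * change / abs(before)

# DENFIS test statistics at Antakya: non-periodic (Table 4) vs periodic (Table 6).
rmse_gain = improvement_percent(0.663, 0.503)
mae_gain = improvement_percent(0.510, 0.396)
nse_gain = improvement_percent(0.842, 0.909, higher_is_better=True)
```

Rounded to whole numbers, these values give the 24, 22, and 8% improvements reported for the periodic DENFIS model.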

Figure 9

Scatterplots of observed and predicted pan evaporations by periodic GP, MGGP, and DENFIS models – Antakya station.

Table 6

Comparison of statistical errors for periodic MGGP, GP, DENFIS, and HS in estimation of pan evaporation – Antakya station

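| Model | Training RMSE | Training MAE | Training NSE | Training R2 | Test RMSE | Test MAE | Test NSE | Test R2 |
|---|---|---|---|---|---|---|---|---|
| Periodic MGGP | 0.905 | 0.654 | 0.835 | 0.889 | 0.574 | 0.444 | 0.882 | 0.899 |
| Periodic GP | 0.804 | 0.595 | 0.870 | 0.870 | 0.611 | 0.523 | 0.866 | 0.952 |
| Periodic DENFIS | 0.894 | 0.599 | 0.839 | 0.840 | 0.503 | 0.396 | 0.909 | 0.794 |
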
Figure 10

Scatterplots of observed and predicted pan evaporations by periodic GP, MGGP, and DENFIS models – Antalya station.

In a study by Bruton et al. (2000), neural network models were used for estimating pan evaporation at Rome, Plains, and Watkinsville, Georgia, using climatic data (e.g., temperature, rainfall, solar radiation, RH, and wind speed). R2 was calculated as 0.717 for the most accurate model. In a study by Dogan et al. (2007), feed forward neural networks (FFNNs) and radial basis neural networks (RBNNs) were applied for estimating pan evaporation of Lake Sapanca, Turkey, using climatic data of minimum and maximum temperature, RH, wind speed, real solar period, and maximum solar period. R2 was computed as 0.651 and 0.716 for the best FFNN and RBNN models, respectively. From Tables 4–7, it is apparent that the obtained MGGP, GP, and DENFIS models provide accurate results in estimating the pan evaporation process from the R2 viewpoint.

Table 7

Comparison of statistical errors for periodic MGGP, GP, DENFIS and HS in estimation of pan evaporation – Antalya station

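| Model | Training RMSE | Training MAE | Training NSE | Training R2 | Test RMSE | Test MAE | Test NSE | Test R2 |
|---|---|---|---|---|---|---|---|---|
| Periodic MGGP | 0.571 | 0.437 | 0.950 | 0.950 | 1.089 | 0.885 | 0.846 | 0.958 |
| Periodic GP | 0.695 | 0.543 | 0.926 | 0.926 | 1.386 | 1.139 | 0.751 | 0.950 |
| Periodic DENFIS | 0.715 | 0.565 | 0.922 | 0.925 | 1.219 | 1.006 | 0.807 | 0.935 |
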
The generalization capacity of the applied models with respect to the number of parameters or weights was also investigated. For this, the Akaike information criterion (AIC) defined by Akaike (1974) was used. This criterion is shown in the following equation:
$$\mathrm{AIC} = n \ln(\mathrm{MSE}) + 2k \tag{8}$$
where n is the number of samples in the testing set, MSE is the mean square error, and k is the number of model parameters. This criterion can be successfully used for the evaluation of soft computing models together with their system size (Kisi & Guven 2010). In addition to RMSE, MAE, NSE, and R2, another new criterion was used for evaluation of the models. This criterion combines the RMSE, MAE, and R2 statistics and provides a general evaluation of the applied models similar to the ideal point error (IPE) (Domínguez et al. 2011). This criterion can be seen in the following equation:
$$\mathrm{CA} = 0.33\left(\mathrm{RMSE} + \mathrm{MAE} + \left(1 - R^2\right)\right) \tag{9}$$
where CA is the combined accuracy. The AIC and CA of the applied models are reported in Table 8 for both stations. It is apparent from the table that the MGGP and GP have fewer parameters than DENFIS. Therefore, their AIC values are lower than those of the latter method. It can be said that the DENFIS method has a more complex structure and that more uncertainty exists for this method compared to the others. The MGGP, GP, and DENFIS models have lower CA than the CHS in the test period. Adding the periodicity component generally increases the models' accuracy, while a decrease in accuracy is seen for the DENFIS method. The reason for this may be the high number of parameters, which implies that the DENFIS has a highly complex structure.
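The two criteria can be checked numerically with a short Python sketch. It assumes the forms AIC = n ln(MSE) + 2k and CA = 0.33 (RMSE + MAE + (1 − R2)), which closely reproduce the MGGP values reported for Antakya station in Table 8. The test-set size n = 41 (about 20% of 203 records) is an assumption, since the exact number is not stated.

```python
import math

def aic(n, mse, k):
    """Akaike information criterion for n test samples and k model parameters."""
    return n * math.log(mse) + 2 * k

def combined_accuracy(rmse, mae, r2):
    """Combined accuracy (CA): lower values indicate a better model."""
    return 0.33 * (rmse + mae + (1.0 - r2))

# MGGP at Antakya station: k = 14 and MSE = 0.454 (Table 8),
# test RMSE = 0.674, MAE = 0.533, R2 = 0.948 (Table 4).
mggp_aic = aic(n=41, mse=0.454, k=14)            # n = 41 is an assumed test-set size
mggp_ca = combined_accuracy(0.674, 0.533, 0.948)
```

The penalty term 2k explains why DENFIS, with more than a hundred parameters, has a much larger AIC than MGGP despite a similar MSE.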
Table 8

Test results of the MGGP, GP, DENFIS, and CHS with respect to CA and AIC in estimation of pan evaporation

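| Model | Antakya k | Antakya MSE | Antakya AIC | Antakya CA | Antalya k | Antalya MSE | Antalya AIC | Antalya CA |
|---|---|---|---|---|---|---|---|---|
| MGGP | 14 | 0.454 | −4.40 | 0.415 | | 5.031 | 133 | 1.380 |
| GP | | 0.415 | −32.1 | 0.408 | | 1.888 | 51.1 | 0.840 |
| DENFIS | 138 | 0.440 | 242 | 0.414 | 148 | 1.353 | 317 | 0.714 |
| CHS | | 8.964 | 45.1 | 2.180 | | 1.742 | 23.0 | 2.030 |
| Periodic MGGP | 11 | 0.329 | −23.5 | 0.369 | 12 | 1.186 | 36.1 | 0.665 |
| Periodic GP | | 0.373 | −36.4 | 0.390 | | 1.921 | 52.3 | 0.850 |
| Periodic DENFIS | 161 | 0.253 | 266 | 0.711 | 139 | 1.486 | 306 | 0.756 |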

CONCLUSION

The study assessed the ability of MGGP and DENFIS to model pan evaporation and compared them with GP and the CHS equation. Monthly maximum temperature, minimum temperature, solar radiation, wind speed, RH, and pan evaporation data from the Antakya and Antalya stations in the Mediterranean Region of Turkey were used in the applications. The influence of a periodicity component on the models' prediction accuracy was also examined. Including periodicity in the inputs considerably improved the accuracy of the DENFIS model at Antakya and of the MGGP model at Antalya. The DENFIS model with periodic input outperformed the periodic MGGP and GP models at Antakya station, while the periodic MGGP model provided better accuracy than the periodic DENFIS and GP models at Antalya station. At Antakya, the periodic DENFIS model reduced the RMSE from 0.611 mm for the periodic GP model to 0.503 mm. At Antalya, the periodic MGGP model reduced the RMSE from 1.386 mm for the periodic GP model to 1.089 mm.
The models' complexity was also investigated using AIC, which showed that the DENFIS model has a highly complex structure and a high number of parameters. The main advantage of the MGGP, in addition to its high accuracy, is its very simple structure, which makes it easy to use in practical applications. The CHS model provided the worst accuracy at both stations with respect to the new criterion that combines the RMSE, MAE, and R2 statistics. The results obtained at the two stations differ from each other, especially for the CHS method. The reason for this may be the difference in the statistical characteristics of the two stations. These results indicate that the models' accuracies cannot be generalized (that is, extended to other study cases) without further comparisons using data from different regions.
The MGGP and DENFIS models may be incorporated as modules in general hydrological analysis models.

REFERENCES

Akaike, H. 1974 A new look at the statistical model identification. IEEE Trans. Autom. Control AC-19, 716–723.
Allen, R. G., Pereira, L. S., Raes, D. & Smith, M. 1998 Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements. Irrigation and Drainage Paper No. 56. FAO, Rome, Italy.
Beven, K., Smith, P. & Freer, J. 2008 So just why would a modeller choose to be incoherent? J. Hydrol. 354, 15–32.
Bruton, J. M., McClendon, R. W. & Hoogenboom, G. 2000 Estimating daily pan evaporation with artificial neural networks. Trans. ASABE 43 (2), 491–496.
Cramer, N. 1985 A representation for the adaptive generation of simple sequential programs. In: Proceedings of the First International Conference on Genetic Algorithms and Their Applications (Grefenstette, J. F., ed.). Sheffield, UK.
Dogan, E., Isik, S. & Sandalci, M. 2007 Estimation of daily evaporation using artificial neural networks. Tek Dergi 18 (2), 4119–4131.
Domínguez, E., Dawson, C. W., Ramirez, A. & Abrahart, R. J. 2011 The search for orthogonal hydrological modelling metrics: a case study of 20 monitoring stations in Colombia. J. Hydroinform. 13, 429–442.
Durasevic, M., Jakobovic, D. & Knezevi, K. 2016 Adaptive scheduling on unrelated machines with genetic programming. Appl. Soft Comput. 48, 419–430.
Han, D., Chan, L. & Zhu, N. 2007 Flood forecasting using support vector machines. J. Hydroinform. 9 (4), 267–276.
Hargreaves, G. H. & Samani, Z. A. 1985 Reference crop evapotranspiration from temperature. Appl. Eng. Agric. 1 (2), 96–99.
Heddam, S. & Dechemi, N. 2015 A new approach based on the dynamic evolving neural-fuzzy inference system (DENFIS) for modelling coagulant dosage (Dos): case study of water treatment plant of Algeria. Desalin. Water Treat. 53 (4), 1045–1053.
Kaydani, H., Najafzadeh, M. & Hajizade, A. 2014 A new correlation for calculating carbon dioxide minimum miscibility pressure based on multi-gene genetic programming. J. Natural Gas Sci. Eng. 21, 625–630.
Koza, J. R. 1992 Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA.
Kwin, C. T., Talei, A., Alaghmand, S. & Chua, L. H. C. 2016 Rainfall-runoff modeling using dynamic evolving neural fuzzy inference system with online learning. Proc. Eng. 154, 1103–1109.
Lin, G. F., Lin, H. Y. & Wu, M. C. 2013 Development of a support-vector-machine-based model for daily pan evaporation estimation. Hydrol. Process. 27 (22), 3115–3127.
McMahon, T. A., Peel, M. C., Lowe, L., Srikanthan, R. & McVicar, T. R. 2013 Estimating actual, potential, reference crop and pan evaporation using standard meteorological data: a pragmatic synthesis. Hydrol. Earth Syst. Sci. 17, 1331–1363.
McMahon, T. A., Finlayson, B. L. & Peel, M. C. 2016 Historical developments of models for estimating evaporation using standard meteorological data. Wiley Interdiscip. Rev. Water. doi:10.1002/wat2.1172.
Mustacchi, C., Cena, V. & Rocchi, M. 1979 Stochastic simulation of hourly global radiation sequences. Solar Energy 23 (1), 47–51.
Sanikhani, H., Kisi, O., Nikpour, M. R. & Dinpashoh, Y. 2012 Estimation of daily pan evaporation using two different adaptive neuro-fuzzy computing techniques. Water Resour. Manage. 26 (15), 4347–4365.
Searson, D. P. 2009 GPTIPS: Genetic Programming & Symbolic Regression for MATLAB.
Searson, D. P., Leahy, D. E. & Willis, M. J. 2010 GPTIPS: an open source genetic programming toolbox from multi-gene symbolic regression. In: International Multi Conference of Engineers and Computer Scientist, Hong Kong, 1, pp. 77–80.
Song, Q. & Kasabov, N. 2000 Dynamic evolving neural-fuzzy inference system (DENFIS): on-line learning and application for time-series prediction. In: Proceedings of the 6th International Conference on Soft Computing. Iizuka, Fukuoka, Japan, pp. 696–701.
Sudheer, K. P., Gosain, A. K., Rangan, D. M. & Saheb, S. M. 2002 Modelling evaporation using an artificial neural network algorithm. Hydrol. Process. 16, 3189–3202.
Wang, D., Ding, H., Singh, V. P., Shang, X. S., Liu, D., Wang, Y., Zeng, X., Wu, J., Wang, L. & Zou, X. 2015 A hybrid wavelet analysis–cloud model data-extending approach for meteorologic and hydrologic time series. J. Geophys. Res. 120, 4057–4071.
Yu, P.-S., Chen, S.-T. & Chang, I.-F. 2006 Support vector regression for real-time flood stage forecasting. J. Hydrol. 328 (3–4), 704–716.