## Abstract

Accurately modeling pan evaporation is important in water resources planning and management and in environmental engineering. This study compares the accuracy of two new data-driven methods, the multi-gene genetic programming (MGGP) approach and the dynamic evolving neural-fuzzy inference system (DENFIS), in modeling monthly pan evaporation. Climatic data, namely minimum temperature, maximum temperature, solar radiation, relative humidity, wind speed, and pan evaporation, obtained from the Antakya and Antalya stations in the Mediterranean Region of Turkey were used in the study. The MGGP and DENFIS methods were also compared with genetic programming (GP) and a calibrated version of the Hargreaves–Samani (CHS) empirical method. For Antakya station, GP was slightly more accurate than the MGGP and DENFIS models, and all the data-driven models were superior to the CHS, while the DENFIS performed better than the other models in modeling pan evaporation at Antalya station. The effect of a periodicity input on the models' accuracy was also investigated, and adding periodicity was found to significantly increase the accuracy of the MGGP and DENFIS models.

## INTRODUCTION

Water is one of the most crucial substances for sustaining all living things. Although it is the most abundant substance on Earth, its use and conservation have become very important, which has given sustainable water resource management a central role. Therefore, several fields of science, including hydrology, meteorology, oceanography, and limnology, have been and are being developed. Among them, hydrology deals with all forms of water that are of great importance for humans and their environment. Hydrological modeling has accordingly been developed to include solar radiation modeling, streamflow forecasting, rainfall–runoff modeling, and evapotranspiration estimation.

The hydrological cycle is the common point of all these branches of science. Evaporation, as a basic component of the hydrological cycle, is of vital importance in water resources planning and management and in environmental work. The accurate calculation of evaporation is very important, especially in arid and semi-arid areas (Malik & Kumar 2015). Direct and indirect approaches are generally used for computing and estimating evaporation. Pan evaporation is one of the direct approaches for evaporation calculation (Goyal & Ojha 2011). Indirect approaches are needed for prediction of pan evaporation because installing evaporation pans and taking measurements are highly expensive. By utilizing indirect approaches, pan evaporation can be predicted from climatic variables such as relative humidity (RH), temperature, wind speed, and solar radiation (Kisi *et al.* 2016).

There are numerous practical issues (e.g., post-mining voids or deep lakes, farm dams or shallow lakes, studies investigating the water balance of a catchment, modeling rainfall–runoff, for areas of small irrigation or for the crops irrigated within a large irrigation district) that require the estimates of daily or monthly actual or potential evaporations. All these issues indicate that the daily or monthly evaporation calculations from climatic data or Class-A pan measurements are necessary in most of the practical issues (McMahon *et al.* 2013, 2016).

Over the last several decades, many empirical and physical models have been utilized (Mustacchi *et al.* 1979; Yu *et al.* 2006; Han *et al.* 2007). Since the underlying processes are nonlinear, complex, temporally and spatially varying, incidental, and unsteady, it is very difficult to model all the physical processes. Later, strong alternative modeling techniques based on artificial intelligence (AI) were developed. These AI models performed better than traditional modeling techniques, especially when dealing with large, noisy data sets.

In the last decades, soft computational approaches have been successfully utilized for prediction of pan evaporation (Sudheer *et al.* 2002; Kisi 2006, 2015a; Kim & Kim 2008; Shiri *et al.* 2011; Sanikhani *et al.* 2012; Lin *et al.* 2013; Malik & Kumar 2015; Keshtegar *et al.* 2016; Kisi *et al.* 2016). Sudheer *et al.* (2002) predicted daily pan evaporation of Dowleswaram in Andhra Pradesh, India by using artificial neural networks (ANNs) and showed that the ANNs could be used in modeling the evaporation process from the available weather variables. Kisi (2006) investigated the ability of neuro-fuzzy system in estimating pan evaporation of two automated weather stations, Arcata-Eureka and Daggett stations, California, operated by the US Environmental Protection Agency and compared it with ANN. He found the neuro-fuzzy system to be better than the ANN in modeling daily pan evaporation. Kim & Kim (2008) developed a generalized regression neural networks model with genetic algorithm for prediction of pan evaporation and the alfalfa reference evapotranspiration, in the Republic of Korea and obtained promising results. Shiri *et al.* (2011) used an adaptive neuro-fuzzy inference system (ANFIS) and ANN in prediction of daily pan evaporations of three weather stations, Illinois, USA based on air temperature, wind speed, solar radiation, RH, total rainfall, and surface soil temperature inputs. Sanikhani *et al.* (2012) applied two ANFIS systems in predicting daily pan evaporations of San Francisco and San Diego, in California, USA and compared them with the ANN method. ANFIS provided better estimates when compared to ANN. Lin *et al.* (2013) compared the support vector machine (SVM) and ANN in prediction of daily pan evaporation. According to their results, the SVM approach was more appropriate than the ANN with respect to accuracy and efficiency. 
Malik & Kumar (2015) predicted daily pan evaporation of Pantnagar, located in the foothills of the Himalayas in Uttarakhand state of India using ANN and co-active ANFIS and found ANN to be better than the latter. Kisi (2015a) used least square support vector machine (LSSVM), multivariate adaptive regression splines, and M5 model tree in prediction of pan evaporation and LSSVM was found to perform better than the others using local climatic data. Kisi *et al.* (2016) predicted pan evaporation of Ankara and Polatli stations, Turkey, using classification and regression tree, chi-squared automatic interaction detector, and ANNs and comparison of the methods indicated that the ANNs performed better than the others. Keshtegar *et al.* (2016) used conjugate gradient optimization method for daily pan evaporation prediction and compared it with ANFIS and model tree approaches. The conjugate gradient-based model performed better than the other models. Most of the approaches used in previous studies are based on black box methods in which their formulation is not explicit and cannot be easily used by practical applications. Therefore, in the present study, a multi-gene genetic programming (MGGP)-based model, which had explicit formulation and could be easily used in practice, was developed for prediction of pan evaporation.

There are limited studies in the literature related to the application of the dynamic evolving neural-fuzzy inference system (DENFIS) in water resources and hydrology (Heddam 2014; Heddam & Dechemi 2015; Kwin *et al.* 2016). DENFIS was applied for modeling dissolved oxygen concentration by Heddam (2014). The study demonstrated the superiority of the DENFIS over the multiple linear regression and ANN models. In a study by Heddam & Dechemi (2015), coagulant dosage in a water treatment plant in Algeria was modeled using DENFIS and satisfactory results were obtained. In a study by Kwin *et al.* (2016), DENFIS was used for rainfall–runoff modeling and compared with the Hydrologic Engineering Center-Hydrologic Modeling System (HEC-HMS) and an autoregressive model with exogenous inputs (ARX). It was found that DENFIS estimates were comparable to HEC-HMS and superior to the ARX model. To the best of our knowledge, there is no published study in the literature related to the application of DENFIS in modeling the pan evaporation process.

The main aims of this study are: (1) to investigate the accuracy of the MGGP and DENFIS approaches in prediction of pan evaporation; (2) to compare their accuracy with genetic programming (GP) and the calibrated Hargreaves–Samani (HS) equation; and (3) to investigate the effect of a periodicity input on the models' accuracy.

## METHODS

### Genetic programming

GP is a computational technique used for solving different kinds of engineering problems (Kaydani *et al.* 2014). It is inspired by natural selection and is an extension of genetic algorithms. It was first introduced by Cramer (1985) and then developed by Koza (1992). Candidate solutions are represented as tree structures (Figure 1). Figure 1 illustrates an example GP binary tree for the function (x/3 + (−y)). The tree has two types of nodes, ‘functional’ nodes and ‘terminals’. Functional nodes hold arithmetical, logical, and other functions. Terminals hold variables and constants (Durasevic *et al.* 2016).
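The tree representation described above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding, the `FUNCTIONS` table, and the `evaluate` helper are hypothetical names chosen here, not part of any GP software used in the study.

```python
import operator

# Functional nodes: symbol -> (arity, callable). Terminals are variable
# names (strings) or numeric constants. All names here are illustrative.
FUNCTIONS = {
    "+": (2, operator.add),
    "/": (2, operator.truediv),
    "neg": (1, operator.neg),
}


def evaluate(node, variables):
    """Recursively evaluate a GP tree encoded as nested tuples."""
    if isinstance(node, tuple):
        symbol, *children = node
        arity, func = FUNCTIONS[symbol]
        if len(children) != arity:
            raise ValueError(f"{symbol} expects {arity} argument(s)")
        return func(*(evaluate(child, variables) for child in children))
    if isinstance(node, str):
        return variables[node]  # terminal: variable
    return node  # terminal: constant


# Tree for the example function x/3 + (-y)
TREE = ("+", ("/", "x", 3), ("neg", "y"))

result = evaluate(TREE, {"x": 6.0, "y": 1.5})
print(result)
```

With x = 6 and y = 1.5 the tree evaluates 6/3 + (−1.5). Crossover and mutation would operate on this structure by swapping or replacing sub-tuples.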

GP algorithms have three genetic operations: reproduction, mutation, and crossover. The algorithm evaluates the individuals and selects individuals for the genetic operations. Applying these operations to the current population produces a new set of individuals, which forms the next generation. In reproduction, selected individuals are carried over to the new generation in place of the existing one. In crossover, a branch is randomly selected in each of two individuals and the branches are swapped. In mutation, a terminal or functional node is selected from a tree and mutated. The goal is to obtain the individual with the best fitness value in the population, which is returned as the result of GP.

### Multi-gene genetic programming

In the MGGP fitness measure, *G*_{i} is the value of the *i*^{th} data sample predicted by MGGP, *A*_{i} is the actual value of the *i*^{th} data sample, and *N* is the number of training samples (Garg *et al.* 2014).

MGGP is derived from GP and is used for building empirical mathematical models. An MGGP model is a weighted linear combination of GP trees, called genes, plus a bias term; the coefficients S_{0}, S_{1}, S_{2}, and S_{3} are estimated by the least squares method. A typical MGGP formulation is shown in Figure 2, in which the model predicts the output value from two input variables, x_{1} and x_{2}. In this figure, Gene 1, Gene 2, and Gene 3 express the functions cos(x_{1})/x_{2} + (−5), x_{2} + x_{1}*2/(−5), and (x_{1} − 5)*(x_{2} + 3), respectively. The maximum number of genes (G_{max}) and the maximum tree depth (d_{max}) are specified by the user to control the complexity of the MGGP model. The G_{max} and d_{max} parameters influence the size and the number of models to be searched in the global space. Thus, there are ideal values of G_{max} and d_{max} which generate a comparatively compact model (Searson *et al.* 2010).
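The way the gene weights are obtained can be illustrated with a short NumPy sketch. It takes the three example genes quoted for Figure 2 as fixed and estimates S_{0} to S_{3} by ordinary least squares. The function names, the synthetic data, and the "true" weights are assumptions made purely for illustration; in a real MGGP run the genes themselves are evolved.

```python
import numpy as np


def gene_outputs(x1, x2):
    """Evaluate the three example genes; one column per gene."""
    g1 = np.cos(x1) / x2 + (-5)
    g2 = x2 + x1 * 2 / (-5)
    g3 = (x1 - 5) * (x2 + 3)
    return np.column_stack([g1, g2, g3])


def fit_gene_weights(x1, x2, y):
    """Least-squares estimate of the bias S0 and the gene weights S1..S3."""
    design = np.column_stack([np.ones(len(y)), gene_outputs(x1, x2)])
    weights, *_ = np.linalg.lstsq(design, y, rcond=None)
    return weights


def predict(weights, x1, x2):
    """Model output: S0 + S1*Gene1 + S2*Gene2 + S3*Gene3."""
    return weights[0] + gene_outputs(x1, x2) @ weights[1:]


# Synthetic, noise-free data generated from known (made-up) weights.
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 4.0, size=50)
x2 = rng.uniform(1.0, 3.0, size=50)
true_weights = np.array([1.0, 0.5, -2.0, 0.25])
y = predict(true_weights, x1, x2)

estimated = fit_gene_weights(x1, x2, y)
print(estimated)
```

Because the model is linear in the weights, the fit has a closed-form solution even though each gene is a nonlinear function of the inputs.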

The initial population in MGGP is built by creating individuals that consist of GP trees with randomly generated genes. During an MGGP run, genes are added or deleted by a two-point high-level crossover operator, which allows genes to be exchanged between individuals.

A gene is randomly selected from each parent. Then, the standard subtree crossover operator is applied and the resulting trees replace the parent trees. The probability of the crossover operator is defined by the user (Gandomi & Alavi 2012). The MGGP algorithm follows these steps:

Step 1: Define the problem.

Step 2: Set initial parameters (population size, maximum number of genes, generations, etc.).

Step 3: Build a model using the least squares method.

Step 4: Evaluate performance of models (based on SRMSE).

Step 5: Apply genetic operators and construct a new population.

Step 6: Evaluate performance of individuals in the new population.

Step 7: Stop if the termination criteria are satisfied. Otherwise, go to Step 5.

Detailed information about MGGP can be obtained from previous studies (Searson 2009; Searson *et al.* 2010).

### Dynamic evolving neural-fuzzy inference system

DENFIS is a new type of fuzzy inference system proposed by Song & Kasabov (2000) for adaptive offline and online learning. DENFIS uses and improves dynamic features of evolving fuzzy neural network (EFuNN) so that DENFIS becomes suitable for online adaptive systems.

DENFIS model uses evolving clustering method (ECM) and evolving clustering method with constrained minimization (ECMc) for partitioning the input space. In DENFIS, the rules are created and updated in this partitioning. ECM is an online evolving fast clustering method that is based on maximum distance. By using the maximum distance between a point and a cluster center ECM estimates the number of clusters dynamically.
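The distance-threshold idea behind ECM can be sketched as a one-pass clustering loop. The code below is a simplified illustration based on the general description above, not the DENFIS implementation used in the study: the function name `ecm`, the Euclidean distance, and the exact update rule are assumptions, and ECMc and fuzzy rule creation are omitted.

```python
import numpy as np


def ecm(points, dthr):
    """Simplified one-pass, maximum-distance clustering in the spirit of ECM.

    points: array of shape (n_samples, n_features)
    dthr:   distance threshold bounding the cluster radius
    Returns (centers, radii).
    """
    points = np.asarray(points, dtype=float)
    centers = [points[0].copy()]
    radii = [0.0]
    for x in points[1:]:
        dists = np.array([np.linalg.norm(x - c) for c in centers])
        r = np.array(radii)
        # Case 1: the sample already falls inside an existing cluster.
        if np.any(dists <= r):
            continue
        # Case 2: pick the cluster that would need the smallest enlargement.
        s = dists + r
        a = int(np.argmin(s))
        if s[a] > 2.0 * dthr:
            # Too far from every cluster: start a new one at the sample.
            centers.append(x.copy())
            radii.append(0.0)
        else:
            # Enlarge cluster a and shift its center toward the sample so
            # that the sample lies on the new boundary.
            new_radius = s[a] / 2.0
            direction = (centers[a] - x) / dists[a]
            centers[a] = x + direction * new_radius
            radii[a] = new_radius
    return np.array(centers), np.array(radii)


centers, radii = ecm([[0.0], [0.1], [5.0]], dthr=0.2)
print(centers.ravel(), radii)
```

A smaller `dthr` produces more clusters (and, in DENFIS, more fuzzy rules), which is why the threshold has to be tuned for each data set.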

After clustering, Takagi–Sugeno fuzzy inference and rules are used in the DENFIS model. DENFIS uses a dynamically formed fuzzy inference system in which the *m* most highly activated fuzzy rules are used to calculate the output, depending on the position of the input vector in the input space. The fuzzy rules are dynamically chosen from the current fuzzy rule set for each new input vector, and they are also dynamically created and updated during the learning process.

In general, presenting a new data vector to the system may result in updating the existing rules, and if a new cluster is formed a new fuzzy rule is created. The generated rules are optimized by the back-propagation method. The final output for each prediction is derived from the dynamically chosen most important rules.

where *i* = 1, 2, …, *n* and *j* = 1, 2, …, *m*.

## APPLICATION AND RESULTS

The monthly maximum temperature (T_{max}), minimum temperature (T_{min}), solar radiation (R_{s}), wind speed (W_{s}), RH, and pan evaporation data from the Antakya (latitude 36.33 °N, longitude 36.30 °E, altitude 100 m) and Antalya (latitude 36.89 °N, longitude 30.68 °E, altitude 47 m) stations located in the Mediterranean Region of Turkey were used in the present study. The data were obtained from the Turkish State Meteorological Service (TSMO). More information about data measurements is available from the TSMO website (https://www.mgm.gov.tr/eng/forecast-5days.aspx?g=0). The climate of this region has cool, rainy winters and moderately dry and hot summers. Yearly rainfall ranges from 580 to 1,300 mm. For the Antakya and Antalya stations, 203 (from 1983 to 2010) and 362 (from 1967 to 2006) monthly data were available, respectively. It should be noted that an evaluation of data uncertainty and its effect on the results (Beven *et al.* 2008; Wang *et al.* 2015) was not considered in this study. For each station, the first 80% of the whole data was used for training and the remaining 20% was used for testing the obtained models. The brief statistical properties of the data sets are reported in Table 1. In this table, x_{mean}, x_{min}, x_{max}, S_{x}, and C_{sx} indicate the mean, minimum, maximum, standard deviation, and skewness, respectively. It is clear from the table that the data of the two stations have different distributions. Pan evaporation ranges are 0.9–9.8 mm and 1.3–12.4 mm for the Antakya and Antalya stations, respectively, and the data of the latter station show a more skewed distribution (C_{sx} = 0.54).
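The chronological 80/20 split and the summary statistics of Table 1 can be sketched as follows. The function names, the rounding rule for the split point, and the moment-based skewness formula are assumptions for illustration; the study does not state which exact estimators were used.

```python
import numpy as np


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into a leading training part and a trailing test part."""
    series = np.asarray(series, dtype=float)
    n_train = int(round(train_fraction * len(series)))  # assumed rounding rule
    return series[:n_train], series[n_train:]


def summary_statistics(series):
    """Mean, minimum, maximum, standard deviation, and skewness of a series."""
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    centered = x - mean
    # Moment-based skewness (assumed estimator).
    skew = np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5
    return {
        "x_mean": mean,
        "x_min": x.min(),
        "x_max": x.max(),
        "S_x": x.std(ddof=1),
        "C_sx": skew,
    }


# Placeholder series with the same length as the Antakya record (203 months).
dummy = np.arange(203, dtype=float)
train, test = chronological_split(dummy)
print(len(train), len(test))
print(summary_statistics([1.0, 2.0, 3.0]))
```

Keeping the split chronological, rather than random, means the models are always tested on months that follow the training period.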

Table 1. Statistical properties of the data sets

| Station | Variable | x_{mean} | x_{min} | x_{max} | S_{x} | C_{sx} |
|---|---|---|---|---|---|---|
| Antakya | R_{s} (Langley) | 107 | 30 | 179 | 39.8 | −0.25 |
| | T_{max} (°C) | 24.1 | 10 | 32.2 | 5.67 | −0.80 |
| | T_{min} (°C) | 16.2 | 0.6 | 28.3 | 7.55 | −0.27 |
| | RH (%) | 69.1 | 30.8 | 80.2 | 5.03 | −0.49 |
| | W_{s} (m/s) | 3.45 | 1.3 | 7.1 | 1.26 | 0.67 |
| | Pan evaporation (mm) | 4.59 | 0.9 | 9.8 | 2.15 | −0.03 |
| Antalya | R_{s} (Langley) | 119 | 33.3 | 215 | 47.3 | −0.12 |
| | T_{max} (°C) | 23.0 | 11.1 | 37.2 | 6.90 | 0.12 |
| | T_{min} (°C) | 14.1 | 0.6 | 27.4 | 7.31 | 0.16 |
| | RH (%) | 63.3 | 42.2 | 80.4 | 6.98 | −0.13 |
| | W_{s} (m/s) | 2.92 | 1.7 | 5.8 | 0.65 | 1.00 |
| | Pan evaporation (mm) | 5.31 | 1.3 | 12.4 | 2.61 | 0.54 |

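The root mean square error (RMSE), mean absolute error (MAE), Nash–Sutcliffe efficiency (NSE), and determination coefficient (R^{2}) were used for evaluation of the applied models. The RMSE, MAE, and NSE can be expressed as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(PE_{O,i} - PE_{M,i}\right)^{2}}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|PE_{O,i} - PE_{M,i}\right|$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(PE_{O,i} - PE_{M,i}\right)^{2}}{\sum_{i=1}^{n}\left(PE_{O,i} - \overline{PE}_{O}\right)^{2}}$$

where *PE*_{O,i} and *PE*_{M,i} are the observed and estimated pan evaporations, *n* is the number of time steps, and $\overline{PE}_{O}$ is the mean of the observed pan evaporation.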
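The evaluation statistics can be computed directly with NumPy. The sketch below uses the standard definitions of RMSE, MAE, and NSE, and takes R^{2} as the squared Pearson correlation between observed and estimated values, which is an assumption since the text does not spell out its R^{2} formula. The sample values are made up.

```python
import numpy as np


def rmse(observed, estimated):
    """Root mean square error."""
    o, m = np.asarray(observed, float), np.asarray(estimated, float)
    return float(np.sqrt(np.mean((o - m) ** 2)))


def mae(observed, estimated):
    """Mean absolute error."""
    o, m = np.asarray(observed, float), np.asarray(estimated, float)
    return float(np.mean(np.abs(o - m)))


def nse(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    o, m = np.asarray(observed, float), np.asarray(estimated, float)
    return float(1.0 - np.sum((o - m) ** 2) / np.sum((o - o.mean()) ** 2))


def r_squared(observed, estimated):
    """Squared Pearson correlation (assumed definition of R^2)."""
    o, m = np.asarray(observed, float), np.asarray(estimated, float)
    return float(np.corrcoef(o, m)[0, 1] ** 2)


obs = [1.0, 2.0, 3.0, 4.0]   # made-up observed pan evaporation
est = [1.1, 1.9, 3.2, 3.8]   # made-up model estimates
print(rmse(obs, est), mae(obs, est), nse(obs, est), r_squared(obs, est))
```

Note that NSE can be negative when a model is worse than simply predicting the observed mean, whereas R^{2} defined this way only measures linear association and ignores bias.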

In the Hargreaves–Samani (HS) equation, *ET*_{0} = reference evapotranspiration (mm day^{−1}); *T*_{max} and *T*_{min} = maximum and minimum temperature (°C); and *R*_{a} = extraterrestrial radiation (mm day^{−1}). In this study, *R*_{a} was calculated as described by Allen *et al.* (1998). The HS equation was calibrated for estimating pan evaporation by using the data used for the training of the AI models. Rahimikhoob (2009) also previously used this equation for pan evaporation modeling. The optimal *a*, *b*, and *c* parameters given in Equation (7) were calculated by using a genetic algorithm. Mutation rate, population size, and error were set to 0.075, 100, and 0.0001, respectively. Thus, the calibrated HS (CHS) estimates were compared with the AI methods.
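A hedged sketch of the HS calculation and of a calibration objective is given below. It assumes that *a*, *b*, and *c* take the places of the standard HS constants 0.0023, 17.8, and 0.5; the exact form of Equation (7) is not reproduced in this text, so this parameterization is an assumption. The function names and the sample inputs are illustrative, and the optimizer itself (a genetic algorithm in the study) is not shown.

```python
import numpy as np


def hargreaves_samani(t_max, t_min, r_a, a=0.0023, b=17.8, c=0.5):
    """HS estimate (mm/day) from temperatures (deg C) and extraterrestrial radiation (mm/day).

    With the default a, b, c this is the standard HS equation; other values
    give a calibrated variant (assumed parameterization).
    """
    t_max = np.asarray(t_max, dtype=float)
    t_min = np.asarray(t_min, dtype=float)
    r_a = np.asarray(r_a, dtype=float)
    t_mean = (t_max + t_min) / 2.0
    return a * r_a * (t_mean + b) * (t_max - t_min) ** c


def calibration_rmse(params, t_max, t_min, r_a, observed):
    """Objective a calibration routine could minimize over (a, b, c)."""
    a, b, c = params
    estimate = hargreaves_samani(t_max, t_min, r_a, a, b, c)
    return float(np.sqrt(np.mean((np.asarray(observed, float) - estimate) ** 2)))


# Made-up inputs for a single month.
value = hargreaves_samani(30.0, 20.0, 10.0)
print(float(value))
```

Any global optimizer, including a genetic algorithm, could be used to minimize `calibration_rmse` over the training data.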

Similar parameters were set for the GP and MGGP methods, as given in Table 2. Figure 3 illustrates the variation of the mean and best fitness with the generation number and the statistical characteristics of the evolved MGGP model (training data) for Antakya station. As clearly seen from Figure 3(a), the weights of the bias term and Gene 2 are higher than those of the other genes. Figure 3(b) gives the significance of each gene in terms of p values. Low p values in this figure indicate that the contribution of the genes to explaining the variation in pan evaporation is very high. Figure 4 shows the populations of the evolved models in terms of their node numbers, which indicate complexity, together with their fitness values. From this figure, the best model according to complexity (less complex) and population (less population) can be determined. A big black circle in the figure indicates the best model with respect to population. A light grey circle shows the models which are not strongly dominated by other models in the population in terms of model complexity and fitness. The obtained MGGP equation is given in Table 3. For the DENFIS models, selection of the distance threshold value (Dthr) is very important (Heddam & Dechemi 2015). In this study, various Dthr values were tried to find the optimal one. The optimal Dthr value obtained for the DENFIS model is 0.02 for Antakya station.

Table 2. Parameter settings for the GP and MGGP methods

| Parameter | Settings |
|---|---|
| Function set | +, −, ×, /, √, exp, ln |
| Population size | 100 |
| Number of generations | 100 |
| Maximum number of genes allowed in an individual | 8 |
| Maximum tree depth | 4 |
| Tournament size | 12 |
| Elitism | 0.01% of population |
| Probability of GP tree mutation | 0.1 |
| Probability of GP tree crossover | 0.85 |

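Table 3. Equations of the optimal MGGP models for the Antakya and Antalya stations (equations not recoverable from the source). T_{max}, T_{min}, R_{s}, W_{s}, and RH indicate the maximum temperature, minimum temperature, solar radiation, wind speed, and relative humidity, respectively. *iflte*(A,B,C,D) means if A ≤ B then C else D on an element-by-element basis.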
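The element-wise conditional described above maps naturally onto `numpy.where`. The following is a minimal sketch of that operator; the Python function name mirrors the one in the text, and the sample arrays are made up.

```python
import numpy as np


def iflte(a, b, c, d):
    """Element-wise: return c where a <= b, otherwise d."""
    return np.where(np.asarray(a) <= np.asarray(b), c, d)


out = iflte([1, 5, 3], [2, 2, 3], [10, 10, 10], [20, 20, 20])
print(out)
```

Scalars broadcast as well, so `iflte(x, 0.0, 0.0, x)` would clip negative values of an array `x` to zero.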

Training and test results of the optimal MGGP, GP, DENFIS, and CHS models for Antakya station are compared in Table 4. In the test period, the GP model performs slightly better than the MGGP and DENFIS models with respect to RMSE, MAE, and NSE, while the MGGP has the highest R^{2}. Training results show that the MGGP model approximates pan evaporation better than the GP model. In the test stage, the MGGP, GP, and DENFIS estimates are close to each other. All the AI models have better accuracy than the CHS empirical equation. Figure 5 illustrates the scatterplot comparison of the applied models in estimating pan evaporation. It is apparent that the MGGP provides less scattered estimates, with a slope and bias closer to 1 and 0, respectively, than those of the GP and DENFIS.

Table 4. Training and test results of the models for Antakya station

| Model | Training RMSE | Training MAE | Training NSE | Training R^{2} | Test RMSE | Test MAE | Test NSE | Test R^{2} |
|---|---|---|---|---|---|---|---|---|
| MGGP | 0.712 | 0.514 | 0.898 | 0.910 | 0.674 | 0.533 | 0.837 | 0.948 |
| GP | 0.790 | 0.594 | 0.875 | 0.875 | 0.644 | 0.495 | 0.851 | 0.903 |
| DENFIS | 0.874 | 0.560 | 0.863 | 0.864 | 0.663 | 0.510 | 0.842 | 0.919 |
| CHS | 3.107 | 2.490 | −0.63 | 0.002 | 2.994 | 2.469 | −0.65 | 0.493 |

The variation of the fitness with generations and the statistical properties of the MGGP model in training are illustrated in Figure 6(a) for Antalya station. From the figure, it is apparent that Gene 3 has the highest weight, followed by the bias and Gene 7. The other genes have considerably lower weights. The significance of each gene is provided in Figure 6(b). As for Antakya station, the contribution of the genes to explaining the variation in pan evaporation is significantly high, as shown by their very low p values. The populations of the evolved models are shown in Figure 7. As can be seen, the best model with respect to complexity and population has 46 nodes and is shown by a big black circle. Table 3 provides the equations of the optimal MGGP model for both stations.

Table 5 compares the MGGP, GP, DENFIS, and CHS models in estimating pan evaporation of Antalya station. From the results, it is clear that the DENFIS has the best accuracy in both training and test periods. CHS is also better than the GP in the test period with respect to RMSE and NSE statistics. Figure 8 demonstrates scatterplots for the test results of the optimal models. It is apparent from the graphs that the DENFIS performs better than the other models with respect to data dispersion.

Table 5. Training and test results of the models for Antalya station

| Model | Training RMSE | Training MAE | Training NSE | Training R^{2} | Test RMSE | Test MAE | Test NSE | Test R^{2} |
|---|---|---|---|---|---|---|---|---|
| MGGP | 1.311 | 1.016 | 0.736 | 0.931 | 2.243 | 1.869 | 0.347 | 0.929 |
| GP | 0.727 | 0.564 | 0.919 | 0.919 | 1.374 | 1.116 | 0.755 | 0.945 |
| DENFIS | 0.707 | 0.545 | 0.923 | 0.925 | 1.163 | 0.944 | 0.824 | 0.943 |
| CHS | 1.000 | 5.014 | 0.846 | 0.864 | 1.320 | 4.767 | 0.775 | 0.925 |

Sanikhani & Kisi (2012) and Kisi (2015b) examined the effect of periodicity in forecasting monthly streamflows by including a periodicity component (month number) as input to AI-based models and reported that it significantly increased the models' accuracy. Therefore, the effect of periodicity on the models' accuracy was also investigated here by adding the month number of the year as input to the models for each data set. The same parameters were set for the MGGP and GP methods, as provided in Table 2. The comparison results of the periodic models are provided in Table 6 for Antakya station. It is apparent from the table that the periodic DENFIS model performs better than the periodic MGGP and GP models. MGGP also performs better than the GP in estimation of pan evaporation. Comparison with Table 4 clearly indicates that adding the periodicity input increases the models' performance in estimating pan evaporation. The periodic DENFIS model improves the test RMSE, MAE, and NSE of the DENFIS by 24, 22, and 8%, respectively. The results are graphically compared in Figure 9. It can be seen from the fit line equation that the DENFIS estimates are closer to the exact line (*y* = *x*). Comparison with Figure 5 indicates that the periodicity input increases the accuracy of the DENFIS and GP methods, while the MGGP accuracy decreases in estimating monthly pan evaporation.

Table 7 compares the training and test results of the periodic models for Antalya station. Unlike for Antakya, here the periodic MGGP provides better accuracy than the GP and DENFIS models with respect to various statistics. Comparison with Table 5 shows that including the periodicity component as input considerably increases the accuracy of the MGGP. The periodic MGGP model improves the test RMSE, MAE, and NSE of the MGGP by 51, 53, and 144%, respectively. The periodic models' estimates are shown in Figure 10 for Antalya station.
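The periodicity input and the reported percentage improvements can be reproduced with a short sketch. The helper names are hypothetical, and `add_month_input` assumes a gap-free monthly record that starts in the month given by `first_month`, which may not hold for the actual station records. The before and after values are the test statistics listed in the tables of this section.

```python
import numpy as np


def add_month_input(features, first_month=1):
    """Append the month number (1-12) as an extra input column.

    Assumes consecutive monthly rows with no gaps (an assumption).
    """
    features = np.asarray(features, dtype=float)
    months = (np.arange(len(features)) + first_month - 1) % 12 + 1
    return np.column_stack([features, months])


def relative_change_percent(before, after):
    """Size of the change relative to the non-periodic value, in percent."""
    return abs(after - before) / abs(before) * 100.0


# DENFIS at Antakya, test period: non-periodic vs periodic (RMSE, MAE, NSE).
denfis = [relative_change_percent(b, a)
          for b, a in [(0.663, 0.503), (0.510, 0.396), (0.842, 0.909)]]
# MGGP at Antalya, test period: non-periodic vs periodic (RMSE, MAE, NSE).
mggp = [relative_change_percent(b, a)
        for b, a in [(2.243, 1.089), (1.869, 0.885), (0.347, 0.846)]]

print([round(v) for v in denfis])
print([round(v) for v in mggp])
```

For RMSE and MAE the improvement is a decrease, and for NSE it is an increase, so the helper reports the magnitude of the change in both cases.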

Table 6. Training and test results of the periodic models for Antakya station

| Model | Training RMSE | Training MAE | Training NSE | Training R^{2} | Test RMSE | Test MAE | Test NSE | Test R^{2} |
|---|---|---|---|---|---|---|---|---|
| Periodic MGGP | 0.905 | 0.654 | 0.835 | 0.889 | 0.574 | 0.444 | 0.882 | 0.899 |
| Periodic GP | 0.804 | 0.595 | 0.870 | 0.870 | 0.611 | 0.523 | 0.866 | 0.952 |
| Periodic DENFIS | 0.894 | 0.599 | 0.839 | 0.840 | 0.503 | 0.396 | 0.909 | 0.794 |

In a study by Bruton *et al.* (2000), neural network models were used for estimating pan evaporation at Rome, Plains, and Watkinsville, Georgia, using climatic data (e.g., temperature, rainfall, solar radiation, RH, and wind speed). R^{2} was calculated as 0.717 for the most accurate model. In a study by Dogan *et al.* (2007), feed forward neural networks (FFNNs) and radial basis neural networks (RBNNs) were applied for estimating pan evaporation of Lake Sapanca, Turkey, using climatic data of minimum and maximum temperature, RH, wind speed, real solar period, and maximum solar period. R^{2} was computed as 0.651 and 0.716 for the best FFNN and RBNN models, respectively. From Tables 4–7, it is apparent that the obtained MGGP, GP, and DENFIS models provide accurate results in estimating the pan evaporation process from the R^{2} viewpoint.

Table 7. Training and test results of the periodic models for Antalya station

| Model | Training RMSE | Training MAE | Training NSE | Training R^{2} | Test RMSE | Test MAE | Test NSE | Test R^{2} |
|---|---|---|---|---|---|---|---|---|
| Periodic MGGP | 0.571 | 0.437 | 0.950 | 0.950 | 1.089 | 0.885 | 0.846 | 0.958 |
| Periodic GP | 0.695 | 0.543 | 0.926 | 0.926 | 1.386 | 1.139 | 0.751 | 0.950 |
| Periodic DENFIS | 0.715 | 0.565 | 0.922 | 0.925 | 1.219 | 1.006 | 0.807 | 0.935 |

The complexity of the models was evaluated with the Akaike information criterion (AIC), in which *n* is the number of samples in the testing set, *MSE* is the mean square error, and *k* is the number of model parameters. This criterion can be successfully used for the evaluation of soft computing models together with their system size (Kisi & Guven 2010). In addition to RMSE, MAE, NSE, and R^{2}, another new criterion, the combined accuracy (*CA*), was used for evaluation of the models. This criterion combines the RMSE, MAE, and R^{2} statistics and provides a general evaluation of the applied models similar to the ideal point error (IPE) (Domínguez *et al.* 2011). The AIC and CA of the applied models are reported in Table 8 for both stations. It is apparent from the table that the MGGP and GP have fewer parameters than the DENFIS. Therefore, their AIC values are lower than those of the latter method. It can be said that the DENFIS method has a more complex structure and that more uncertainty exists for this method compared to the others. The MGGP, GP, and DENFIS models have lower CA than the CHS in the test period. Adding the periodicity component generally increases the models' accuracy, while an accuracy decrement is seen for the DENFIS method. The reason for this may be the high number of parameters, which implies that the DENFIS has a highly complex structure.
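The two criteria can be illustrated with a small sketch. The functional forms below are assumptions, since the original equations are not reproduced in this text: AIC is taken as n·ln(MSE) + 2k, and CA as 0.33·(RMSE + MAE + (1 − R^{2})). With n = 41 (about 20% of the 203 Antakya records, also an assumption), these forms reproduce the MGGP and GP entries for Antakya in Table 8 to rounding, but they do not match every entry in that table.

```python
import math


def aic(n, mse, k):
    """Akaike information criterion in the assumed form n*ln(MSE) + 2k."""
    return n * math.log(mse) + 2 * k


def combined_accuracy(rmse, mae, r2):
    """Combined accuracy in the assumed form 0.33*(RMSE + MAE + (1 - R^2))."""
    return 0.33 * (rmse + mae + (1.0 - r2))


# Antakya test period; k and MSE from Table 8, RMSE/MAE/R^2 from Table 4.
n_test = 41  # assumed size of the Antakya test set
print(round(aic(n_test, 0.454, 14), 2))                   # MGGP
print(round(aic(n_test, 0.415, 2), 2))                    # GP
print(round(combined_accuracy(0.674, 0.533, 0.948), 3))   # MGGP
print(round(combined_accuracy(0.644, 0.495, 0.903), 3))   # GP
```

The 2k term is what penalizes DENFIS so heavily: with well over a hundred parameters its AIC is large even when its MSE is comparable to that of the other models.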

Table 8. Number of parameters (k), MSE, AIC, and CA of the applied models for both stations

| Model | Antakya k | Antakya MSE | Antakya AIC | Antakya CA | Antalya k | Antalya MSE | Antalya AIC | Antalya CA |
|---|---|---|---|---|---|---|---|---|
| MGGP | 14 | 0.454 | −4.40 | 0.415 | 9 | 5.031 | 133 | 1.380 |
| GP | 2 | 0.415 | −32.1 | 0.408 | 3 | 1.888 | 51.1 | 0.840 |
| DENFIS | 138 | 0.440 | 242 | 0.414 | 148 | 1.353 | 317 | 0.714 |
| CHS | 3 | 8.964 | 45.1 | 2.180 | 3 | 1.742 | 23.0 | 2.030 |
| Periodic MGGP | 11 | 0.329 | −23.5 | 0.369 | 12 | 1.186 | 36.1 | 0.665 |
| Periodic GP | 2 | 0.373 | −36.4 | 0.390 | 3 | 1.921 | 52.3 | 0.850 |
| Periodic DENFIS | 161 | 0.253 | 266 | 0.711 | 139 | 1.486 | 306 | 0.756 |

## CONCLUSION

The study compared the ability of MGGP and DENFIS in modeling pan evaporation and compared them with GP and the CHS equation. The monthly maximum temperature, minimum temperature, solar radiation, wind speed, RH, and pan evaporation data from the Antakya and Antalya stations in the Mediterranean Region of Turkey were used in the applications. The influence of the periodicity component on the models' prediction accuracy was also examined. Including periodicity in the inputs considerably improved the accuracy of the DENFIS and MGGP models at the Antakya and Antalya stations, respectively. The DENFIS model with periodic input was superior to the periodic MGGP and GP models at Antakya station, while the periodic MGGP model provided better accuracy than the periodic DENFIS and GP models at Antalya station. At Antakya station, the periodic DENFIS model reduced the RMSE from 0.611 mm for the periodic GP model to 0.503 mm. For Antalya, the periodic MGGP reduced the RMSE from 1.386 mm for the periodic GP model to 1.089 mm. The models' complexity was also investigated by using AIC, and it was seen that the DENFIS model has a highly complex structure and a high number of parameters. The main advantage of the MGGP, in addition to its high accuracy, is that it has a very simple structure and could therefore be easily utilized in practical applications. The CHS model provided the worst accuracy at both stations with respect to a new criterion which combines the RMSE, MAE, and R^{2} statistics. The results obtained at the two stations differ from each other, especially in terms of the CHS method. The reason for this may be the difference in the statistical characteristics of the two stations. From these results it can be said that the models' accuracies cannot be generalized (cannot be extended to other study cases) and that more comparisons using different data from different regions are required to justify their generalization.
The MGGP and DENFIS models may be incorporated as modules in general hydrological analysis models.

## REFERENCES

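Allen, R. G., Pereira, L. S., Raes, D. & Smith, M. 1998 *Crop Evapotranspiration: Guidelines for Computing Crop Water Requirements*. FAO *Irrigation and Drainage Paper No. 56*. FAO, Rome, Italy.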