Evaporation is a basic element of the hydrological cycle and plays a vital role in a region's water balance. In this paper, the Wild Horse Optimizer (WHO) algorithm was used to optimize long short-term memory (LSTM) and support vector regression (SVR) models for estimating daily pan evaporation (Ep). Primary meteorological variables, including minimum temperature (Tmin), maximum temperature (Tmax), sunshine hours (SSH), relative humidity (RH), and wind speed (WS), were collected from two synoptic meteorological stations with different climates in Fars province, Iran. One station is located in Larestan city, which has a hot desert climate, and the other in Abadeh city, which has a cold dry climate. The partial mutual information (PMI) algorithm was used to identify the efficient input variables (EIVs) for Ep. The PMI results indicated that Tmax, Tmin, and RH are the EIVs at Larestan station, and Tmax, Tmin, and SSH at Abadeh station. The results showed that the LSTM–WHO hybrid model improved the daily Ep estimation and reduced the estimation error at both stations. Therefore, the LSTM–WHO hybrid model is proposed as a more powerful model than the standalone models for estimating daily Ep.

  • The Wild Horse Optimizer (WHO) algorithm was used to optimize long short-term memory (LSTM) and support vector regression (SVR) models for estimating daily pan evaporation.

  • The partial mutual information (PMI) algorithm was used to identify the efficient input variables for pan evaporation.

  • The LSTM–WHO hybrid model is proposed as a more powerful model than the standalone models for estimating daily pan evaporation.

Evaporation is the process by which water turns into vapor (Majhi et al. 2020). It is the first pathway by which water returns to the hydrological cycle (Liu et al. 2004; Landeras et al. 2008). Evaporation is an important part of the hydrological cycle and affects a region's water balance, hydrological modeling, irrigation system design, and agricultural production (Gundalia & Dholakia 2013; Moazenzadeh et al. 2018; Guan et al. 2020; Mohamadi et al. 2020; Allawi et al. 2021). Millions of cubic meters of water are lost annually through evaporation, which leaves behind salts and solutes and thereby reduces water quality (Havens & Ji 2018). Because water shortage is a persistent problem worldwide, researchers have emphasized the accurate estimation of evaporation, one of the main processes of water loss (Samii et al. 2023). Accurate estimation of evaporation plays a leading role in calculating the water balance, designing irrigation systems, managing water resources, and sustainable development (Piri et al. 2016; Dou & Yang 2018; Kumar et al. 2021). It is therefore necessary in dry regions with scarce water resources, where it provides useful information in drought conditions (Malik et al. 2020a; Seifi & Riahi 2020).

Generally, evaporation is measured in direct and indirect ways. Evaporation pans are among the most common direct methods of measuring evaporation (Warnaka & Pochop 1998; Wang & Dickinson 2012; Ehteram et al. 2022). Direct measurement with pans is not possible in regions without meteorological stations, and establishing a station requires funding and facilities. Instead, empirical formulas and analytical methods are used to estimate evaporation (Wang et al. 2019). Relative humidity, air temperature, wind speed, solar radiation, atmospheric pressure, dissolved solutes, and latitude and longitude all affect evaporation and should be taken into account when estimating it (Alizadeh et al. 2017; Tao et al. 2018).

Empirical methods are generally tied to a specific geographical region and require specific data that are unavailable in some cases (Khosravi et al. 2019). Moreover, it is practically impossible to model evaporation with such methods because of its physical complexity, its non-linear nature, and the many factors that affect it (Singh & Xu 1997). Given these limitations, new methods have been developed to estimate evaporation. In recent decades, artificial intelligence (AI) has been successfully applied as a new approach in water resources studies (Diop et al. 2018; Qasem et al. 2019; Ali et al. 2020; Malik et al. 2020b; Tur & Yontem 2021).

AI techniques have demonstrated their ability to predict many hydrological variables, such as streamflow (Adnan et al. 2021a), drought (Parisouj et al. 2020), rainfall (Salih et al. 2020; Adnan et al. 2021b), groundwater (Rahman et al. 2020; Mosavi et al. 2021), and evapotranspiration (Alizamir et al. 2020; Makwana et al. 2023). The main reason for using them is their high ability to establish non-linear relationships between input and output variables. In the first phase, an AI technique receives data as input in different forms, such as numbers, speech, text, and images. It then processes the data through various algorithms and provides an output for the input. Finally, the result is evaluated through analysis, discovery, and feedback. This loop continues until the desired result is attained. Neural networks, machine learning, cognitive computing, deep learning, computer vision, and natural language processing can be considered the main components of AI. Researchers have adopted AI techniques because of their high-speed data processing. Although AI techniques provide accurate and fast estimates of hydrological variables, their accuracy relies heavily on the user's knowledge and understanding of AI (Razavizadeh & Dargahian 2018).

Many algorithms, such as Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Differential Evolution (DE), Bat Algorithm (BA), Gray Wolf Optimizer (GWO), Harris Hawks Optimization (HHO), and Giza Pyramids Construction (GPC), have been hybridized with AI techniques to augment their capabilities. These are meta-heuristic algorithms used for optimization. They have become popular in recent years because of their high efficiency, their ability to find optimal values of model parameters, and the high accuracy of the resulting estimates. Ehteram et al. (2022) used an ANN model together with several algorithms, namely the Firefly Algorithm (FFA), Capuchin Search Algorithm (CSA), GA, and Sine Cosine Algorithm (SCA), to predict daily evaporation at seven synoptic stations in Iran with different climates. They then used the Inclusive Multiple Model (IMM) to forecast evaporation. Guan et al. (2020) used the SVR model combined with the Krill Herd Algorithm (SVR-KHA) to estimate daily Ep at three Iranian meteorological stations with humid climates. Various combinations of meteorological input variables, including Tmax, RH, rainfall, SSH, WS, solar radiation, and vapor pressure, were considered for Ep modeling. Feng et al. (2018) proposed an Extreme Learning Machine (ELM) and ANN models optimized with the PSO algorithm (ANN–PSO) and the GA algorithm (ANN–GA) for estimating Ep in different climate regions of China. Bhattarai et al. (2023) used LSTM and Gaussian process regression (GPR) models to forecast monthly Ep at two meteorological stations, one with a tropical climate in Hialeah, Florida, and the other with a Mediterranean climate in Markley Cove, California.

The innovative aspect of this study is the use of the new WHO algorithm to optimize the LSTM and SVR parameters, with the goal of improving the estimation of daily Ep in different climates.

Study area

Fars province is one of the southern provinces of Iran. It is situated between latitudes 27°1′ and 31°42′ N and longitudes 50°34′ and 55°44′ E and covers an area of 122,272 km². The province has 37 cities and diverse climates because of the uneven distribution of precipitation. It consists of 67.4% rangeland, 20.4% forest, and 12.2% desert. It has a mean temperature of 18.9 °C, annual rainfall of 286.8 mm, and annual evaporation of 2,553.4 mm. The cities of Larestan, in the south of Fars province, and Abadeh, in the north, were selected as the case study. Larestan city is located 351 km from the center of the province. It extends between latitudes 27°19′ and 28°11′ N and longitudes 53°22′ and 55°44′ E and has an area of 10,740 km². Its mean temperature, annual rainfall, and annual evaporation are 23.9 °C, 213 mm, and 3,307 mm, respectively. According to the Emberger climate classification system, Larestan city has a hot desert climate.

Abadeh city is located in the northernmost part of Fars province and extends between latitudes 30°47′ and 31°42′ N and longitudes 51°50′ and 53°13′ E. It has an area of 6,670 km² and is located 285 km from the center of Fars province. It has a mean temperature of 14.6 °C, an average rainfall of 139.7 mm, and an annual evaporation of 2,410 mm. According to the Emberger climate classification system, Abadeh city has a cold dry climate. Figure 1 shows the region and the location of the studied synoptic meteorological stations.
Figure 1

Map of the region and the location of the studied meteorological stations.


Data collection and preparation

To model the daily Ep of both cities, daily data were collected from their synoptic meteorological stations. Larestan meteorological station is located at latitude 27°40′12″ N and longitude 54°22′29″ E, at an altitude of 792 m. Abadeh meteorological station is situated at latitude 31°11′54″ N and longitude 52°36′42″ E, at an altitude of 2,030 m. The distance between the two stations is 611 km. The collected data included Ep, SSH, Tmax, Tmin, WS, and RH. In this study, 3,654 and 2,483 daily records from the years 2012–2022 were used for Larestan and Abadeh stations, respectively. Because data are missing at Abadeh station for some years, it has fewer records than Larestan station. Table 1 shows the statistical characteristics of the data used at both meteorological stations. Tmin, Tmax, RH, WS, and SSH were considered as the primary input variables, and daily Ep was considered as the output variable. The PMI algorithm was used to identify the EIVs for Ep among the five primary input variables at both stations.

Table 1

Statistical characteristics of the data used in the meteorological stations

| Station | Statistical criteria | Ep (mm) | Tmin (°C) | Tmax (°C) | SSH (h) | RH (%) | WS (m/s) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Larestan | Min | 0.2 | −4.2 | 6.8 | | | |
| | Max | 22.3 | 32.4 | 47.4 | 12.9 | 96.5 | 14.5 |
| | Mean | 8.94 | 16.21 | 32.67 | 9.67 | 40.75 | 1.51 |
| | Median | 8.8 | 16.6 | 33.8 | 10.2 | 37 | 1.38 |
| | SD | 4.79 | 8.23 | 9.02 | 2.44 | 15.86 | 0.97 |
| | CV% | 53.57 | 50.77 | 27.6 | 25.21 | 38.93 | 64.41 |
| Abadeh | Min | 0.3 | −6.6 | 4.2 | 6.5 | | |
| | Max | 18.2 | 29.7 | 39.3 | 12.7 | 87.5 | 10.67 |
| | Mean | 8.87 | 11.5 | 27.41 | 9.67 | 26.78 | 3.06 |
| | Median | 9.1 | 12.1 | 28.6 | 10.4 | 21.5 | 2.88 |
| | SD | 3.7 | 5.8 | 6.63 | 2.59 | 14.82 | 1.28 |
| | CV% | 41.68 | 50.46 | 24.19 | 26.77 | 55.33 | 41.75 |

Identification and estimation of PMI algorithm

The PMI algorithm is a non-linear algorithm for selecting EIVs for data-driven models. The PMI-based input selection (PMIS) algorithm presented here was developed by Sharma (2000) to identify EIVs in hydrological models. At each iteration, the algorithm considers the set of candidate inputs (C) and the output (Y), and finds the candidate Cs (where Cs belongs to C) that maximizes the PMI with respect to the output variable. The statistical significance of the PMI estimated for Cs is assessed against confidence bounds derived from the distribution generated by a bootstrap loop. If the input is deemed significant, Cs is added to S (the set of selected input variables); the algorithm stops when no significant inputs remain.

The uncertainty about an observation y drawn from Y is characterized by the Shannon entropy (Shannon 1948). If Y depends on a random input variable X, mutual observations of (x, y) allow the value of y to be inferred from x and vice versa, which reduces this uncertainty. According to the notion of mutual information (MI), I(X;Y), observing X decreases the uncertainty of Y (Cover & Thomas 1991). This is depicted in Figure 2 using a Venn diagram, in which the area shared by the two circles represents the MI between the output Y and the single input variable X (May et al. 2008). The non-overlapping areas correspond to the conditional entropies H(X|Y) and H(Y|X), which describe the remaining uncertainty about X and Y, respectively. The MI can be computed using Equation (1) (May et al. 2008).
$$I(X;Y) = \iint f_{X,Y}(x,y)\,\ln\left[\frac{f_{X,Y}(x,y)}{f_X(x)\,f_Y(y)}\right]dx\,dy \tag{1}$$
where the joint probability density function (PDF) is denoted by $f_{X,Y}(x,y)$, and the marginal PDFs of X and Y are denoted by $f_X(x)$ and $f_Y(y)$, respectively. In practice, the true form of the PDFs in Equation (1) is unknown, so probability density estimates are used in their place. Combining the probability density estimates with a numerical approximation of the integral in Equation (1) gives Equation (2) (May et al. 2008).
$$I(X;Y) \approx \frac{1}{n}\sum_{i=1}^{n}\ln\left[\frac{f(x_i,y_i)}{f(x_i)\,f(y_i)}\right] \tag{2}$$
Figure 2

Venn diagram depicting the correlation between entropy and MI for output Y and a single variable X for input.


In Equation (2), n is the number of observations of (x, y) in the sample, and f is the density estimated from those observations. Equation (2) indicates that the technique used to estimate the marginal and joint PDFs strongly affects how accurately and efficiently MI is estimated. The Akaike information criterion (AIC), detailed below, is used to stop the PMI algorithm.
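As an illustration of Equation (2), the following sketch estimates MI between two variables with Gaussian kernel density estimates. It is not the authors' implementation: the function names, the standardization step, and the Gaussian reference bandwidth are assumptions made here, and the sketch computes plain MI rather than the partial MI used in the PMIS algorithm.

```python
import math
import random


def _standardize(values):
    """Scale a sample to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]


def _bandwidth(n, d):
    """Gaussian reference bandwidth for standardized data (assumed choice)."""
    return (4.0 / (d + 2)) ** (1.0 / (d + 4)) * n ** (-1.0 / (d + 4))


def mutual_information(x, y):
    """Sample average of ln[f(x_i, y_i) / (f(x_i) f(y_i))], as in Equation (2)."""
    n = len(x)
    xs, ys = _standardize(x), _standardize(y)
    h1, h2 = _bandwidth(n, 1), _bandwidth(n, 2)
    norm1 = n * h1 * math.sqrt(2.0 * math.pi)   # 1-D Gaussian kernel constant
    norm2 = n * h2 * h2 * 2.0 * math.pi         # 2-D Gaussian kernel constant
    total = 0.0
    for i in range(n):
        fx = fy = fxy = 0.0
        for j in range(n):
            dx2 = (xs[i] - xs[j]) ** 2
            dy2 = (ys[i] - ys[j]) ** 2
            fx += math.exp(-dx2 / (2.0 * h1 * h1))
            fy += math.exp(-dy2 / (2.0 * h1 * h1))
            fxy += math.exp(-(dx2 + dy2) / (2.0 * h2 * h2))
        total += math.log((fxy / norm2) / ((fx / norm1) * (fy / norm1)))
    return total / n


# Synthetic demonstration data (not the station records).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y_dependent = [xi + 0.1 * rng.gauss(0.0, 1.0) for xi in x]
y_independent = [rng.gauss(0.0, 1.0) for _ in range(200)]
mi_dependent = mutual_information(x, y_dependent)
mi_independent = mutual_information(x, y_independent)
```

On this synthetic sample, the estimate for the strongly dependent pair is clearly larger than for the independent pair, which is the property that input selection relies on.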

Akaike information criterion

The termination criterion is based on the AIC, which Akaike introduced in 1974 as a measure of the trade-off between the size of the input set S and the accuracy of the regression. Metrics such as the AIC are frequently used as a basis for comparison in model selection. The AIC is expressed as follows:
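$$\mathrm{AIC} = n\,\ln\left(\frac{1}{n}\sum_{i=1}^{n}u_i^{2}\right) + 2p \tag{3}$$
where p is the number of model parameters, n is the number of observations, and $u_i$ (i = 1, ..., n) stands for the n residuals. In linear regression, the term p is equivalent to k + 1, where k is the number of variables.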
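A minimal sketch of the AIC computation described above, assuming the common residual-based form n·ln(mean squared residual) + 2p; the function name and the example residuals are hypothetical.

```python
import math


def aic(residuals, n_params):
    """AIC from regression residuals: n * ln(mean squared residual) + 2p."""
    n = len(residuals)
    mean_sq = sum(u * u for u in residuals) / n
    return n * math.log(mean_sq) + 2 * n_params


# Hypothetical residuals: a tighter fit with the same p gives a lower AIC.
aic_loose = aic([1.0, -1.0, 1.0, -1.0], n_params=2)
aic_tight = aic([0.5, -0.5, 0.5, -0.5], n_params=2)
```

Adding an input is only worthwhile under this criterion if the drop in the residual term outweighs the 2p penalty.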

Development of models

To estimate daily Ep, the SVR and LSTM models were optimized with the WHO algorithm of Naruei & Keynia (2022). Combining WHO with SVR and LSTM yielded two new hybrid models, named SVR–WHO and LSTM–WHO, respectively. Daily Ep was also estimated with the two standalone models, and the results were compared for the studied stations. 80% of the data were used for the training phase and 20% for the testing phase (Majhi et al. 2020; Alsumaiei 2020). The flowchart of the work steps is shown in Figure 3.
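The 80/20 split can be sketched as follows. The text does not state whether the split was chronological or random; this sketch assumes a chronological split without shuffling, and the function name is hypothetical.

```python
def split_train_test(records, train_fraction=0.8):
    """Split records in order: first part for training, remainder for testing."""
    n_train = int(len(records) * train_fraction)
    return records[:n_train], records[n_train:]


# Record counts reported for the two stations (placeholders stand in for data).
larestan_train, larestan_test = split_train_test(list(range(3654)))
abadeh_train, abadeh_test = split_train_test(list(range(2483)))
```

Under this assumption, the 3,654 Larestan records give 2,923 training and 731 testing records, and the 2,483 Abadeh records give 1,986 and 497.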
Figure 3

Flowchart of research steps.


Long short-term memory model

The LSTM is an enhanced recurrent neural network (RNN) designed to learn long-term dependencies. A primary issue with RNNs is their short-term memory, which arises because gradients explode or vanish during training (Bengio et al. 1994). The LSTM model addresses this issue. The cell state is the primary component of the LSTM network, and the LSTM can add new information to the cell state or remove existing information from it. Gate-like structures carry out this process and act as channels that control the flow of information. Each gate is made up of a sigmoid function and a pointwise multiplication operator. The output of the sigmoid function is a number between 0 and 1 that represents the proportion of the input that should be passed on. A value of 1 means that all of the input is passed on; a value of 0 means that none is.

To regulate the cell state, an LSTM has three gates: an input gate, an output gate, and a forget gate. The forget gate determines how much information should be removed from the cell state. Based on the values of ht−1 and xt, its sigmoid function outputs a value between 0 and 1 for every number in the cell state. The input gate regulates the flow of new information into the cell: it determines whether new data should be used at the given time step and, if so, how much. The output gate determines which information is produced as output, based on the cell state. The LSTM cell is schematically depicted in Figure 4, and the following equations update it at each time step t.
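$$f_t = \sigma\left(w_f\,[h_{t-1}, x_t] + b_f\right) \tag{4}$$
$$i_t = \sigma\left(w_i\,[h_{t-1}, x_t] + b_i\right) \tag{5}$$
$$\tilde{c}_t = \tanh\left(w_c\,[h_{t-1}, x_t] + b_c\right) \tag{6}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{7}$$
$$o_t = \sigma\left(w_o\,[h_{t-1}, x_t] + b_o\right) \tag{8}$$
$$h_t = o_t \odot \tanh(c_t) \tag{9}$$
where $c_t$, $h_t$, and $x_t$ denote the cell state, hidden state, and input at time step t, respectively, while $f_t$, $i_t$, and $o_t$ stand for the forget, input, and output gates. $\tilde{c}_t$ is the candidate cell state and $\odot$ denotes pointwise multiplication. Furthermore, w and b stand for the weight matrices and bias vectors of each LSTM cell, respectively. In addition, Equations (10) and (11) define $\sigma$ and tanh, the sigmoid and hyperbolic tangent functions, respectively. The values of the LSTM parameters are displayed in Table 2.
$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{10}$$
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{11}$$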
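The gate updates can be illustrated with a single LSTM time step written in plain Python. This is a sketch of the standard LSTM cell under assumed names (`lstm_step`, the `params` dictionary keys); it is not the network configuration used in the study, which is summarized in Table 2.

```python
import math


def sigmoid(v):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def _affine(weights, bias, z):
    """Compute w z + b for a weight matrix given as a list of rows."""
    return [sum(w * zj for w, zj in zip(row, z)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update: returns the new hidden state h_t and cell state c_t."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f = [sigmoid(a) for a in _affine(params["w_f"], params["b_f"], z)]      # forget gate
    i = [sigmoid(a) for a in _affine(params["w_i"], params["b_i"], z)]      # input gate
    c_hat = [math.tanh(a) for a in _affine(params["w_c"], params["b_c"], z)]  # candidate
    o = [sigmoid(a) for a in _affine(params["w_o"], params["b_o"], z)]      # output gate
    c_t = [ft * cp + it * ch for ft, cp, it, ch in zip(f, c_prev, i, c_hat)]
    h_t = [ot * math.tanh(ct) for ot, ct in zip(o, c_t)]
    return h_t, c_t


# Toy cell: 1 hidden unit, 2 inputs, all weights and biases set to zero.
zero_params = {key: [[0.0, 0.0, 0.0]] for key in ("w_f", "w_i", "w_c", "w_o")}
zero_params.update({key: [0.0] for key in ("b_f", "b_i", "b_c", "b_o")})
h_new, c_new = lstm_step([0.3, -0.2], [0.0], [1.0], zero_params)
```

With all weights at zero, every gate equals 0.5 and the candidate state is 0, so the cell simply keeps half of its previous state; trained weights make these proportions depend on the inputs.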
Table 2

Values of used parameters in the LSTM model

| Parameter | Value |
| --- | --- |
| Batch size | 50 |
| Activation function | ReLU |
| Learning rate | 0.01 |
| Dropout rate | 0.2 |
| Iterations | 500 |
| Network weights optimizer | Adam |
Figure 4

Schematic diagram of LSTM cell.


Support vector regression model

Support vector regression (SVR) is the regression form of the support vector machine (SVM). SVM is a data-driven supervised learning algorithm used for prediction, regression, and classification. It is based on statistical learning theory and finds a global optimum by applying the structural risk minimization (SRM) principle. In other words, SVR fits a curve to the data within an ɛ-deviation so that the error on the test data is minimized (Al-Mukhtar 2019). The aim of SVR is to estimate the weight and bias parameters of the function that best fits the data (Yu & Kim 2012). The SVR regression function is defined as follows:
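$$f(x) = w^{T}\varphi(x) + b \tag{12}$$
where $x$ and $f(x)$ are the predictors and target variable, respectively, $w$ is the weight vector, $b$ is the bias, and $\varphi(x)$ is the non-linear transform function.

The regression problem can be expressed as Equations (13) and (14), which minimize the loss function to obtain a suitable SVR function f(x):
$$\min_{w,\,b,\,\xi,\,\xi^{*}}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \tag{13}$$
$$\text{subject to}\quad y_i - f(x_i) \le \varepsilon + \xi_i,\quad f(x_i) - y_i \le \varepsilon + \xi_i^{*},\quad \xi_i,\ \xi_i^{*} \ge 0 \tag{14}$$
where $\lVert w\rVert$, $C$, and $f(x_i)$ represent the Euclidean norm of the weight vector, the penalty parameter, and the value estimated by the model, $y_i$ is the observed value, and $\xi_i$ and $\xi_i^{*}$ indicate the slack variables.
Using Lagrangian multipliers, the non-linear regression function can be written as Equation (15):
$$f(x) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b \tag{15}$$
where $K(x_i, x)$ is the kernel function, and $\alpha_i$ and $\alpha_i^{*}$ are Lagrangian multipliers ($\alpha_i, \alpha_i^{*} \ge 0$).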
Kernel functions are used to solve non-linear problems in the SVR model, so selecting a suitable kernel function is crucial to obtaining the best solution. SVR kernel functions include the linear, polynomial, sigmoid, and radial basis function (RBF) kernels. In this study, the different kernel functions were evaluated, and RBF showed the best performance. Therefore, it was selected for daily Ep modeling; it is given by Equation (16).
$$K(x_i, x_j) = \exp\left(-\gamma\,\lVert x_i - x_j\rVert^{2}\right) \tag{16}$$
where γ is the width parameter of the Gaussian RBF kernel. The values of the parameters used in the SVR model are shown in Table 3.
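The RBF kernel and the kernel-expansion form of the SVR prediction can be sketched as below. The function names are hypothetical, the dual coefficients stand for the differences of the Lagrangian multipliers, and the support vector and coefficient values are toy numbers rather than fitted results. The value 0.02 is taken from the "allowable error" in Table 3, which is assumed here to be the ɛ of the ɛ-insensitive loss.

```python
import math


def rbf_kernel(a, b, gamma):
    """Gaussian RBF kernel exp(-gamma * ||a - b||^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion: sum of coef_i * K(sv_i, x) plus the bias."""
    return sum(coef * rbf_kernel(sv, x, gamma)
               for sv, coef in zip(support_vectors, dual_coefs)) + bias


def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors smaller than eps cost nothing; larger errors cost the excess."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Toy values for illustration only.
k_same = rbf_kernel([0.0, 0.0], [0.0, 0.0], gamma=0.5)
k_unit = rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=1.0)
prediction = svr_predict([0.0, 0.0], [[0.0, 0.0]], [2.0], bias=0.5, gamma=1.0)
loss_inside = eps_insensitive_loss(1.0, 1.01, eps=0.02)
loss_outside = eps_insensitive_loss(1.0, 1.5, eps=0.02)
```

A larger γ makes the kernel decay faster with distance, so each support vector influences a smaller neighborhood of the input space.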
Table 3

Values of the parameters used in the SVR model

| Parameter | Value |
| --- | --- |
| Penalty parameter | 10 |
| Kernel function | RBF |
| Allowable error | 0.02 |

WHO algorithm

Naruei and Keynia devised WHO in 2022. The algorithm is inspired by the social behavior and life of wild horses. Horses typically live in herds consisting of a stallion and several mares and foals. They exhibit many social behaviors, such as grazing, pursuing, dominance, leadership, and mating. Foals, for example, leave the group before reaching maturity and join other groups. Another significant behavior is grazing in the herd around the stallion, or group leader. Like other algorithms, WHO starts by forming an initial random population. First, the initial population is divided into several groups. If the initial population contains N individuals, the number of groups equals:
$$G = \lceil N \times PS \rceil \tag{17}$$
where $PS$ denotes the percentage of stallions in the population and is used as a control parameter.

As a result, each group has a leader (stallion), and the remaining members are distributed equally among the groups. Leaders are chosen based on their fitness. Group leaders lead their members to the water hole; a group uses the water hole if it dominates, and otherwise abandons it when another group takes control. The algorithm's main operation is based on wild horse mating: members of the same family cannot mate, so when foals reach adulthood they must leave the group and join another to find a partner. Based on these rules, the objective function is evaluated repeatedly and the population is updated to reach the optimal result. The WHO method is formulated using Equations (18)–(23), as shown below:
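$$\bar{X}_{i,G}^{\,j} = 2Z\cos(2\pi R Z)\times\left(\mathrm{Stallion}^{j} - X_{i,G}^{\,j}\right) + \mathrm{Stallion}^{j} \tag{18}$$
where $\bar{X}_{i,G}^{\,j}$ and $X_{i,G}^{\,j}$ are, respectively, the new position and current position of the group member while grazing, $\mathrm{Stallion}^{j}$ is the position of the group leader, R is a random number between −2 and 2, $\pi$ is equal to 3.14, and Z is an adaptive mechanism obtained from Equation (19):
$$P = \vec{R}_1 < TDR;\quad IDX = (P == 0);\quad Z = R_2 \odot IDX + \vec{R}_3 \odot (\sim IDX) \tag{19}$$
where P is a vector of 0s and 1s, $\vec{R}_1$ and $\vec{R}_3$ are random vectors in the range [0, 1], $R_2$ is a random number in the range [0, 1], and IDX holds the indexes of $\vec{R}_1$ that satisfy the condition (P == 0). TDR is an adaptive parameter calculated by Equation (20). It starts at 1 and reaches 0 at the end of the algorithm's execution.
$$TDR = 1 - iter \times \left(\frac{1}{maxiter}\right) \tag{20}$$
where $iter$ and $maxiter$ are, respectively, the current iteration and the maximum number of iterations of the algorithm.
$$X_{G,k}^{\,p} = \mathrm{Crossover}\left(X_{G,i}^{\,q},\, X_{G,j}^{\,z}\right),\quad i \ne j \ne k,\quad \mathrm{Crossover} = \mathrm{Mean} \tag{21}$$
where $X_{G,k}^{\,p}$ is the position of horse p from group k, and $X_{G,i}^{\,q}$ is the position of foal q from group i, which mates with horse z with the position $X_{G,j}^{\,z}$, which leaves group j.
$$\overline{\mathrm{Stallion}}_{G_i} = \begin{cases} 2Z\cos(2\pi R Z)\times\left(WH - \mathrm{Stallion}_{G_i}\right) + WH, & R_3 > 0.5 \\ 2Z\cos(2\pi R Z)\times\left(WH - \mathrm{Stallion}_{G_i}\right) - WH, & R_3 \le 0.5 \end{cases} \tag{22}$$
$$\mathrm{Stallion}_{G_i} = \begin{cases} X_{G,i}, & \mathrm{cost}(X_{G,i}) < \mathrm{cost}(\mathrm{Stallion}_{G_i}) \\ \mathrm{Stallion}_{G_i}, & \text{otherwise} \end{cases} \tag{23}$$
where $\overline{\mathrm{Stallion}}_{G_i}$ is the next position of the leader of group i, $WH$ is the position of the water hole, $\mathrm{Stallion}_{G_i}$ is the present position of the leader of group i, and Z, R, and $\pi$ are as defined for Equation (18). The algorithm was coded in MATLAB with the parameters shown in Table 4, which were obtained from a sensitivity analysis.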
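The grazing update and its adaptive mechanism can be sketched in Python as follows. This is an illustrative reading of the grazing step of WHO as published by Naruei & Keynia (2022), not the authors' MATLAB code: the function names are hypothetical, `math.pi` is used in place of 3.14, and the mating and leader-selection steps are omitted.

```python
import math
import random


def tdr(iteration, max_iter):
    """Adaptive parameter that decays linearly from 1 to 0 over the run."""
    return 1.0 - iteration / max_iter


def adaptive_z(dim, tdr_value, rng):
    """Per dimension: use the shared scalar R2 where R1 >= TDR, else R3."""
    r2 = rng.random()
    z = []
    for _ in range(dim):
        r1, r3 = rng.random(), rng.random()
        z.append(r2 if r1 >= tdr_value else r3)
    return z


def graze(position, stallion, z, r):
    """Move a group member around its stallion (grazing behavior)."""
    return [2.0 * zd * math.cos(2.0 * math.pi * r * zd) * (sd - xd) + sd
            for xd, sd, zd in zip(position, stallion, z)]


rng = random.Random(1)
z_mid = adaptive_z(3, tdr(250, 500), rng)
leader = [1.0, -2.0, 0.5]
member = [0.2, -1.0, 1.5]
new_member = graze(member, leader, z_mid, rng.uniform(-2.0, 2.0))
at_leader = graze(leader, leader, z_mid, 1.3)
```

A member that already sits on the stallion's position stays there, because the displacement term is proportional to the distance between the two.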
Table 4

WHO algorithm regulatory parameters

| Parameter | Value |
| --- | --- |
| Stallions' percentage | 0.2 |
| Dimensions | 30 |
| Crossover percentage | 0.13 |
| Number of horses | 100 |
| Iterations number | 500 |
Table 5

GenA values and ranking of models based on the method of El Bilali et al. (2022)

| Rank of model | Value of GenA |
| --- | --- |
| Perfect | |
| Excellent | |
| Good | |
| Poor and unsuitable for simulation purposes | |
Table 6

Pearson correlation coefficient values between the main meteorological parameters measured in the Larestan station

| Parameters | Tmin | Tmax | RH | SSH | WS | Ep |
| --- | --- | --- | --- | --- | --- | --- |
| Tmin | | | | | | |
| Tmax | 0.923** | | | | | |
| RH | −0.522** | −0.729** | | | | |
| SSH | 0.316** | 0.539** | −0.650** | | | |
| WS | 0.368** | 0.237** | −0.108** | −0.009 | | |
| Ep | 0.864** | 0.910** | 0.705** | 0.524** | 0.289** | |

**Correlation is significant at the 0.01 level.

Data normalization

Normalization is one of the stages of data preprocessing and can increase the efficiency and accuracy of intelligent models. Feeding raw data to a model reduces its speed and accuracy. Therefore, when the variables span a wide range, normalizing them allows the model to train better and faster and improves its accuracy. In this study, the data were normalized to the range 0 to 1 using Equation (24) (Shaukat et al. 2022).
$$X_n = \frac{X_o - X_{\min}}{X_{\max} - X_{\min}} \tag{24}$$
where $X_n$ and $X_o$ indicate the normalized and observational data, while $X_{\max}$ and $X_{\min}$ represent the maximum and minimum data, respectively.
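A short sketch of min-max normalization to the range 0 to 1; the function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily values; real inputs would be the station records.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

In practice the minimum and maximum are usually taken from the training data only and reused for the testing data, so that no information from the testing phase leaks into training; the text does not say which convention was followed.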

Model evaluation criteria

The accuracy of the developed models for estimating daily Ep was evaluated using the root mean square error (RMSE), the Nash–Sutcliffe efficiency (NSE), and the coefficient of determination (R2). NSE expresses the efficiency of the model and takes values in $(-\infty, 1]$, where 1 indicates close agreement between the observed and simulated values. RMSE shows the difference between the values estimated by the model and the observed values. RMSE ranges from 0 to $+\infty$; the closer the value is to zero, the better the accuracy of the model. R2 shows the power of predicting the dependent variable based on the independent variable, and its value varies between 0 and 1. These criteria are calculated using Equations (25)–(27).
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^{2}} \tag{25}$$
$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(O_i - P_i\right)^{2}}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^{2}} \tag{26}$$
$$R^{2} = \frac{\left[\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)\right]^{2}}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^{2}\,\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^{2}} \tag{27}$$
where n is the number of data, $O_i$ is the measured value of Ep, $P_i$ is the Ep calculated by the model, $\bar{O}$ is the average of the measured Ep, and $\bar{P}$ is the average of the Ep calculated by the model.
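The three criteria can be computed as in the sketch below, which uses the standard definitions of RMSE, NSE, and the squared Pearson correlation. The function names and the three-point example are hypothetical.

```python
import math


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    var = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - err / var


def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov ** 2 / (var_obs * var_sim)


# Hypothetical three-point example.
obs = [1.0, 2.0, 3.0]
sim = [1.0, 2.0, 4.0]
example_rmse = rmse(obs, sim)
example_nse = nse(obs, sim)
example_r2 = r_squared(obs, sim)
```

Note that NSE penalizes bias while R2 does not: in the example, a single overestimate lowers NSE to 0.5 while R2 stays above 0.96.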

Generalization ability (GenA)

One of the important challenges in developing models, especially data-driven ones, is to guarantee their GenA (Yoon et al. 2011; Chen et al. 2020). If the model fully simulates the phenomenon of interest, the GenA value is unity. If the model is overtrained, the GenA value exceeds unity, whereas a GenA value below unity indicates that the model is undertrained (Yoon et al. 2011). The GenA is obtained using Equation (28) (Yoon et al. 2011):
$$GenA = \frac{RMSE_{\text{testing}}}{RMSE_{\text{training}}} \tag{28}$$
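As a worked example, GenA can be computed from the RMSE values that Table 9 reports for the LSTM–WHO model. The sketch assumes GenA is the ratio of testing RMSE to training RMSE, which matches the interpretation above (values above unity indicate overtraining); the function name is hypothetical.

```python
def gen_a(rmse_training, rmse_testing):
    """Generalization ability as the ratio of testing to training RMSE."""
    return rmse_testing / rmse_training


# RMSE values (mm) of the LSTM-WHO model as listed in Table 9.
gen_a_larestan = gen_a(rmse_training=1.151, rmse_testing=1.099)
gen_a_abadeh = gen_a(rmse_training=1.135, rmse_testing=1.163)
```

Under this definition the ratio is about 0.95 for Larestan and about 1.02 for Abadeh, both close to unity.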

El Bilali et al. (2022) ranked models for simulation purposes in four categories (perfect, excellent, good, and poor), as displayed in Table 5.

In this study, the new WHO algorithm was applied to train the SVR and LSTM models for estimating daily Ep. For this purpose, daily data from two meteorological stations in Fars province, Iran, were used for the years 2012 to 2022. One station is situated in Larestan city, which has a hot desert climate, and the other in Abadeh city, which has a cold dry climate. The data gathered from the two stations included Tmin, Tmax, RH, WS, and SSH as the primary input variables of the models, and daily Ep was considered as the output of the models.

Tables 6 and 7 display the Pearson correlation coefficients between the main meteorological parameters measured at the Larestan and Abadeh stations, respectively. These tables show a strong correlation (greater than 0.6 in absolute value) between daily Ep and Tmax, Tmin, and RH at the two studied stations. This correlation is negative for RH and positive for Tmax and Tmin.

Table 7

Pearson correlation coefficient values between the main meteorological parameters measured in the Abadeh station

| Parameters | Tmin | Tmax | RH | SSH | WS | Ep |
| --- | --- | --- | --- | --- | --- | --- |
| Tmin | | | | | | |
| Tmax | 0.875* | | | | | |
| RH | −0.526** | −0.764** | | | | |
| SSH | 0.314** | 0.558** | −0.647** | | | |
| WS | 0.146** | −0.011 | −0.028 | 0.049* | | |
| Ep | 0.788** | 0.813** | 0.650** | 0.522** | 0.174** | |

**Correlation is significant at the 0.01 level.

*Correlation is significant at the 0.05 level.

The PMI algorithm was used to identify the EIVs for daily Ep. The results, with the AIC as the stopping condition of the PMI algorithm, are shown in Table 8. In Table 8, iteration, variable, I(x;y), MC–I*(95), MC–I*(99), and AIC indicate the iteration number of the PMI algorithm, the name of the variable, the PMI value of each variable, the 95% critical value of MI, the 99% critical value of MI, and the numerical value of the AIC for each variable, respectively.

Table 8

PMI input algorithm results on evaporation data sets in meteorological stations

| Station | Iteration | Variable | I(x;y) | MC–I*(95) | MC–I*(99) | AIC |
| --- | --- | --- | --- | --- | --- | --- |
| Larestan | | Tmax | 0.878 | 0.024 | 0.026 | 6,685.1 |
| | | RH | 0.051 | 0.024 | 0.026 | 6,799.8 |
| | | Tmin | 0.050 | 0.024 | 0.026 | 7,027.3 |
| | | SSH | 0.028 | 0.024 | 0.026 | −6,986.5 |
| | | LogWS | 0.016 | 0.024 | 0.026 | −6,538.6 |
| Abadeh | | Tmax | 0.566 | 0.030 | 0.033 | 2,690.9 |
| | | Tmin | 0.072 | 0.030 | 0.033 | 2,830.1 |
| | | SSH | 0.096 | 0.030 | 0.033 | 3,089.9 |
| | | logRH | 0.036 | 0.030 | 0.033 | −2,997.7 |
| | | LogWS | 0.027 | 0.030 | 0.033 | −2,593.2 |

According to the PMI algorithm, EIVs are those variables whose AIC value shows a decreasing trend. The AIC values of the Tmax, RH, and Tmin variables at Larestan station display a decreasing trend (Table 8). The same holds for the Tmax, Tmin, and SSH variables at Abadeh station. Therefore, the EIVs for daily Ep are Tmax, RH, and Tmin at Larestan station and Tmax, Tmin, and SSH at Abadeh station. Notably, using the PMI algorithm reduces the time needed to identify the EIVs.
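The stopping rule described above, keeping inputs while the AIC keeps decreasing, can be sketched as follows. The function name, variable names, and AIC values are hypothetical and are not the values of Table 8.

```python
def select_while_aic_decreases(ranked_candidates):
    """Keep candidates, in selection order, until the AIC stops decreasing."""
    selected = []
    best_aic = float("inf")
    for name, aic_value in ranked_candidates:
        if aic_value >= best_aic:
            break  # adding this input no longer improves the criterion
        selected.append(name)
        best_aic = aic_value
    return selected


# Hypothetical (variable, AIC) pairs in the order the inputs were picked.
candidates = [("A", -10.0), ("B", -12.5), ("C", -13.0), ("D", -12.0), ("E", -9.0)]
chosen = select_while_aic_decreases(candidates)
```

With these hypothetical values, the first three inputs are retained and selection stops at the fourth, where the AIC rises again.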

Table 9 shows the RMSE, NSE, and R2 values of the models during the training and testing phases at both stations. The LSTM–WHO hybrid model has the highest accuracy for estimating daily Ep at both stations in the training and testing phases. For the training/testing phases, the LSTM–WHO hybrid model gives RMSE (1.151/1.099 mm), NSE (0.941/0.950), and R2 (0.943/0.952) at Larestan station, and RMSE (1.135/1.163 mm), NSE (0.904/0.896), and R2 (0.912/0.902) at Abadeh station (see bolded values in Table 9). In contrast, the LSTM standalone model has poor accuracy for daily Ep estimation: for the training/testing phases, it gives RMSE (1.950/1.797 mm), NSE (0.832/0.865), and R2 (0.840/0.825) at Larestan station, and RMSE (1.843/2.114 mm), NSE (0.748/0.658), and R2 (0.770/0.681) at Abadeh station.

Table 9

Evaluation criteria values of developed models for training and testing phases for both stations

| Station | Model | Training RMSE (mm) | Training NSE | Training R2 | Testing RMSE (mm) | Testing NSE | Testing R2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Larestan | SVR | 1.952 | 0.831 | 0.848 | 1.702 | 0.879 | 0.905 |
| | LSTM | 1.950 | 0.832 | 0.840 | 1.797 | 0.865 | 0.825 |
| | SVR–WHO | 1.876 | 0.844 | 0.846 | 1.474 | 0.923 | 0.926 |
| | LSTM–WHO | 1.151 | 0.941 | 0.943 | 1.099 | 0.950 | 0.951 |
| Abadeh | SVR | 1.769 | 0.767 | 0.768 | 1.890 | 0.725 | 0.748 |
| | LSTM | 1.843 | 0.748 | 0.770 | 2.114 | 0.658 | 0.681 |
| | SVR–WHO | 1.390 | 0.857 | 0.873 | 1.419 | 0.846 | 0.857 |
| | LSTM–WHO | 1.135 | 0.904 | 0.912 | 1.163 | 0.897 | 0.902 |

Furthermore, the measured and estimated daily Ep values were compared through time-series plots (left side) and scatter plots (right side) for the testing phase at both stations (Figures 5 and 6). As the time-series plots in Figures 5 and 6 (left side) reveal, the LSTM–WHO and SVR–WHO hybrid models estimate daily Ep more accurately than the LSTM and SVR standalone models at both stations, especially for the maximum and minimum values of daily Ep. However, the accuracy of LSTM–WHO is slightly higher than that of SVR–WHO. In addition, the scatter plots in Figures 5 and 6 (right side) show that, for the LSTM–WHO model, the regression line and the 1:1 line are very close to each other, with R2 = 0.952 for the Larestan station and R2 = 0.902 for the Abadeh station.
Figure 5

Time-series (left) and scatter plots (right) of measured and estimated daily Ep for models in the Larestan station.

Figure 6

Time-series (left) and scatter plots (right) of measured and estimated daily Ep for models in the Abadeh station.


The scatter plots in Figures 5 and 6 (right side) reveal that the SVR–WHO hybrid model is less accurate than the LSTM–WHO hybrid model in estimating the maximum and minimum values of daily Ep. However, its performance is acceptable compared with the LSTM and SVR standalone models. The largest distance between the regression line and the 1:1 line belongs to the LSTM model at both stations, which the evaluation criteria also confirm.

For graphical evaluation, violin plots of the studied models were drawn for the testing phase (Figure 7). A violin plot shows the distribution of the data. As Figure 7 shows, at both stations the violin plot of the LSTM–WHO hybrid model is the most similar in shape to that of the measured values in the testing phase. This result indicates that the estimated daily Ep values conform well to the measured values and demonstrates the good ability of the LSTM–WHO hybrid model to estimate daily Ep at the Larestan and Abadeh stations. In contrast, the violin plot of the LSTM model shows the largest difference between the distributions of the estimated and measured values in the testing phase at both stations.
Figure 7

Violin plot of estimated versus measured of daily Ep by different models: (a) Larestan station and (b) Abadeh station.


These results show the power of the WHO algorithm in finding optimal values for the parameters of the standalone models, which is most evident in the LSTM–WHO hybrid model. They also show that standalone models are simpler and faster but, in some cases, are unable to model complicated processes. Qian et al. (2020) state that standalone models, compared with hybrid ones, are less able to couple and process non-linear problems, and that hybrid models can be used as an efficient method for solving complicated non-linear problems. This research indicated that, although the LSTM model is an efficient model for solving complicated non-linear problems, its performance is weak compared with the hybrid models. The same is true for the SVR model: training it with the WHO algorithm improves its efficiency in estimating daily Ep values. Therefore, the WHO algorithm increased the accuracy of the SVR and LSTM models by finding optimal values for their parameters.
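The general idea of letting a population-based optimizer search for model parameters can be illustrated with the short Python sketch below. It is deliberately not the Wild Horse Optimizer: it is a simplified population-based random search used as a stand-in, and the objective function is a hypothetical toy surface standing in for a validation error as a function of two parameters (named `c` and `gamma` here purely for illustration). In a real hybrid model, the objective would train the SVR or LSTM model and return its error.

```python
import random


def population_search(objective, bounds, pop_size=20, iterations=50, seed=0):
    """Minimise `objective` inside box `bounds` with a simple population search.

    Generic stand-in, not the Wild Horse Optimizer: every candidate is pulled
    toward the current best and perturbed by a random step that shrinks over time.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=objective)
    for it in range(iterations):
        scale = 1.0 - it / iterations  # random step shrinks as the search proceeds
        new_pop = []
        for cand in pop:
            trial = []
            for x, b, (lo, hi) in zip(cand, best, bounds):
                step = x + rng.random() * (b - x) + scale * rng.gauss(0.0, 0.1 * (hi - lo))
                trial.append(min(hi, max(lo, step)))  # clip to the search box
            # Greedy replacement: keep the trial only if it improves the candidate.
            new_pop.append(trial if objective(trial) < objective(cand) else cand)
        pop = new_pop
        best = min(pop + [best], key=objective)
    return best, objective(best)


def toy_validation_error(params):
    """Hypothetical error surface with its minimum (1.1) at c = 10, gamma = 0.5."""
    c, gamma = params
    return (c - 10.0) ** 2 / 100.0 + (gamma - 0.5) ** 2 + 1.1


best_params, best_error = population_search(
    toy_validation_error, [(0.1, 100.0), (0.01, 2.0)]
)
print(best_params, best_error)
```

The fixed seed makes the run repeatable; swapping in a different optimizer only requires replacing the update step inside the loop.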

Table 10 summarizes previous studies on daily Ep modeling using different AI methods. Comparing the evaluation criteria of the developed LSTM–WHO model (Table 9) with those of the models suggested by other researchers (Table 10) shows that the LSTM–WHO model has high accuracy and good performance relative to the other models for daily Ep estimation.

Table 10

Literature review on daily Ep modeling by AI methods

| Previous studies | Models | Suggested model | Input variables | R2 | NSE |
| --- | --- | --- | --- | --- | --- |
| Keshtegar et al. (2019) | MLPNN, RSM, SVR-RSM | SVR-RSM | Tmax, Tmin, H%, U2, α | 0.912 | 0.910 |
| Majhi et al. (2020) | MLANN, LSTM | LSTM | Tmax, Tmin, RHI, RHII, WS, BSS | 0.800 | 0.773 |
| Seifi & Soroush (2020) | ANN, ANN–GA, ANN-WOA, ANN-GWO | ANN–GA | Tmin, Tmax, RH, n, U2 | 0.790 | 0.780 |
| Malik et al. (2021) | SVR-WOA, SVR-SHO, SVR-SSA, SVR-PSO, SVR-MVO, and PM | SVR-SSA | Tmin, Tmax, RHmax, RHmin, Us, Rs | 0.905 | 0.815 |
| Kushwaha et al. (2021) | SVM, RT, REPTree, RSS | SVM | Tmin, RH, SSH, WS | 0.874 | 0.865 |
| Malik et al. (2023) | SVR-GA, SVR-GWO | SVR-GWO | Tmin, Tmax, RHmax, RHmin, Us, R | 0.790 | 0.787 |

GenA and uncertainties

The GenA of the models in this study, based on the classification method of El Bilali et al. (2022), is shown in Table 11. The table also reports the logarithmic values of the upper limit (UL) and lower limit (LL) of the 95% confidence intervals for the studied models. All the developed models at both the Larestan and Abadeh stations have an excellent GenA. These values show that the models are well trained on the input data.
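As a cross-check, the GenA values reported in Table 11 agree, to within rounding, with the ratio of testing RMSE to training RMSE taken from Table 9. The Python sketch below reproduces that arithmetic. The ratio form of GenA is an assumption inferred from the tabulated numbers, not a definition restated in this section, and the helper name `gena` is illustrative.

```python
# RMSE values (mm) copied from Table 9 as (training, testing).
rmse_table = {
    ("Larestan", "SVR"): (1.952, 1.702),
    ("Larestan", "LSTM"): (1.950, 1.797),
    ("Larestan", "SVR-WHO"): (1.876, 1.474),
    ("Larestan", "LSTM-WHO"): (1.151, 1.099),
    ("Abadeh", "SVR"): (1.769, 1.890),
    ("Abadeh", "LSTM"): (1.843, 2.114),
    ("Abadeh", "SVR-WHO"): (1.390, 1.419),
    ("Abadeh", "LSTM-WHO"): (1.135, 1.163),
}

# GenA values copied from Table 11.
gena_table = {
    ("Larestan", "SVR"): 0.871,
    ("Larestan", "LSTM"): 0.921,
    ("Larestan", "SVR-WHO"): 0.785,
    ("Larestan", "LSTM-WHO"): 0.955,
    ("Abadeh", "SVR"): 1.068,
    ("Abadeh", "LSTM"): 1.147,
    ("Abadeh", "SVR-WHO"): 1.021,
    ("Abadeh", "LSTM-WHO"): 1.025,
}


def gena(rmse_train, rmse_test):
    """Assumed definition: ratio of testing RMSE to training RMSE."""
    return rmse_test / rmse_train


for key, (train, test) in rmse_table.items():
    # Print the recomputed ratio next to the tabulated GenA for comparison.
    print(key, round(gena(train, test), 3), gena_table[key])
```

Under this reading, a value close to 1 means that the testing error is close to the training error, which matches the statement that the models are well trained.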

Table 11

Logarithmic values of the upper and lower limits and the GenA of the developed models at both stations

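| Station | Model | UL95% | LL95% | GenA | Rank |
| --- | --- | --- | --- | --- | --- |
| Larestan | SVR | 2.12 | 2.04 | 0.871 | Excellent |
| Larestan | LSTM | 2.17 | 2.10 | 0.921 | Excellent |
| Larestan | SVR–WHO | 2.15 | 2.08 | 0.785 | Excellent |
| Larestan | LSTM–WHO | 2.13 | 2.05 | 0.955 | Excellent |
| Abadeh | SVR | 2.15 | 2.07 | 1.068 | Excellent |
| Abadeh | LSTM | 2.15 | 2.09 | 1.147 | Excellent |
| Abadeh | SVR–WHO | 1.99 | 1.88 | 1.021 | Excellent |
| Abadeh | LSTM–WHO | 2.03 | 1.92 | 1.025 | Excellent |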

In addition, a graphical comparison is shown in Figure 8, in which the logarithmic values of the observed and estimated Ep, LL 95%, and UL 95% are plotted for the LSTM–WHO hybrid model at the two stations. Figure 8 confirms that this model has successfully simulated the daily Ep values at both stations, since the estimated values lie within the 95% confidence intervals.
Figure 8

Drawing the 95% confidence intervals for LSTM–WHO hybrid model: (a) Larestan station and (b) Abadeh station.


Evaporation is one of the most important climatic and hydrological variables. Accurate estimation of evaporation, especially in arid and semi-arid areas that face a shortage of water resources, is necessary for water management and agricultural activities. In this study, the WHO algorithm was examined for the first time to optimize SVR and LSTM models at two synoptic meteorological stations in Fars province, Iran. One of the stations is situated in Larestan city, with a hot desert climate, and the other in Abadeh city, with a cold dry climate. The accuracy of four models (SVR, LSTM, SVR–WHO, and LSTM–WHO) for daily Ep modeling was examined. The PMI algorithm was used to identify the EIVs on daily Ep. The accuracy of the developed models was evaluated based on the RMSE, NSE, and R2 statistical criteria and on the graphical criteria of violin plots and confidence intervals. Moreover, the GenA of the developed models was checked and classified for both stations. The PMI algorithm results showed that Tmax, RH, and Tmin are the EIVs on daily Ep at the Larestan station and Tmax, Tmin, and SSH at the Abadeh station. The results of this study showed that the WHO algorithm improves the accuracy of the standalone models for estimating daily Ep. It is concluded that the LSTM–WHO hybrid model has the best performance in daily Ep estimation at both stations based on the statistical criteria. Therefore, the LSTM–WHO hybrid model is chosen as the superior model. In addition, the comparison of the violin plot of the LSTM–WHO hybrid model with that of the measured values confirms the results of the statistical criteria. The confidence intervals drawn for the selected model revealed that the estimated Ep values lie within the 95% confidence intervals.

Also, the GenA results showed that all the developed models have an excellent GenA, which confirms that the models were well trained at both stations. It should be noted that the GenA of the LSTM–WHO hybrid model should be relied on only within the studied area, at short distances and under the same weather conditions. Therefore, the necessary actions, e.g., calibration and re-evaluation, should be taken if the model is to be used in other areas with different weather conditions. A limitation of this study was the discontinuity in the time-series of data caused by missing data in some years. For future work, it is suggested that other new optimizer algorithms be used to optimize the parameters of deep learning and machine learning models for estimating Ep and that the results be compared with those of the current study.

The authors would like to express their appreciation to the meteorological bureau of Fars Province for providing the information.

M.S. and H.F. wrote the main manuscript text and M.A.A. generated the code of the used models. All authors reviewed the manuscript.

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

All authors certify that they have followed the ethical conduct required by the journal.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Adnan R. M., Mostafa R. R., Kisi O., Yaseen Z. M., Shahid S. & Zounemat-Kermani M. 2021a Improving streamflow prediction using a new hybrid ELM model combined with hybrid particle swarm optimization and grey wolf optimization. Knowledge-Based Systems 230, 107379.
Adnan R. M., Petroselli A., Heddam S., Santos C. A. & Kisi O. 2021b Comparison of different methodologies for rainfall–runoff modeling: Machine learning vs conceptual approach. Natural Hazards 105, 2987–3011.
Alizadeh M. J., Kavianpour M. R., Kisi O. & Nourani V. 2017 A new approach for simulating and forecasting the rainfall-runoff process within the next two months. Journal of Hydrology 548, 588–597.
Alizamir M., Kisi O., Muhammad Adnan R. & Kuriqi A. 2020 Modelling reference evapotranspiration by combining neuro-fuzzy and evolutionary strategies. Acta Geophysica 68, 1113–1126.
Allawi M. F., Ahmed M. L., Aidan I. A., Deo R. C. & El-Shafie A. 2021 Developing reservoir evaporation predictive model for successful dam management. Stochastic Environmental Research and Risk Assessment 35, 499–514.
Bengio Y., Simard P. & Frasconi P. 1994 Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks 5 (2), 157–166.
Bhattarai A., Qadir D., Sunusi A. M., Getachew B. & Mallah A. R. 2023 Dynamic sliding window-based long short-term memory model development for pan evaporation forecasting. Knowledge-Based Engineering and Sciences 4 (1), 37–54.
Cover T. M. & Thomas J. A. 1991 Elements of Information Theory. John Wiley & Sons, Inc., New York, p. 776.
Diop L., Bodian A., Djaman K., Yaseen Z. M., Deo R. C., El-Shafie A. & Brown L. C. 2018 The influence of climatic inputs on stream-flow pattern forecasting: Case study of Upper Senegal River. Environmental Earth Sciences 77, 1–13.
Dou X. & Yang Y. 2018 Modeling evapotranspiration response to climatic forcings using data-driven techniques in grassland ecosystems. Advances in Meteorology 2018 (1), 1–18.
Ehteram M., Panahi F., Ahmed A. N., Mosavi A. H. & El-Shafie A. 2022 Inclusive multiple models using hybrid artificial neural networks for predicting evaporation. Frontiers in Environmental Science 9, 789995.
El Bilali A., Moukhliss M., Taleb A., Nafii A., Alabjah B., Brouziyne Y., Mazigh N., Teznine K. & Mhamed M. 2022 Predicting daily pore water pressure in embankment dam: Empowering Machine Learning-based modeling. Environmental Science and Pollution Research 29 (31), 47382–47398.
Feng Y., Jia Y., Zhang Q., Gong D. & Cui N. 2018 National-scale assessment of pan evaporation models across different climatic zones of China. Journal of Hydrology 564, 314–328.
Guan Y., Mohammadi B., Pham Q. B., Adarsh S., Balkhair K. S., Rahman K. U., Linh N. T. & Tri D. Q. 2020 A novel approach for predicting daily pan evaporation in the coastal regions of Iran using support vector regression coupled with krill herd algorithm model. Theoretical and Applied Climatology 142, 349–367.
Gundalia M. J. & Dholakia M. B. 2013 Estimation of pan evaporation using mean air temperature and radiation for monsoon season in Junagadh region. Int. J. Eng. Res 3 (6), 64–70.
Keshtegar B., Heddam S., Sebbar A., Zhu S. P. & Trung N. T. 2019 SVR-RSM: A hybrid heuristic method for modeling monthly pan evaporation. Environ Sci Pollut Res Int. 26 (35), 35807–35826. doi:10.1007/s11356-019-06596-8.
Khosravi K., Daggupati P., Alami M. T., Awadh S. M., Ghareb M. I., Panahi M., Pham B. T., Rezaie F., Qi C. & Yaseen Z. M. 2019 Meteorological data mining and hybrid data-intelligence models for reference evaporation simulation: a case study in Iraq. Computers and Electronics in Agriculture 167, 105041.
Kumar M., Kumari A., Kumar D., Al-Ansari N., Ali R., Kumar R., Kumar A., Elbeltagi A. & Kuriqi A. 2021 The superiority of data-driven techniques for estimation of daily pan evaporation. Atmosphere 12 (6), 701.
Kushwaha N. L., Rajput J., Elbeltagi A., Elnaggar A. Y., Sena D. R., Vishwakarma D. K., Mani I. & Hussein E. E. 2021 Data intelligence model and meta-heuristic algorithms-based pan evaporation modelling in two different agro-climatic zones: A case study from northern India. Atmosphere 12, 1654. doi:10.3390/atmos12121654.
Liu B., Xu M., Henderson M. & Gong W. 2004 A spatial analysis of pan evaporation trends in China, 1955–2000. Journal of Geophysical Research: Atmospheres 109 (D15), 1–9.
Majhi B., Naidu D., Mishra A. P. & Satapathy S. C. 2020 Improved prediction of daily pan evaporation using Deep-LSTM model. Neural Computing and Applications 32, 7823–7838.
Malik A., Rai P., Heddam S., Kisi O., Sharafati A., Salih S. Q., Al-Ansari N. & Yaseen Z. M. 2020a Pan evaporation estimation in Uttarakhand and Uttar Pradesh States, India: Validity of an integrative data intelligence model. Atmosphere 11 (6), 553.
Malik A., Kumar A., Kim S., Kashani M. H., Karimi V., Sharafati A., Ghorbani M. A., Al-Ansari N., Salih S. Q., Yaseen Z. M. & Chau K. W. 2020b Modeling monthly pan evaporation process over the Indian central Himalayas: Application of multiple learning artificial intelligence model. Engineering Applications of Computational Fluid Mechanics 14 (1), 323–338.
Malik A., Tikhamarine Y., Al-Ansari N., Shahid S., Sekhon H. S., Pal R., Rai P., Pandey K., Singh P., Elbeltagi A. & Shauket Sammen S. 2021 Daily pan-evaporation estimation in different agro-climatic zones using novel hybrid support vector regression optimized by Salp swarm algorithm in conjunction with gamma test. Engineering Applications of Computational Fluid Mechanics 15 (1), 1075–1094. doi:10.1080/19942060.2021.1942990.
Malik A., Tikhamarine Y., Souag-Gamane D., Sammen S. S. & Kisi O. 2023 Support vector regression model optimized with GWO versus GA algorithms: Estimating daily pan-evaporation. In: Handbook of Hydroinformatics, pp. 357–373. doi:10.1016/B978-0-12-821961-4.00001-4.
May R. J., Maier H. R., Dandy G. C. & Fernando T. G. 2008 Non-linear variable selection for artificial neural networks using partial mutual information. Environmental Modelling & Software 23 (10–11), 1312–1326.
Moazenzadeh R., Mohammadi B., Shamshirband S. & Chau K. W. 2018 Coupling a firefly algorithm with support vector regression to predict evaporation in northern Iran. Engineering Applications of Computational Fluid Mechanics 12 (1), 584–597.
Mohamadi S., Ehteram M. & El-Shafie A. 2020 Accuracy enhancement for monthly evaporation predicting model utilizing evolutionary machine learning methods. International Journal of Environmental Science and Technology 17, 3373–3396.
Mosavi A., Sajedi Hosseini F., Choubin B., Taromideh F., Ghodsi M., Nazari B. & Dineva A. A. 2021 Susceptibility mapping of groundwater salinity using machine learning models. Environmental Science and Pollution Research 28, 10804–10817.
Naruei I. & Keynia F. 2022 Wild horse optimizer: A new meta-heuristic algorithm for solving engineering optimization problems. Engineering with Computers 38 (4), 3025–3056.
Qasem S. N., Samadianfard S., Kheshtgar S., Jarhan S., Kisi O., Shamshirband S. & Chau K. W. 2019 Modeling monthly pan evaporation using wavelet support vector regression and wavelet artificial neural networks in arid and humid climates. Engineering Applications of Computational Fluid Mechanics 13 (1), 177–187.
Qian L., Liu C., Yi J. & Liu S. 2020 Application of hybrid algorithm of bionic heuristic and machine learning in nonlinear sequence. In Journal of Physics: Conference Series, 2020 Nov 1, Vol. 1682 (1). IOP Publishing, p. 012009.
Rahman A. S., Hosono T., Quilty J. M., Das J. & Basak A. 2020 Multiscale groundwater level forecasting: coupling new machine learning approaches with wavelet transforms. Advances in Water Resources 141, 103595.
Razavizadeh S. & Dargahian F. 2018 Optimization of artificial neural network structure in prediction of sediment discharge using taguchi method. Iranian Journal of Watershed Management Science and Engineering 12 (43), 89–97.
Salih S. Q., Sharafati A., Ebtehaj I., Sanikhani H., Siddique R., Deo R. C., Bonakdari H., Shahid S. & Yaseen Z. M. 2020 Integrative stochastic model standardization with genetic algorithm for rainfall pattern forecasting in tropical and semi-arid environments. Hydrological Sciences Journal 65 (7), 1145–1157.
Samii A., Karami H., Ghazvinian H., Safari A. & Ajirlou Y. D. 2023 Comparison of DEEP-LSTM and MLP models in estimation of evaporation pan for arid regions. Journal of Soft Computing in Civil Engineering 7 (2), 155–175.
Shannon C. E. 1948 A mathematical theory of communication. Bell System Technical Journal 27 (3), 379–423.
Shaukat N., Hashmi A., Abid M., Aslam M. N., Hassan S., Sarwar M. K., Masood A., Shahid M. L., Zainab A. & Tariq M. A. 2022 Sediment load forecasting of Gobindsagar reservoir using machine learning techniques. Frontiers in Earth Science 10, 1047290.
Tao H., Diop L., Bodian A., Djaman K., Ndiaye P. M. & Yaseen Z. M. 2018 Reference evapotranspiration prediction using hybridized fuzzy model with firefly algorithm: Regional case study in Burkina Faso. Agricultural Water Management 208, 140–151.
Tur R. & Yontem S. 2021 A comparison of soft computing methods for the prediction of wave height parameters. Knowledge-Based Engineering and Sciences 2 (1), 31–46.
Wang K., Liu X., Tian W., Li Y., Liang K., Liu C., Li Y. & Yang X. 2019 Pan coefficient sensitivity to environment variables across China. Journal of Hydrology 572, 582–591.
Warnaka K. & Pochop L. 1998 Analyses of equations for free water evaporation estimates. Water Resources Research 24 (7), 979–984.
Yoon H., Jun S. C., Hyun Y., Bae G. O. & Lee K. K. 2011 A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. Journal of Hydrology 396 (1–2), 128–138.
Yu H. & Kim S. 2012 SVM Tutorial-Classification, Regression and Ranking. In: Handbook of Natural Computing, G. Rozenberg, T. Bäck & J. N. Kok (ed.), Springer, pp. 479–506.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).