Water resources are among the most important features of the environment for meeting human needs. In the current research, morphological, quantitative and qualitative hydrological, and land use factors, as well as a combined factor built from the effective variables of those factors, are used to estimate River Water Withdrawal (RWW) for agricultural uses. Lavasanat and Qazvin, located in the Namak Lake basin in Iran, are selected as study areas, with Bsk and Csa climate categories, respectively. RWW is estimated using single and Wavelet–hybrid (W-hybrid) data-driven methods, including Artificial Neural Networks (ANNs), Wavelet–ANN (WANNs), Adaptive Neuro-Fuzzy Inference System (ANFIS), Wavelet–ANFIS (WANFIS), Gene Expression Programming (GEP), and Wavelet–GEP (WGEP). According to the evaluation criteria, the WGEP model performs best in estimating RWW variables in both study areas. For the W-hybrid models with data de-noising, the RMSE values (×10³ m³) obtained for WGEP11 to WGEP15 (Lavasanat) and WGEP21 to WGEP25 (Qazvin) are 67.268, 54.659, 80.871, 50.796, 15.676 and 105.532, 96.615, 105.018, 160.961, 44.332, respectively. The results indicate that WGEP and ANN are the best and poorest models, respectively, in both study areas, regardless of climate conditions. In addition, a combined factor that includes the River Width (RW), minimum flow rate (QMin), average flow rate (QMean), Electrical Conductivity (EC), and Cultivated Area (CA) variables is identified as the best factor for estimating RWW variables in both the Bsk and Csa climate categories. In contrast, the qualitative hydrological and land use factors were the weakest for estimating RWW variables in the Bsk and Csa climate categories, respectively. The current study therefore shows how mathematical relations for estimating RWW can significantly support future water resources management and planning by policymakers.

  • River Water Withdrawal (RWW) for agricultural purposes was estimated using data-driven methods.

  • The impact of climatic condition, river morphology, quantitative and qualitative hydrological characteristics, and land use features on RWW estimation was assessed.

  • De-noising the data and developing the combined factor could improve the model's performance.

Graphical Abstract

Climate change and restrictions on water resources' accessibility affect the lives of human beings and all living organisms; and they pose serious challenges for humanity in different aspects (Jeihouni et al. 2019; Vanderhoof et al. 2019; Bissenbayeva et al. 2021). Various sectors of agriculture, industry, environment, etc. depend on water resources to meet the needs of communities. The agricultural sector, in addition to having a significant share in satisfying food needs, is the main sector of water consumption due to irrigation demand to produce agricultural products. Therefore, optimal management and planning in this sector can have a significant impact on saving more water resources for future generations (Vondracek et al. 2005; Mehta et al. 2013, 2014; Böhme et al. 2016; Ghalehkhondabi et al. 2017; Patel et al. 2018; Langat et al. 2019; Kumar & Mehta 2020). Low flows and droughts are also important for agricultural water use (Eris et al. 2019; Eris et al. 2020).

The amount of River Water Withdrawal (RWW) for agricultural consumption can be evaluated from various perspectives, including morphological, quantitative and qualitative hydrological, and land use factors (Msigwa et al. 2019; Magritsky et al. 2020). Because rivers are dynamic, their morphology is constantly changing, and these changes can negatively affect the quantitative and qualitative status of rivers (Gostner et al. 2013) and agricultural lands (Hohensinner et al. 2018). Population growth, accompanied by increasing land use change and improper use of surface and groundwater resources, has not only reduced the quantity of water resources but has also degraded their quality. Investigating the impact of land use on river water withdrawal can enhance the management and planning of water resources, which requires both a holistic view and engineering precision (Shank & Stauffer 2015; Msigwa et al. 2019; Shirmohammadi et al. 2020). The arid and semi-arid climatic conditions in the majority of basins in Iran highlight the limitations of water resources and the importance of proper management (Abbaspour et al. 2009; Sharafati et al. 2020).

In previous decades, the importance of surface water resources has led to the evaluation of various quantitative and qualitative hydrological variables, such as river flow rate, Total Dissolved Solids (TDS), and water temperature, by various numerical methods (Mehta et al. 2013, 2014; Montaseri et al. 2018; Mehta & Yadav 2020; Mehta et al. 2020). Recently, hybrid data-driven evolutionary methods such as the Wavelet–AutoRegressive (WAR) model, Wavelet–AutoRegressive Moving Average (WARMA), Wavelet–Linear Regression (WLR), Wavelet–Artificial Neural Networks (WANNs), Wavelet–Particle Swarm Optimization (WPSO), Wavelet–Adaptive Neural Fuzzy Inference System (WANFIS), Wavelet–Support Vector Machine (WSVM), and Wavelet–Gene Expression Programming (WGEP) have been extended and emphasized by many researchers and policymakers for modeling, optimization, and management of the world's water resources (Yarar 2014; Barzegar et al. 2016; Montaseri et al. 2018; Zhang et al. 2018). Selected studies on estimating and modeling the quantitative and qualitative characteristics of river water are summarized here.

Adamowski & Sun (2010) extended ANN and WANN methods to estimate the flow of the Kargotis and Xeros rivers, located in Cyprus with semi-arid climate conditions. In both rivers, the WANN models provided more realistic flow estimates than the ANN models. Barzegar et al. (2016) applied single and Wavelet–hybrid (W-hybrid) data-driven methods, including ANN, ANFIS, WANN, and WANFIS, to estimate the Electrical Conductivity (EC) of the Aji-Chay River, located in a cold semi-arid region in Iran. The results showed the superiority of Daubechies-4 (Db4) mother wavelet decomposition over the other wavelets, of W-hybrid models over single models, and of the WANFIS model over the WANN model. The ability of evolutionary data-driven methods (e.g. GEP, Support Vector Machine (SVM), WGEP, and WSVM) was investigated by Solgi et al. (2017) for estimating the daily and monthly flow rates of the Gamasiyab River, located within a cold semi-arid basin in Iran. The results showed the higher efficiency of W-hybrid methods compared with the single ones. Shafaei & Kisi (2017) predicted the daily flow of the Aji-Chay River, located in a cold semi-arid basin in Iran, applying SVM, ANN, and WANN models. The results showed the superior performance of the WANN methods. Zhang et al. (2018) estimated streamflow at four stations in the East River basin, a subtropical basin in China, through Multiple Linear Regression (MLR), ANN, and WANN methods. Based on the results, the WANN model performed well in comparison with the MLR and ANN models. Yaseen et al. (2018) estimated the monthly flow of the Tigris River in Iraq using single and W-hybrid forms of the Extreme Learning Machine (ELM and WELM), a newer ANN structure. The results revealed that WELM models can be regarded as a reliable tool for estimating river flow in semi-arid climatic conditions. Montaseri et al. (2018) used single and W-hybrid data-driven methods, including ANN, ANFIS, GEP, WANN, WANFIS, and WGEP, to estimate the amount of TDS in rivers of four basins in Iran with various climatic conditions (e.g. Dsa, Bsk, Bwk, and Bsh Köppen–Geiger climate categories). The results indicated the superior efficiency of W-hybrid methods compared with single methods and highlighted the importance of the mathematical relationships derived from GEP and WGEP. Sun et al. (2019) predicted short-term flow rates of the Heihe and Pearl rivers in China, classified in the arid–semi-arid and humid subtropical climate categories, respectively, using evolutionary data-driven methods (e.g. AR, ARMA, ANN, LR, WAR, WARMA, WANN, and WLR). Based on the results, they concluded that W-hybrid models perform better than single models. Kumar et al. (2020) extended data-driven evolutionary methods, including WANN and SVM, to estimate the discharge of a perennial river in India. The results indicated acceptable performance of evolutionary data-driven methods in estimating daily river discharge. Wang et al. (2020) modeled three water quality indicators of the Grand Canal in China, namely Chemical Oxygen Demand (COD), Ammonia Nitrogen (NH3-N), and Dissolved Oxygen (DO), using single and hybrid forms of the W-PSO-SVR method. The results indicated better performance of coupled wavelet models for estimating water quality indicators in tropical monsoon climatic conditions. Chen et al. (2020) and Rajaee et al. (2020) investigated single and W-hybrid data-driven methods to estimate qualitative variables of river water. Their evaluations emphasize the applicability of data-driven methods to the estimation of river water quality indicators and the superiority of W-hybrid methods over single data-driven methods for various climatic conditions. Kambalimath & Deka (2021) estimated daily stream flow in the Malaprabha sub-catchment in India via SVM and W-SVM models. Their results indicated that combining wavelet models with a single SVM could improve the efficiency of daily stream flow estimation.

Because the Namak Lake basin has been facing a shortage of water resources in recent years as a result of climate change, it is especially important to pay attention to the amount of water withdrawal for agricultural uses (Sheikh et al. 2020).

Despite the great improvement in the application of data-driven methods to hydrological-variable estimation, most previous studies focused on modeling the quantitative and qualitative characteristics of river flows and evaluated hybrid models at an elementary level, without estimating RWW, which is the subject of the current research. The Namak Lake basin has an arid climate. As agriculture is affected by climate change and is the main consumer of water resources, scientists and policymakers should pay special attention to meeting the needs of this sector to achieve sustainability. Consequently, estimating the amount of river water withdrawal for agricultural uses deserves close attention. This study assesses three single data-driven models (ANN, ANFIS, and GEP) and their W-hybrid counterparts (WANN, WANFIS, and WGEP) for RWW estimation in two study areas with different climate categories, with data de-noising and an evaluation of various factors (e.g. morphological, quantitative and qualitative hydrological, land use, and combined factors). The novelty of the developed W-hybrid model structure, the estimation of RWW variables from various factors and related variables, and the use of field-sampled data from the Lavasanat and Qazvin study areas distinguish the current research from the previous literature. In addition, considering different climate zones allows a better comparison of model performance.

The Namak Lake basin, with an area of 92,563 km², is located in the central part of Iran and is one of the most critical basins in the country. In the current research, the Lavasanat and Qazvin study areas were selected to evaluate river water withdrawal for agricultural uses (Figure 1). According to the Köppen–Geiger climate classification, the Lavasanat study area is categorized as arid steppe cold (Bsk), and the Qazvin study area as warm temperate with a dry, hot summer (Csa). The Bsk climate has an average annual temperature below 18 °C and is too dry to support a forest but not dry enough to be a desert; it usually consists of grassland plains. The Csa climate has a coldest month warmer than −3 °C but colder than +18 °C, with dry, hot summers. In the selected study areas, field sampling was performed to determine the amount of river water withdrawal for agricultural uses, and the data were collected by the Iran Water Resources Management Company (http://wrbs.wrm.ir/). The number and spatial distribution of samples are acceptable based on the significant results of the F-test at the 0.01 level (sig = 0.000), computed with SPSS software (IBM Corp., released 2017). The number of samples in the Lavasanat and Qazvin study areas is 291 and 198, respectively. The sampled variables included River Width (RW), River Depth (RD), minimum flow rate (QMin), maximum flow rate (QMax), average flow rate (QMean), Water Temperature (WT), Electrical Conductivity (EC), pH, Cultivated Area (CA), and Orchard Area (OA).

Figure 1

Study areas of Namak Lake basin.

RW and RD represent the morphological features; QMin, QMax, and QMean represent the quantitative hydrological characteristics; WT, EC, and pH represent the qualitative hydrological characteristics; CA and OA represent the land use features; and RW, QMin, QMean, EC, and CA together form the combined factor. The statistical characteristics of the data for the selected study areas are given in Table 1. The RWW variable has average values of 182.08 and 154.59 (×10³ m³) in the Lavasanat and Qazvin study areas, respectively.

Table 1

Factors affecting RWW, related variables, and statistical characteristics

| Study area (i = 1, 2) | Statistic | RW | RD | QMin | QMax | QMean | WT | EC | pH | CA | OA | RWW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Factor | Morphologic | Morphologic | Quantitative hydrologic | Quantitative hydrologic | Quantitative hydrologic | Qualitative hydrologic | Qualitative hydrologic | Qualitative hydrologic | Land use | Land use | RWW |
| | Units | m | m | m³/s | m³/s | m³/s | | | – | ha | ha | ×10³ m³ |
| Lavasanat (i = 1) | Mean | 0.49 | 0.09 | 11.95 | 25.29 | 18.37 | 15.03 | 444.66 | 8.37 | 10.46 | 10.23 | 182.08 |
| | SD | 0.20 | 0.07 | 12.52 | 24.10 | 17.89 | 3.33 | 239.94 | 0.22 | 9.92 | 9.62 | 169.06 |
| | Cv | 0.41 | 0.74 | 1.05 | 0.95 | 0.97 | 0.22 | 0.54 | 0.03 | 0.95 | 0.94 | 0.93 |
| Qazvin (i = 2) | Mean | 0.51 | 0.21 | 5.19 | 16.73 | 9.83 | 18.85 | 564.57 | 7.66 | 7.43 | 5.83 | 154.59 |
| | SD | 0.26 | 0.16 | 10.13 | 30.47 | 18.43 | 3.07 | 273.02 | 0.48 | 9.40 | 6.28 | 236.78 |
| | Cv | 0.50 | 0.79 | 1.95 | 1.82 | 1.87 | 0.16 | 0.48 | 0.06 | 1.27 | 1.08 | 1.53 |

Wavelet transform

Wavelet transformation is a mathematical tool for extracting information from various categories of data. The wavelet can decompose the low-frequency dynamics of non-stationary time series well and is also effective at revealing noise, periodic patterns, and changes in variance (Montaseri et al. 2018). Wavelets have some slight benefits over Fourier transforms in reducing computations when examining specific frequencies. The wavelet transform can be expressed as follows:
(1)
where v and z reflect the integer m variable of scaling and translation function; w reflects an integer variable that refers to a point of the input signal; n indicates the discrete time index; x(n) reflects a given signal and f(w) represents the mother wavelet (Barzegar et al. 2016). One of the most widely used wavelets is that constructed by Daubechies. These wavelets are orthonormal, compactly supported; they have the maximum number of vanishing moments for the support, and are reasonably smooth (Manian & Vásquez 1998). The low-pass and band-pass filter coefficients satisfy the following conditions of orthogonality, normality, and regularity:
(2)
(3)
(4)
The functions in the wavelet can separate the data into various frequency values. Each component of the data is evaluated by a resolution appropriate to its scale. The lower scales express details of the signal which changes rapidly, while the higher scales show the slow frequency changes of the components (Adamowski & Sun 2010). The number of decomposition levels, named L, was determined by Equation (5) (Montaseri et al. 2018):
(5)
where N is the number of data points. In the current study, the wavelet decomposition level is 2 for the Lavasanat study area (291 samples) and 1 for the Qazvin study area (198 samples). Decomposition with the one-dimensional Daubechies wavelet (db4) yields two kinds of sub-series: (1) approximations Al (l = 1, 2) and (2) details Dl (l = 1, 2). The approximation, or main part of the signal, is represented by A2 and A1 (low-frequency section), whereas D2 and D1 (high-frequency section) represent the details.

For more details see Mallat (1989).
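
The multilevel decomposition described above can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names `dwt_step` and `wavelet_decompose` are hypothetical, the periodic boundary extension is an assumption (the boundary treatment is not stated in the text), and the toy series stands in for the field data. The filter values are the standard 8-tap Daubechies-4 (db4) low-pass coefficients.

```python
import math

# Standard 8-tap Daubechies-4 (db4) low-pass (approximation) filter.
DB4_LOW = [
    0.2303778133088964, 0.7148465705529154, 0.6308807679298587,
    -0.0279837694168599, -0.1870348117190931, 0.0308413818355607,
    0.0328830116668852, -0.0105974017850690,
]
# High-pass (detail) filter derived from the low-pass filter
# through the quadrature mirror relation.
DB4_HIGH = [(-1) ** j * DB4_LOW[len(DB4_LOW) - 1 - j] for j in range(len(DB4_LOW))]


def dwt_step(signal):
    """One decomposition level; returns (approximation, detail).

    Assumption: the signal is extended periodically at its ends.
    """
    n = len(signal)
    if n % 2 != 0:
        raise ValueError("signal length must be even at every level")
    approx, detail = [], []
    for k in range(n // 2):
        # Filter a window of 8 samples, then keep every second output.
        window = [signal[(2 * k + j) % n] for j in range(len(DB4_LOW))]
        approx.append(sum(h * x for h, x in zip(DB4_LOW, window)))
        detail.append(sum(g * x for g, x in zip(DB4_HIGH, window)))
    return approx, detail


def wavelet_decompose(signal, levels):
    """Return (A_L, [D_1, ..., D_L]) for the requested number of levels."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    # Toy series (not study data): a slow trend plus an oscillation.
    series = [math.sin(0.4 * t) + 0.05 * t for t in range(16)]
    a2, (d1, d2) = wavelet_decompose(series, levels=2)
    print("A2:", [round(v, 3) for v in a2])
    print("D2:", [round(v, 3) for v in d2])
    print("D1:", [round(v, 3) for v in d1])
```

With two levels, the sketch returns A2 together with D2 and D1, which mirrors the sub-series used for Lavasanat; calling it with `levels=1` gives A1 and D1, as used for Qazvin. Library routines may use a different boundary extension, so their coefficients near the series ends can differ from this sketch.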

ANNs

Artificial neural networks (ANNs) are computational tools that function similarly to biological brain processes. An ANN can be defined as a network of simple processors (or neurons) arranged in three main layers: input, hidden, and output (Ferreira 2001). The mathematical form of neural networks is much simpler than their biological counterpart (Rahmanpanah et al. 2020). In ANN methodology, the dataset is often separated into two main sections: training and testing. The training section contains the data from which the model learns the neuron weights, and the testing section is used to evaluate generalization performance. The training section should contain appropriate input data together with the desired outputs. The selection of input variables is one of the main aspects of building a successful neural model; the priorities are keeping the cost of data collection down, eliminating the effects of duplicate data, and keeping the model as simple as possible. In ANN modeling, determining the input variables and collecting appropriate training data require more time and effort than training the network. The purpose of ANN modeling is to capture the relationship between input and output variables (Montaseri et al. 2018; Chaplot & Birbal 2021). The Levenberg–Marquardt (LM) algorithm is one of the fastest ANN training algorithms; it is designed to achieve high training speed without the need to calculate the Hessian matrix (Ghavidel & Montaseri 2014).

In our research, an ANN, containing a three-layer structure (including: input, hidden, and output layers), LM Back-Propagation Training Algorithm (BPTA), various transfer functions in the hidden and output layers, and a varied number of neurons (1–10) in the hidden layer, has been coded (in MATLAB software, 2018) and developed to model the amount of RWW for agricultural uses.

ANFIS

The Adaptive Neural-Fuzzy Inference System (ANFIS) can be applied to examine nonlinear phenomena and evaluate the input–output relationship in multivariate systems (Takagi & Sugeno 1985). The ANFIS model is a hybrid of two models: neural networks and fuzzy inference. The fuzzy section establishes a relation between the input and output variables through Membership Functions (MFs), and the parameters of these membership functions are then determined by ANNs. The model therefore combines the characteristics of both fuzzy and neural models (Barzegar et al. 2016). An ANFIS structure consisting of a set of Takagi–Sugeno-type fuzzy IF-THEN rules can be used to model and map input–output data. In a fuzzy inference system, Sugeno's first-order fuzzy model with two IF-THEN fuzzy rules is as follows:
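$$\text{Rule 1: IF } x \text{ is } W_1 \text{ and } y \text{ is } Z_1, \text{ THEN } f_1 = p_1 x + q_1 y + r_1 \tag{6}$$
$$\text{Rule 2: IF } x \text{ is } W_2 \text{ and } y \text{ is } Z_2, \text{ THEN } f_2 = p_2 x + q_2 y + r_2 \tag{7}$$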
where variables x and y are inputs and variable f is an output of the model, W1, W2, and Z1, Z2 denote membership values of input variables x and y, respectively; and p1, q1, r1, and p2, q2, r2 are the variables of the output functions f1 and f2, respectively.
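
To make the two-rule first-order Sugeno model more concrete, the following Python sketch evaluates such a system for one input pair. It is an illustration under stated assumptions, not the ANFIS-SC model of the study: Gaussian membership functions, the product as the AND operator, the weighted-average output, and every parameter value are assumptions chosen for the example, and the names `gaussian_mf` and `sugeno_output` are hypothetical.

```python
import math


def gaussian_mf(value, center, sigma):
    """Gaussian membership value in (0, 1] (assumed MF shape)."""
    return math.exp(-0.5 * ((value - center) / sigma) ** 2)


def sugeno_output(x, y, rules):
    """Weighted average of the first-order rule outputs f_k = p*x + q*y + r."""
    weighted_sum, weight_total = 0.0, 0.0
    for rule in rules:
        # Firing strength: product of the two membership values (assumed AND).
        strength = gaussian_mf(x, *rule["x_mf"]) * gaussian_mf(y, *rule["y_mf"])
        rule_output = rule["p"] * x + rule["q"] * y + rule["r"]
        weighted_sum += strength * rule_output
        weight_total += strength
    if weight_total == 0.0:
        raise ValueError("no rule fires for this input")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Illustrative parameters only: (center, sigma) pairs and p, q, r values.
    rules = [
        {"x_mf": (0.0, 1.0), "y_mf": (0.0, 1.0), "p": 1.0, "q": 1.0, "r": 0.0},
        {"x_mf": (2.0, 1.0), "y_mf": (2.0, 1.0), "p": 2.0, "q": 0.0, "r": 1.0},
    ]
    print(sugeno_output(1.0, 1.0, rules))
```

In a trained ANFIS, the membership-function parameters and the consequent parameters p, q, and r are the quantities fitted to the data; here they are fixed by hand so that the inference step can be followed in isolation.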

In the current study, the ANFIS model based on the Subtractive Clustering (SC) method called ‘ANFIS-SC’ with a varied radii value (0–1) has been coded (in MATLAB software, 2018) and developed to model the amount of RWW for agricultural uses.

GEP

GEP, like the genetic algorithm (GA), operates on simple, linear chromosomes of constant length. GEP models are generated based on the Darwinian theory of natural selection (Ferreira 2001; Birbal et al. 2021). The most effective indicators were extracted from the GEP model formulations. The GEP modeling of RWW was based on several indices and parameters, namely the Functions set (F), Terminal set (T), Mutation Rate (MR), Inversion Rate (IR), IS Transposition Rate (ISTR), RIS Transposition Rate (RISTR), One-Point Recombination Rate (OPRR), Two-Point Recombination Rate (TPRR), Gene Recombination Rate (GRR), Gene Transposition Rate (GTR), the Linking Function (LF), Fitness Function Error Type (FFET), and Penalizing Tool (PT). The GeneXpro Tools 4.0 default values used in this study for MR, IR, ISTR, RISTR, OPRR, TPRR, GRR, and GTR are 0.044, 0.1, 0.1, 0.1, 0.3, 0.3, 0.1, and 0.1, respectively. The sub-trees were linked by an addition function. The FFET was RRSE. Parsimony pressure was selected as the PT. The number of chromosomes, the number of genes per chromosome, and the head size are 30, 3, and 7, respectively. More details about the technical formulation of the GEP approach can be found in Ferreira (2006).
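
Two of the settings above, the addition linking function and the RRSE error type, can be sketched in a few lines of Python. This is a hedged illustration, not GeneXpro Tools code: the three gene expressions are invented placeholders rather than expressions evolved in the study, the input values are made up, and the helper names `rrse` and `linked_prediction` are hypothetical. RRSE is written in its usual form, the root of the squared error relative to the squared error of always predicting the observed mean.

```python
import math


def rrse(observed, predicted):
    """Root Relative Squared Error: 0 is perfect, 1 matches the mean predictor."""
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    baseline_error = sum((o - mean_observed) ** 2 for o in observed)
    return math.sqrt(squared_error / baseline_error)


def linked_prediction(genes, inputs):
    """Combine the sub-expression outputs with the addition linking function."""
    return sum(gene(inputs) for gene in genes)


if __name__ == "__main__":
    # Placeholder sub-expressions over variable names used in the text.
    genes = [
        lambda v: 2.0 * v["QMean"],
        lambda v: v["CA"] + 1.0,
        lambda v: 0.5,
    ]
    print(linked_prediction(genes, {"QMean": 3.0, "CA": 4.0}))
    print(rrse([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]))
```

With three genes per chromosome, as configured above, the model output is the sum of three sub-expression trees, which is why the final GEP relation can be written as a single explicit formula.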

The unique ability of the wavelet method for estimating variables has led to the development of wavelet hybrids with other methods. In this study, the wavelet was hybridized with all three methods, ANN, GEP, and ANFIS, to estimate RWW for agricultural uses, and then, the estimated results were compared. Figure 2 shows the used single and W-hybrid approach (ANN, ANFIS, GEP, WANN, WANFIS, and WGEP) structures, and Figure 3 displays the stages of the current research.

Figure 2

Structure of applied models in the current research.

Figure 3

Stages of the current research.

Calculating model performance

The accuracy of the applied models was evaluated with three criteria, namely the correlation coefficient (R), Root Mean Square Error (RMSE), and Nash–Sutcliffe Efficiency (NSE), as follows:
$$R = \frac{\sum_{i=1}^{N} (O_i - \bar{O})(E_i - \bar{E})}{\sqrt{\sum_{i=1}^{N} (O_i - \bar{O})^2 \sum_{i=1}^{N} (E_i - \bar{E})^2}} \tag{8}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (O_i - E_i)^2} \tag{9}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N} (O_i - E_i)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2} \tag{10}$$
where $O_i$ and $E_i$ are the observed and estimated RWW values for the ith time step; N is the total number of samples; and $\bar{O}$ and $\bar{E}$ are the means of the observed and estimated RWW values. The R values show the correlation between observed and estimated values. The RMSE value ranges from zero to large positive values (ZamanZad-Ghavidel et al. 2020). NSE can vary between large negative values and 1. An efficiency of 1 (NSE = 1) indicates a perfect match between estimated and observed data. The training (70%) and testing (30%) ratio of the dataset was selected by trial and error. In the current study, the data were normalized using Equation (11):
$$X_N = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{11}$$
where $X_N$, $X$, $X_{min}$, and $X_{max}$ denote the normalized data, the observed data, and the minimum and maximum of the observed data, respectively.
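
The three evaluation criteria and the normalization step can be written compactly in Python. The sketch below uses the standard textbook definitions of R, RMSE, and NSE and assumes the usual min-max form for the normalization; the function names and the sample numbers are hypothetical and are not taken from the study data.

```python
import math


def correlation(observed, estimated):
    """Pearson correlation coefficient R between observed and estimated values."""
    mean_o = sum(observed) / len(observed)
    mean_e = sum(estimated) / len(estimated)
    covariance = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    spread_e = sum((e - mean_e) ** 2 for e in estimated)
    return covariance / math.sqrt(spread_o * spread_e)


def rmse(observed, estimated):
    """Root Mean Square Error, in the units of the data."""
    squared = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared / len(observed))


def nse(observed, estimated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect match, 0 equals the mean predictor."""
    mean_o = sum(observed) / len(observed)
    residual = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    variance = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - residual / variance


def min_max_normalize(values):
    """Scale values to [0, 1] (assumed min-max form)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


if __name__ == "__main__":
    observed = [120.0, 80.0, 200.0, 150.0]   # made-up RWW values
    estimated = [110.0, 95.0, 190.0, 160.0]  # made-up model output
    print(correlation(observed, estimated), rmse(observed, estimated), nse(observed, estimated))
    print(min_max_normalize(observed))
```

A model that always returns the observed mean scores NSE = 0, so negative NSE values indicate estimates that are worse than that baseline.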

Wavelet decomposition analysis

The data were decomposed into main (Al) and detail (Dl) sub-series by applying one-dimensional Daubechies-4 (db4) mother wavelet analysis, which has been used by many researchers, most recently Shafaei & Kisi (2017) and Sun et al. (2019). The wavelet decomposed the variables of the morphological, quantitative and qualitative hydrological, and land use factors, including RW, RD, QMin, QMax, QMean, WT, EC, pH, CA, and OA, in the Lavasanat and Qazvin study areas at levels 2 and 1, respectively. The results of the wavelet analysis are given in Table 2. The Al (A1 and A2) series denote the main decomposed data, and the Dl (D1 and D2) series denote the detail components, or data noise. For instance, the low-frequency approximations A2 (level 2, Lavasanat) and A1 (level 1, Qazvin) of the RWW variable ranged from −32.98 to +597.63 and from −81.57 to +903.82 (×10³ m³), respectively.

Table 2

Results of wavelet analysis in the study areas

| Study area (i = 1, 2) | Sub-series | Statistic | RW | RD | QMin | QMax | QMean | WT | EC | pH | CA | OA | RWW |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Factor | Morphologic | Morphologic | Quantitative hydrologic | Quantitative hydrologic | Quantitative hydrologic | Qualitative hydrologic | Qualitative hydrologic | Qualitative hydrologic | Land use | Land use | RWW |
| | | Units | m | m | m³/s | m³/s | m³/s | | | – | ha | ha | ×10³ m³ |
| Lavasanat (i = 1) | A2 | Min | +0.19 | +0.01 | −2.69 | −6.52 | −4.25 | +7.62 | +75.33 | +7.84 | −3.59 | −2.58 | −32.98 |
| | | Max | +0.95 | +0.24 | +42.46 | +97.26 | +72.30 | +24.05 | +1,141.40 | +8.77 | +32.66 | +32.65 | +597.63 |
| | D2 | Min | −0.33 | −0.21 | −31.94 | −51.84 | −41.52 | −3.28 | −489.80 | −0.23 | −17.77 | −16.90 | −266.40 |
| | | Max | +0.36 | +0.22 | +33.22 | +56.06 | +44.80 | +2.95 | +425.30 | +0.26 | +18.54 | +17.65 | +305.06 |
| | D1 | Min | −0.31 | −0.10 | −20.06 | −29.45 | −24.37 | −2.42 | −278.60 | −0.27 | −15.44 | −16.06 | −218.47 |
| | | Max | +0.39 | +0.12 | +19.34 | +28.15 | +23.48 | +2.43 | +280.28 | +0.32 | +16.13 | +13.74 | +223.06 |
| Qazvin (i = 2) | A1 | Min | +0.21 | −0.01 | −6.23 | −24.48 | −12.12 | +11.43 | +250.15 | +6.82 | −2.45 | −0.06 | −81.57 |
| | | Max | +1.21 | +0.77 | +50.89 | +143.38 | +99.04 | +28.81 | +1,457.00 | +8.26 | +31.47 | +31.33 | +903.82 |
| | D1 | Min | −0.56 | −0.27 | −21.05 | −92.27 | −48.21 | −7.03 | −369.98 | −0.65 | −22.82 | −13.02 | −428.93 |
| | | Max | +0.58 | +0.31 | +24.40 | +84.29 | +50.96 | +7.20 | +297.89 | +0.65 | +21.45 | +15.31 | +506.06 |

Single models for estimating RWW variable

In the current study, three single data-driven methods, ANNij, ANFISij, and GEPij (where i = 1, 2 indexes the study area and j = 1, ..., 5 indexes the factor affecting the RWW variable), are applied to estimate the RWW variable of Lavasanat and Qazvin. The ANN with the Levenberg–Marquardt back-propagation (LM-BP) algorithm and one hidden layer is applied for RWW modeling. The number of hidden nodes of the ANN (1–10) and the radii values of the ANFIS (0.10–0.80) are determined by trial and error. The results of the optimal single models, based on R, RMSE, and NSE values, together with the characteristics of the ANN (number of neurons and activation functions of the hidden and output layers), ANFIS (radii values), and GEP models, are presented in Tables 3 and 4 for the Lavasanat and Qazvin study areas, respectively. The purelin, tansig, and logsig transfer functions were evaluated for the output layer of the ANN models developed for the various factors in both study areas with Bsk and Csa climates. For example, the hidden–output transfer function pairs selected for the ANN11 to ANN15 models in the Lavasanat study area were tansig-purelin, tansig-tansig, tansig-purelin, logsig-purelin, and tansig-tansig, respectively.
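
The trial-and-error search over hidden-node counts and radii values amounts to scoring each candidate setting and keeping the best one. The Python sketch below shows that loop in a generic form. It is an assumption-laden illustration: the selection criterion (lowest testing RMSE), the 0.01 step of the radii grid, the helper name `trial_and_error`, and the toy scoring function are all hypothetical, and a real run would train a model for each candidate instead of calling a formula.

```python
def trial_and_error(candidates, score):
    """Return (best_candidate, best_score), where a lower score is better."""
    best_candidate, best_score = None, float("inf")
    for candidate in candidates:
        value = score(candidate)
        if value < best_score:
            best_candidate, best_score = candidate, value
    return best_candidate, best_score


# Candidate grids matching the ranges quoted in the text.
HIDDEN_NEURONS = range(1, 11)                                  # 1 to 10 neurons
RADII = [round(0.10 + 0.01 * step, 2) for step in range(71)]   # 0.10 to 0.80 (assumed step)


def toy_rmse(neurons):
    """Stand-in for 'train an ANN with this many neurons and return its RMSE'."""
    return (neurons - 3) ** 2 + 50.0


if __name__ == "__main__":
    print(trial_and_error(HIDDEN_NEURONS, toy_rmse))
```

The same helper can be reused for the ANFIS radii by passing `RADII` and a scoring function that builds and evaluates a model for each radius.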

Table 3

Results of applied models' performance in estimating RWW in the Lavasanat study area for the testing section

| j | Factor | Model type | Model | R | RMSE (×10³ m³) | NSE |
|---|---|---|---|---|---|---|
| 1 | Morphologic | Single | ANN11 (tansig-purelin-3)^a | 0.731 | 108.479 | 0.503 |
| | | | ANFIS11 (0.33)^b | 0.786 | 107.012 | 0.517 |
| | | | GEP11 | 0.793 | 105.024 | 0.534 |
| | | W-Hybrid | WANN11 (tansig-tansig-2) | 0.806 | 71.973 | 0.609 |
| | | | WANFIS11 (0.37) | 0.815 | 71.323 | 0.616 |
| | | | WGEP11 | 0.821 | 67.268 | 0.659 |
| 2 | Quantitative hydrologic | Single | ANN12 (tansig-tansig-3) | 0.914 | 70.157 | 0.792 |
| | | | ANFIS12 (0.47) | 0.916 | 65.243 | 0.820 |
| | | | GEP12 | 0.928 | 64.525 | 0.824 |
| | | W-Hybrid | WANN12 (tansig-purelin-6) | 0.921 | 60.530 | 0.724 |
| | | | WANFIS12 (0.39) | 0.930 | 56.735 | 0.757 |
| | | | WGEP12 | 0.932 | 54.659 | 0.775 |
| 3 | Qualitative hydrologic | Single | ANN13 (tansig-purelin-3) | 0.702 | 125.508 | 0.335 |
| | | | ANFIS13 (0.35) | 0.710 | 125.401 | 0.336 |
| | | | GEP13 | 0.723 | 123.352 | 0.358 |
| | | W-Hybrid | WANN13 (logsig-tansig-3) | 0.773 | 93.470 | 0.341 |
| | | | WANFIS13 (0.43) | 0.799 | 88.150 | 0.414 |
| | | | WGEP13 | 0.811 | 80.871 | 0.507 |
| 4 | Land use | Single | ANN14 (logsig-purelin-5) | 0.882 | 77.849 | 0.744 |
| | | | ANFIS14 (0.46) | 0.888 | 74.340 | 0.767 |
| | | | GEP14 | 0.899 | 69.824 | 0.794 |
| | | W-Hybrid | WANN14 (tansig-purelin-4) | 0.894 | 54.931 | 0.773 |
| | | | WANFIS14 (0.37) | 0.900 | 50.916 | 0.805 |
| | | | WGEP14 | 0.919 | 50.796 | 0.805 |
| 5 | Combined | Single | ANN15 (tansig-tansig-3) | 0.940 | 53.549 | 0.879 |
| | | | ANFIS15 (0.31) | 0.974 | 37.169 | 0.942 |
| | | | GEP15 | 0.983 | 33.812 | 0.952 |
| | | W-Hybrid | WANN15 (tansig-tansig-3) | 0.986 | 20.615 | 0.968 |
| | | | WANFIS15 (0.44) | 0.991 | 16.497 | 0.979 |
| | | | WGEP15 | 0.996 | 15.676 | 0.981 |

^a (activation function in hidden layer–activation function in output layer–number of neurons in hidden layer).

^b (radii values).

Table 4

Results of applied models' performance in estimating RWW in the Qazvin study area for the testing section

jFactorsModel typesModelsRRMSE (×103 m3)NSE
Morphologic Single ANN21 (tansig-tansig-2) 0.783 175.696 0.568 
ANFIS21 (0.41) 0.822 154.708 0.665 
GEP21 0.843 153.283 0.671 
W-Hybrid WANN21 (tansig-tansig-5) 0.884 112.081 0.744 
WANFIS21 (0.38) 0.893 111.677 0.746 
WGEP21 0.898 105.532 0.773 
Quantitative hydrologic Single ANN22 (tansig-purelin-3) 0.811 157.653 0.652 
ANFIS22 (0.32) 0.898 124.274 0.784 
GEP22 0.905 116.553 0.810 
W-Hybrid WANN22 (logsig-tansig-3) 0.890 103.929 0.780 
WANFIS22 (0.49) 0.909 101.159 0.791 
WGEP22 0.914 96.615 0.810 
Qualitative hydrologic Single ANN23 (tansig-purelin-3) 0.735 200.801 0.436 
ANFIS23 (0.35) 0.744 185.600 0.518 
GEP23 0.813 162.991 0.629 
W-Hybrid WANN23 (tansig-purelin-4) 0.881 114.095 0.734 
WANFIS23 (0.45) 0.883 106.633 0.768 
WGEP23 0.885 105.018 0.775 
Land use Single ANN24 (tansig-purelin-2) 0.675 206.042 0.406 
ANFIS24 (0.43) 0.691 200.817 0.436 
GEP24 0.706 189.911 0.496 
W-Hybrid WANN24 (tansig-tansig-3) 0.670 176.643 0.364 
WANFIS24 (0.41) 0.719 175.510 0.372 
WGEP24 0.723 160.961 0.472 
Combined Single ANN25 (logsig-purelin-5) 0.926 115.386 0.814 
ANFIS25 (0.34) 0.955 80.644 0.909 
GEP25 0.966 78.017 0.915 
W-Hybrid WANN25 (tansig-tansig-4) 0.963 62.734 0.920 
WANFIS25 (0.44) 0.979 46.565 0.956 
WGEP25 0.990 44.332 0.960 

For the Lavasanat study area (Bsk climate classification), the R, RMSE (×10³ m³), and NSE values of the ANN11 to ANN15 models across the morphological, quantitative and qualitative hydrological, land use, and combined factors ranged from 0.702 to 0.940, 53.549 to 125.508, and 0.335 to 0.879, respectively. For the Qazvin study area (Csa climate classification), the corresponding values of the ANFIS21 to ANFIS25 models ranged from 0.691 to 0.955, 80.644 to 200.817, and 0.436 to 0.909, respectively. In the current study, the Root Relative Squared Error (RRSE) with a pressure tree is selected as the fitness function for the GEP models. The R and RMSE values of the GEP11 to GEP15 and GEP21 to GEP25 models for RWW estimation in the Lavasanat and Qazvin study areas are [R (0.793, 0.928, 0.723, 0.899, 0.983) and RMSE (×10³ m³) (105.024, 64.525, 123.352, 69.824, 33.812)] and [R (0.843, 0.905, 0.813, 0.706, 0.966) and RMSE (×10³ m³) (153.283, 116.553, 162.991, 189.911, 78.017)], respectively. The R, RMSE, and NSE values of the ANN, ANFIS, and GEP models for estimating the RWW variable with all of the aforementioned factors are summarized in Tables 3 and 4 for the testing section of the Lavasanat and Qazvin study areas, respectively.
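The evaluation criteria used above can be illustrated with a minimal Python sketch. It assumes the common textbook definitions of R (Pearson correlation), RMSE, NSE, and RRSE, which may differ in detail from the formulas used in this study; the function names and the sample values are hypothetical and are not taken from the study data.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def pearson_r(obs, est):
    """Pearson correlation coefficient between observed and estimated values."""
    mo, me = _mean(obs), _mean(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)


def rmse(obs, est):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def nse(obs, est):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mo = _mean(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def rrse(obs, est):
    """Root relative squared error: error relative to predicting the observed mean."""
    mo = _mean(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    return math.sqrt(sse / sum((o - mo) ** 2 for o in obs))


# Hypothetical RWW values (x10^3 m^3), for illustration only.
observed = [120.0, 150.0, 210.0, 260.0, 300.0]
estimated = [118.0, 155.0, 205.0, 262.0, 296.0]
print(pearson_r(observed, estimated), rmse(observed, estimated),
      nse(observed, estimated), rrse(observed, estimated))
```

Under these definitions NSE equals 1 minus the square of RRSE, so a GEP fitness function built on RRSE rewards the same models that score well on NSE.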

Table 5

Performance scoring of the applied factors (j = 1, 2, …, 5) to estimate RWW in Bsk and Csa climatic categories

Climatic category: Bsk
Aspects ANN ANFIS GEP WANN WANFIS WGEP
Scoring
Combined Combined Combined Combined Combined Combined 
Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic 
Land use Land use Land use Land use Land use Land use 
Morphologic Morphologic Morphologic Morphologic Morphologic Morphologic 
Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic 
Climatic category: Csa
Methods Scoring ANN ANFIS GEP WANN WANFIS WGEP
Combined Combined Combined Combined Combined Combined 
Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic Quantitative hydrologic 
Qualitative hydrologic Qualitative hydrologic Morphologic Morphologic Morphologic Morphologic 
Morphologic Morphologic Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic Qualitative hydrologic 
Land use Land use Land use Land use Land use Land use 

Wavelet–hybrid values for estimating RWW variable

The W-hybrid (WANNij, WANFISij, and WGEPij) data-driven models are developed with wavelet tools to improve the performance of the single (ANNij, ANFISij, and GEPij) data-driven models for estimating RWW in the Lavasanat and Qazvin study areas. In this method, the first stage is to decompose the variables of the morphological, quantitative and qualitative hydrological, and land use factors into main (approximation) and detail subseries (the Al and Dl series) by applying the db4 wavelet. To develop the W-hybrid RWW estimation models, the detail subseries Dl (l = 1, 2) are treated as noise and removed from the models. After de-noising, the decomposed subseries values are estimated separately with the single models. The R, RMSE, and NSE values and the specifications of the optimal W-hybrid models (WANN, WANFIS, and WGEP) for the morphological, quantitative and qualitative hydrological, land use, and combined factors are listed in Tables 3 and 4 for the testing section of the Lavasanat (Bsk climate) and Qazvin (Csa climate) study areas, respectively. The R and RMSE (×10³ m³) values for estimating RWW with the WANN11 to WANN15 and WANN21 to WANN25 models varied from 0.773 to 0.986 and 20.615 to 93.470 in the Lavasanat study area and from 0.670 to 0.963 and 62.734 to 176.643 in the Qazvin study area, respectively. The R and RMSE (×10³ m³) values of the WANFIS11 to WANFIS15 and WANFIS21 to WANFIS25 models ranged from 0.799 to 0.991 and 16.497 to 88.150 in the Lavasanat study area and from 0.719 to 0.979 and 46.565 to 175.510 in the Qazvin study area. Finally, the R and RMSE values of the WGEP11 to WGEP15 and WGEP21 to WGEP25 models for RWW estimation in the Lavasanat and Qazvin study areas are [R (0.821, 0.932, 0.811, 0.919, 0.996) and RMSE (×10³ m³) (67.268, 54.659, 80.871, 50.796, 15.676)] and [R (0.898, 0.914, 0.885, 0.723, 0.990) and RMSE (×10³ m³) (105.532, 96.615, 105.018, 160.961, 44.332)], respectively.
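The de-noising step described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses a hand-written periodic discrete wavelet transform with the standard 8-tap db4 filter, a two-level decomposition, and removal of both detail subseries (D1 and D2) by setting them to zero before reconstruction. The boundary handling, the function names, and the sample series are assumptions and do not reproduce the software or data used in this study.

```python
import math

# Daubechies db4 scaling (low-pass) filter: 8 taps.
DB4_LOW = [
    0.2303778133088964, 0.7148465705529154, 0.6308807679298587,
    -0.0279837694168599, -0.1870348117190931, 0.0308413818355607,
    0.0328830116668852, -0.0105974017850690,
]
# Matching high-pass filter from the quadrature-mirror relation.
DB4_HIGH = [(-1) ** n * DB4_LOW[len(DB4_LOW) - 1 - n] for n in range(len(DB4_LOW))]


def dwt_step(signal):
    """One periodic DWT level: returns (approximation, detail), each half as long."""
    n = len(signal)
    if n % 2:
        raise ValueError("signal length must be even at every level")
    approx, detail = [], []
    for k in range(n // 2):
        window = [signal[(2 * k + j) % n] for j in range(len(DB4_LOW))]
        approx.append(sum(h * x for h, x in zip(DB4_LOW, window)))
        detail.append(sum(g * x for g, x in zip(DB4_HIGH, window)))
    return approx, detail


def idwt_step(approx, detail):
    """Inverse of dwt_step (transpose of the orthogonal analysis transform)."""
    n = 2 * len(approx)
    signal = [0.0] * n
    for k, (a, d) in enumerate(zip(approx, detail)):
        for j in range(len(DB4_LOW)):
            signal[(2 * k + j) % n] += DB4_LOW[j] * a + DB4_HIGH[j] * d
    return signal


def denoise(signal, levels=2):
    """Decompose `levels` times, drop every detail subseries, and reconstruct."""
    approx = list(signal)
    sizes = []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        sizes.append(len(detail))
    for size in reversed(sizes):
        approx = idwt_step(approx, [0.0] * size)  # details treated as noise
    return approx


# Hypothetical input series: a smooth signal plus alternating noise.
series = [100.0 + 10.0 * math.sin(0.3 * i) + (-1) ** i for i in range(16)]
smoothed = denoise(series, levels=2)
print([round(v, 2) for v in smoothed])
```

The series length must be divisible by 2 raised to the number of levels. In practice a wavelet library would normally replace these hand-written loops.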

Figures 4–6 show the observed and estimated RWW values for the single models and their W-hybrid models based on various factors for the study areas in the testing section. The R values of the developed data-driven models are close to 1, and for all applied factors in both study areas they are ordered as R_GEP > R_ANFIS > R_ANN and R_WGEP > R_WANFIS > R_WANN. In terms of RMSE reduction, the improvement of WGEP over GEP for the morphological, quantitative and qualitative hydrological, land use, and combined factors is 35.95%, 15.29%, 34.44%, 27.25%, and 53.64% in the Lavasanat study area and 31.15%, 17.11%, 35.57%, 15.24%, and 43.18% in the Qazvin study area, respectively.
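The improvement percentages quoted above are consistent with the relative RMSE reduction of WGEP with respect to GEP, computed from the RMSE values reported in the text. The short Python check below makes that arithmetic explicit; the formula is inferred from the reported numbers rather than stated in the source, and the variable names are illustrative.

```python
# RMSE values (x10^3 m^3) reported in the text, ordered by factor:
# morphologic, quantitative hydrologic, qualitative hydrologic, land use, combined.
GEP_RMSE = {
    "Lavasanat": [105.024, 64.525, 123.352, 69.824, 33.812],
    "Qazvin": [153.283, 116.553, 162.991, 189.911, 78.017],
}
WGEP_RMSE = {
    "Lavasanat": [67.268, 54.659, 80.871, 50.796, 15.676],
    "Qazvin": [105.532, 96.615, 105.018, 160.961, 44.332],
}


def rmse_improvement(single, hybrid):
    """Percentage reduction in RMSE achieved by the hybrid model."""
    return 100.0 * (single - hybrid) / single


improvements = {
    area: [round(rmse_improvement(g, w), 2)
           for g, w in zip(GEP_RMSE[area], WGEP_RMSE[area])]
    for area in GEP_RMSE
}
for area, values in improvements.items():
    print(area, values)
```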

Figure 4

Observed and estimated RWW values based on morphological, quantitative and qualitative hydrological, and land use factors using single and W-hybrid methods in the Lavasanat study area for the testing section (i = 1, j = 1, …, 4).

Figure 5

Observed and estimated RWW values based on morphological, quantitative and qualitative hydrological, and land use factors using single and W-hybrid methods in the Qazvin study area for the testing section (i = 2, j = 1, …, 4).

Figure 6

Observed and estimated RWW values based on variables of combined factors using single and W-hybrid methods in study areas for the testing section (i = 1, 2, j = 5).

The results of the applied models for various factors in the Lavasanat and Qazvin study areas are compared in Figures 7 and 8, respectively. For the testing section, the ANFIS model outperforms the ANN model and the WANFIS model outperforms the WANN model, while the GEP and WGEP models estimate RWW better than the other two single and W-hybrid models, respectively, in both study areas and for all factors and climates.

Figure 7

Results comparison of applied models with various factors to estimate RWW in the Lavasanat study area for the testing section.

Figure 8

Results comparison of applied models with various factors to estimate RWW in the Qazvin study area for the testing section.

The Taylor diagram quantifies the degree of correspondence between the observations and the estimations in terms of R, RMSE, and the standard deviation (Taylor 2005). The performance of the WGEP models for estimating RWW with various factors in the Lavasanat and Qazvin study areas is compared in Figure 9. According to Figure 9, the combined factor, which includes the RW, QMin, QMean, EC, and CA variables, performs best in estimating RWW values compared with the other factors in both study areas.
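The three quantities summarized by a Taylor diagram are linked by a law-of-cosines identity, which is what allows them to be drawn on a single plot. The sketch below computes them for one model. It assumes population standard deviations and the centred (bias-removed) RMS difference that Taylor diagrams conventionally use; the function name and the sample values are hypothetical.

```python
import math


def taylor_statistics(observed, estimated):
    """Return the standard deviations, correlation, and centred RMS difference."""
    n = len(observed)
    mo = sum(observed) / n
    me = sum(estimated) / n
    std_o = math.sqrt(sum((o - mo) ** 2 for o in observed) / n)
    std_e = math.sqrt(sum((e - me) ** 2 for e in estimated) / n)
    r = sum((o - mo) * (e - me) for o, e in zip(observed, estimated)) / (n * std_o * std_e)
    crmsd = math.sqrt(
        sum(((e - me) - (o - mo)) ** 2 for o, e in zip(observed, estimated)) / n
    )
    return {"std_obs": std_o, "std_est": std_e, "r": r, "crmsd": crmsd}


# Hypothetical RWW values (x10^3 m^3), for illustration only.
obs = [120.0, 150.0, 210.0, 260.0, 300.0]
est = [125.0, 148.0, 200.0, 270.0, 290.0]
stats = taylor_statistics(obs, est)

# Identity behind the diagram: crmsd^2 = std_obs^2 + std_est^2 - 2*std_obs*std_est*r
rhs = stats["std_obs"] ** 2 + stats["std_est"] ** 2 \
    - 2.0 * stats["std_obs"] * stats["std_est"] * stats["r"]
print(stats, rhs)
```

A model plots closer to the observation point when its correlation is higher and its standard deviation is closer to that of the observations.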

Figure 9

Comparing the performance of the WGEP models for estimating RWW with various factors in the Lavasanat and Qazvin study areas for the testing section.

The box-plot is one of the most popular charts for showing descriptive statistics of the estimated RWW data. It is based on five values: the ‘minimum’, ‘first quartile (25%)’, ‘median (50%)’, ‘third quartile (75%)’, and ‘maximum’. The chart also shows the symmetry of the data, as well as their degree of concentration and skewness. Figure 10 displays the boxplots of the estimated RWW values based on the combined factor for the Bsk and Csa climate types. According to the chart, the estimated RWW values of the Bsk climate category have the higher median and the lower coefficient of variation compared with the Csa climate category.
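The five box-plot values and the coefficient of variation mentioned above can be computed with the Python standard library, as in the sketch below. The quartile convention (the inclusive method) and the use of the sample standard deviation are assumptions, since the source does not state which conventions were used; the function names and data are illustrative.

```python
import statistics


def boxplot_summary(values):
    """Five-number summary used to draw a box-plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "minimum": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": max(values),
    }


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical estimated RWW values, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = boxplot_summary(sample)
cv = coefficient_of_variation(sample)
print(summary, cv)
```

Comparing the medians and coefficients of variation of two such summaries is the kind of comparison made between the Bsk and Csa categories in Figure 10.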

Figure 10

Boxplots of optimal RWW estimated values of WGEP with the combined factor for Bsk and Csa climate categories.

Figure 11 shows the 95% confidence and prediction bands of the WGEP models (combined factor) in the Lavasanat and Qazvin study areas. These intervals indicate the reliability of the WGEP model's predictions with the combined factor for estimating the RWW variable. The confidence band indicates how well the research data define the best-fit curve, whereas the prediction band covers both the uncertainty in the true position of the curve (enclosed by the confidence band) and the scatter of the data around the curve. Estimating uncertainty enables water managers to analyze a wide range of sustainable management practices and identify the ones that are most robust for all factors (Adib et al. 2019). To give more detail on the effect of the input variables on RWW estimation, the Tornado sensitivity analysis of the WGEP models (combined factor) at 5% and 95% is also plotted in Figure 11. At 95%, the (highest, lowest) sensitivities of the RWW estimation models are related to the (QMean, RW) input variables in the Lavasanat study area and the (QMean, EC) input variables in the Qazvin study area.
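The difference between the two bands can be made concrete with a simple straight-line fit of estimated against observed values. The sketch below is an assumption-laden illustration: it uses ordinary least squares and a normal quantile as a large-sample stand-in for the Student t quantile, and it is not the procedure used to produce Figure 11. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean


def linear_fit_bands(x, y, x0, level=0.95):
    """Fitted value at x0 with half-widths of the confidence and prediction bands."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))                 # residual standard error
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # assumption: normal, not Student t
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    fit = intercept + slope * x0
    confidence_half_width = z * s * math.sqrt(leverage)        # where the line lies
    prediction_half_width = z * s * math.sqrt(1.0 + leverage)  # where a new point lies
    return fit, confidence_half_width, prediction_half_width


# Hypothetical observed (x) and estimated (y) values, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
fit_mid, ci_mid, pi_mid = linear_fit_bands(x, y, 3.5)
fit_edge, ci_edge, pi_edge = linear_fit_bands(x, y, 6.0)
print(fit_mid, ci_mid, pi_mid)
```

The extra 1 under the square root is the scatter of individual points, which is why the prediction band always encloses the confidence band and why both widen away from the centre of the data.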

Figure 11

Confidence and prediction bands (95%) and Tornado sensitivity at 5% and 95% of WGEP (combined factor) for the Lavasanat and Qazvin study areas.

The unique structure of each data-driven method is the main reason for the differences in the models' efficiency. The production of genes and chromosomes in the GEP method makes it considerably more efficient than the other models. The ANFIS method, which combines fuzzy rules with a neural network structure, is also relatively more efficient than the ANN method. In all data-driven methods and climatic categories, the efficiency of the single models increases significantly when they are combined with wavelet theory, because de-noising the data helps the models capture complex nonlinear relationships. Climatic categories affect only the models' performance values, not their ranking. Nevertheless, the results of both the single and the W-hybrid methods can be considered acceptable for estimating the RWW variable in the Lavasanat and Qazvin study areas. Table 5 summarizes the performance scoring of the applied factors for estimating RWW in the Bsk and Csa climate categories for the testing section. The results indicate that the WGEP and ANN models are the best and poorest models, respectively, in both study areas regardless of climate conditions. Also, the combined factor, which includes the RW, QMin, QMean, EC, and CA variables, is the best factor for estimating RWW variables compared with the other factors in both the Bsk and Csa climate categories. In contrast, the qualitative hydrological and land use factors were the weakest factors for estimating RWW variables in the study areas. The performance scoring of the applied models for estimating RWW in the Bsk and Csa climatic categories, from best to poorest, is as follows: WGEP, WANFIS, WANN, GEP, ANFIS, and ANN.

The performance of the W-hybrid models, based on R values, for the Bsk and Csa climatic categories with the combined factor in three ranges (30%Min, 40%Mid, 30%Max) is listed in Table 6. In the testing section, the three ranges of 30%Min, 40%Mid, and 30%Max correspond to RWW ≤ 121.61, 121.61 < RWW < 220.50, and RWW ≥ 220.50 (×10³ m³) for the Bsk climatic condition and to RWW ≤ 381.18, 381.18 < RWW < 1,568.19, and RWW ≥ 1,568.19 (×10³ m³) for the Csa climatic condition. The performance of the WGEP models in all three ranges is highly acceptable (R > 0.900) for estimating RWW in both climatic conditions.
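The range-wise evaluation described above can be sketched as follows: the testing pairs are split by the observed RWW value into the three ranges, and R is computed within each range. The thresholds in the example are the Bsk values quoted in the text, while the function names and the observed and estimated values are hypothetical. Whether the outer ranges include their boundary values is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (needs at least two non-constant points)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def split_into_ranges(observed, estimated, low, high):
    """Group (observed, estimated) pairs into the 30%Min, 40%Mid, and 30%Max ranges."""
    ranges = {"30%Min": [], "40%Mid": [], "30%Max": []}
    for o, e in zip(observed, estimated):
        if o <= low:
            ranges["30%Min"].append((o, e))
        elif o < high:
            ranges["40%Mid"].append((o, e))
        else:
            ranges["30%Max"].append((o, e))
    return ranges


def r_by_range(observed, estimated, low, high):
    """R within each range; None when a range holds fewer than two pairs."""
    result = {}
    for name, pairs in split_into_ranges(observed, estimated, low, high).items():
        if len(pairs) < 2:
            result[name] = None
        else:
            obs, est = zip(*pairs)
            result[name] = pearson_r(obs, est)
    return result


# Hypothetical testing data (x10^3 m^3) with the Bsk thresholds from the text.
obs = [90.0, 105.0, 120.0, 130.0, 160.0, 190.0, 215.0, 230.0, 260.0, 300.0]
est = [93.0, 101.0, 124.0, 128.0, 165.0, 186.0, 219.0, 228.0, 266.0, 295.0]
print(r_by_range(obs, est, low=121.61, high=220.50))
```

Because each range covers a narrower spread of RWW values than the full series, R within a range is typically lower than R over the whole testing section for the same model.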

Table 6

Performance (R-values) of W-hybrid models to estimate RWW with combined factor in three ranges for Bsk and Csa climatic categories for the testing section

Climatic category: Bsk
Methods WGEP15 WANFIS15 WANN15
30%Min (RWW ≤ 121.61)a 0.967 0.930 0.922
40%Mid (121.61 < RWW < 220.50) 0.941 0.928 0.921
30%Max (RWW ≥ 220.50) 0.995 0.988 0.983
Climatic category: Csa
Methods WGEP25 WANFIS25 WANN25
30%Min (RWW ≤ 381.18) 0.977 0.638 0.622
40%Mid (381.18 < RWW < 1,568.19) 0.919 0.912 0.812
30%Max (RWW ≥ 1,568.19) 0.973 0.906 0.841

a(×10³ m³).

The most important advantage of GEP and WGEP over the other applied data-driven methods is that they produce explicit predictive equations. The mathematical equations derived from the GEP and WGEP models for estimating RWW in the Lavasanat and Qazvin study areas are listed in Table 7. These equations can be used at various spatial–temporal scales. For the combined WGEP models, the R values of the extracted equations in the validation section were 0.865 and 0.897 in Lavasanat and Qazvin, respectively. These results indicate that the extracted equations can be applied in various basins, which has a significant impact on river basin management.

Table 7

Mathematical equations derived from the GEP and WGEP models for estimating RWW for the testing section

i j Models Equations
[The equation column is empty in this text version: the table lists ten GEP/WGEP model pairs, but no equations are present.]

Vafakhah & Bozchaloei (2020) analyzed flow duration curves with ANN and SVR methods in the Namak Lake basin using the Height (H), Area (A), Rangeland Area (RA), Drainage Density (DD), Permeable Formation (PF), and average Stream Slope (SS) variables. Their findings indicated that the R² values of the best ANN and SVR models were 0.94 and 0.96, respectively. In our research, the best R values were 0.996 and 0.990 for Lavasanat and Qazvin, respectively. Thus, sampled data on various related variables can be used to estimate hydrological variables in the Namak Lake basin. The results of the current study are in line with the findings of Adamowski & Sun (2010), Montaseri et al. (2018), and Kumar et al. (2020) regarding the superior performance of W-hybrid models in estimating quantitative and qualitative hydrological variables.

In recent years, the lack of proper water resources management, growing demand for natural resources, population growth, and climate change have put great pressure on water resources. As a result, water scarcity has become one of the most important challenges for mankind in the last decade. The main purpose of this study is to model river water withdrawal for agricultural uses in the Lavasanat and Qazvin study areas, which have different climatic conditions, while considering morphological, quantitative and qualitative hydrological, and land use factors. Researchers apply hybrid data-driven methods instead of traditional methods because they save time and money. Therefore, in this study, the ANN, WANN, ANFIS, WANFIS, GEP, and WGEP models were used to estimate the RWW variable according to the various factors. Climate change and the resulting changes in surface water resources act on morphological, hydrological, and land use factors through nonlinear multivariate conditions, which has led researchers to use nonlinear models. Employing nonlinear methods such as hybrid data-driven methods is one way to overcome this problem. The results of this study showed that combining data-driven models with wavelet theory can improve model performance. This study also confirms the high dependence of RWW for agricultural uses on morphological, quantitative and qualitative hydrological, and land use factors. The best way to estimate RWW is to combine the more effective variables of these four factors. Since the variables used in this study were collected via field sampling in the study areas, the researchers faced limitations in measuring and collecting all the variables related to the selected factors. Therefore, it is recommended to estimate RWW by measuring more variables related to morphological, quantitative and qualitative hydrological, and land use factors. In addition, RWW for agricultural uses could be affected by further characteristics such as weather variables and population, which are suggested for inclusion in future research. Coupling data-driven models with other decomposition tools such as MODWT and MODWT-MRA to estimate RWW could also be very useful in water resources management at the basin scale.

The authors thank Sari Agricultural Science and Natural Resources University (SANRU) for its financial support [Grant No. 02-1399-26].

None.

All relevant data are included in the paper or its Supplementary Information.

Abbaspour K. C., Faramarzi M., Ghasemi S. S. & Yang H. 2009 Assessing the impact of climate change on water resources in Iran. Water Resources Research 45 (10), W10434.
Barzegar R., Adamowski J. & Moghaddam A. A. 2016 Application of wavelet–artificial intelligence hybrid models for water quality prediction: a case study in Aji-Chay River, Iran. Stochastic Environmental Research and Risk Assessment 30 (7), 1797–1819.
Birbal P., Azamathulla H., Leon L., Kumar V. & Hosein J. 2021 Predictive modelling of the stage–discharge relationship using Gene-Expression Programming. Water Supply, doi: 10.2166/ws.2021.111.
Bissenbayeva S., Abuduwaili J., Saparova A. & Ahmed T. 2021 Long-term variations in runoff of the Syr Darya River Basin under climate change and human activities. Journal of Arid Land 13 (1), 56–70.
Böhme B., Becker M., Diekkrüger B. & Förch G. 2016 How is water availability related to the land use and morphology of an inland valley wetland in Kenya? Physics and Chemistry of the Earth, Parts A/B/C 93, 84–95.
Chen Y., Song L., Liu Y., Yang L. & Li D. 2020 A review of the artificial neural network models for water quality prediction. Applied Sciences 10 (17), 5776.
Eris E., Aksoy H., Onoz B., Cetin M., Yuce M. I., Selek B., Aksu H., Burgan H. I., Esit M., Yildirim I. & Karakus E. U. 2019 Frequency analysis of low flows in intermittent and non-intermittent rivers from hydrological basins in Turkey. Water Supply 19 (1), 30–39.
Eris E., Cavus Y., Aksoy H., Burgan H. I., Aksu H. & Boyacioglu H. 2020 Spatiotemporal analysis of meteorological drought over Kucuk Menderes River Basin in the Aegean Region of Turkey. Theoretical and Applied Climatology 142 (3), 1515–1530.
Ferreira C. 2001 Gene expression programming: a new adaptive algorithm for solving problems. Complex Systems 13 (2), 87–129.
Ferreira C. 2006 Designing neural networks using gene expression programming. In: Applied Soft Computing Technologies: The Challenge of Complexity (A. Abraham, B. de Baets, M. Köppen & B. Nickolay, eds), Springer, Berlin, Germany, pp. 517–535.
Ghalehkhondabi I., Ardjmand E., Young W. A. & Weckman G. R. 2017 Water demand forecasting: review of soft computing methods. Environmental Monitoring and Assessment 189 (7), 313.
Ghavidel S. Z. Z. & Montaseri M. 2014 Application of different data-driven methods for the prediction of total dissolved solids in the Zarinehroud basin. Stochastic Environmental Research and Risk Assessment 28 (8), 2101–2118.
Hohensinner S., Hauer C. & Muhar S. 2018 River morphology, channelization, and habitat restoration. In: Riverine Ecosystem Management (S. Schmutz & J. Sendzimir, eds), Springer, Cham, Switzerland, pp. 41–65.
Available from: http://wrbs.wrm.ir/
Jeihouni E., Mohammadi M., Eslamian S. & Zareian M. J. 2019 Potential impacts of climate change on groundwater level through hybrid soft-computing methods: a case study – Shabestar Plain, Iran. Environmental Monitoring and Assessment 191 (10), 620.
Kumar M., Kumari A., Kushwaha D. P., Kumar P., Malik A., Ali R. & Kuriqi A. 2020 Estimation of daily stage–discharge relationship by using data-driven techniques of a perennial river, India. Sustainability 12 (19), 7877.
Kumar Y. V. & Mehta D. J. 2020 Water productivity enhancement through controlling the flood inundation of the surrounding region of Navsari Purna River, India. Water Productivity Journal 1 (2), 11–20.
Magritsky D. V., Frolova N. L. & Pakhomova O. M. 2020 Potential hydrological restrictions on water use in the basins of rivers flowing into Russian Arctic Seas. Geography, Environment, Sustainability 13 (2), 25–34.
Mallat S. G. 1989 A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (7), 674–693.
Manian V. & Vásquez R. 1998 Scaled and rotated texture classification using a class of basis functions. Pattern Recognition 31 (12), 1937–1948.
Mehta D. J. & Yadav S. M. 2020 Hydrodynamic simulation of river Ambica for riverbed assessment: a case study of Navsari Region. In: Advances in Water Resources Engineering and Management (R. AlKhaddar, R. K. Singh, S. Dutta & M. Kumari, eds), Springer, Singapore, pp. 127–140.
Mehta D., Yadav S. M. & Waikhom S. 2013 Geomorphic channel design and analysis using HEC-RAS hydraulic design functions. Global Research Analysis 2 (4), 90–93.
Mehta D. J., Ramani M. & Joshi M. 2014 Application of 1-D HEC-RAS model in design of channels. International Journal of Innovative Research in Advanced Engineering 1 (7), 103–107.
Mehta D., Yadav S. M., Waikhom S. & Prajapati K. 2020 Stable channel design of Tapi River using HEC-RAS for Surat Region. In: Environmental Processes and Management (R. M. Singh, P. Shukla & P. Singh, eds), Springer, Cham, Switzerland, pp. 25–36.
Montaseri M., Ghavidel S. Z. Z. & Sanikhani H. 2018 Water quality variations in different climates of Iran: toward modeling total dissolved solid using soft computing techniques. Stochastic Environmental Research and Risk Assessment 32 (8), 2253–2273.
Msigwa A., Komakech H. C., Verbeiren B., Salvadore E., Hessels T., Weerasinghe I. & van Griensven A. 2019 Accounting for seasonal land use dynamics to improve estimation of agricultural irrigation water withdrawals. Water 11 (12), 2471.
Patel S. B., Mehta D. J. & Yadav S. M. 2018 One dimensional hydrodynamic flood modeling for Ambica River, South Gujarat. Journal of Emerging Technologies and Innovative Research 5 (4), 595–601.
Rahmanpanah H., Mouloodi S., Burvill C., Gohari S. & Davies H. M. S. 2020 Prediction of load–displacement curve in a complex structure using artificial neural networks: a study on a long bone. International Journal of Engineering Science 154, 103319.
Rajaee T., Khani S. & Ravansalar M. 2020 Artificial intelligence-based single and hybrid models for prediction of water quality in rivers: a review. Chemometrics and Intelligent Laboratory Systems 200, 103978.
Sharafati A., Nabaei S. & Shahid S. 2020 Spatial assessment of meteorological drought features over different climate regions in Iran. International Journal of Climatology 40 (3), 1864–1884.
Sheikh Z., Yazdani M. R. & Nia A. M. 2020 Spatiotemporal changes of 7-day low flow in Iran's Namak Lake Basin: impacts of climatic and human factors. Theoretical and Applied Climatology 139 (1), 57–73.
Shirmohammadi B., Malekian A., Salajegheh A., Taheri B., Azarnivand H., Malek Z. & Verburg P. H. 2020 Scenario analysis for integrated water resources management under future land use change in the Urmia Lake region, Iran. Land Use Policy 90, 104299.
Sun Y., Niu J. & Sivakumar B. 2019 A comparative study of models for short-term streamflow forecasting with emphasis on wavelet-based approach. Stochastic Environmental Research and Risk Assessment 33 (10), 1875–1891.
Takagi T. & Sugeno M. 1985 Fuzzy identification of systems and its applications to modeling and control. IEEE Transactions on Systems, Man, and Cybernetics SMC-15 (1), 116–132.
Vafakhah M. & Bozchaloei S. K. 2020 Regional analysis of flow duration curves through support vector regression. Water Resources Management 34 (1), 283–294.
Vondracek B., Blann K. L., Cox C. B., Nerbonne J. F., Mumford K. G., Nerbonne B. A., Sovell L. A. & Zimmerman J. K. H. 2005 Land use, spatial scale, and stream systems: lessons from an agricultural region. Environmental Management 36 (6), 775–791.
Yaseen Z. M., Awadh S. M., Sharafati A. & Shahid S. 2018 Complementary data-intelligence model for river flow simulation. Journal of Hydrology 567, 180–190.
ZamanZad-Ghavidel S., Bozorg-Haddad O. & Goharian E. 2020 Sustainability assessment of water resource systems using a novel hydro-socio-economic index (HSEI). Environment, Development and Sustainability 23 (2), 1869–1916.
Zhang Z., Zhang Q. & Singh V. P. 2018 Univariate streamflow forecasting using commonly used data-driven models: literature review and case study. Hydrological Sciences Journal 63 (7), 1091–1111.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).