## Abstract

Predicting hydrological parameters is one of the most important foundations of catchment management and the sustainable use of water resources. In this study, the support vector machine (SVM), the support vector machine combined with wavelet transform (W-SVM), the autoregressive moving average with exogenous variable (ARMAX) model, and the autoregressive integrated moving average (ARIMA) model were used to predict monthly values of precipitation, discharge, and evaporation. For this purpose, monthly time series from rain-gauge, hydrometric, and evaporation-gauge stations located in the Hamedan catchment area over a 25-year period (1991–2015) were used. Of this period, 17 years (1991–2007), 4 years (2008–2011), and 4 years (2012–2015) were used for training, calibration, and validation of the models, respectively. The results showed that ARIMA, SVM, ARMAX, and W-SVM ranked first to fourth in monthly precipitation prediction, whereas SVM, ARIMA, ARMAX, and W-SVM ranked first to fourth in monthly discharge and monthly evaporation prediction. The SVM has fewer adjustable parameters than the other models, so it can predict hydrological changes more easily and in less time, which makes it preferable to the other methods.

## ABBREVIATIONS

- SVM: Support vector machine
- W-SVM: Support vector machine combined with wavelet transform
- ARMAX: Autoregressive moving average with exogenous variable
- ARIMA: Autoregressive integrated moving average
- ANN: Artificial neural network
- GA: Genetic algorithms
- GP: Genetic programming
- NF: Fuzzy neural network
- LS-SVR: Least squares support vector regression
- ANFIS: Adaptive neuro-fuzzy inference system
- *r*: Correlation coefficient
- *RMSE*: Root mean square error
- *SE*: Standard error

## INTRODUCTION

Water shortage and the uneven distribution of water are among the biggest concerns of the present century and will remain a problem for humanity in the future. Although about 70% of the Earth's surface is covered with water, the water crisis in many countries, including those in the dry belt of the Earth such as Iran, grows more complex every day. Moreover, as the quantity and quality of water decline, water resources become more constrained each year while consumption of and demand for water keep increasing. The sustainable exploitation of this resource therefore requires proper management, which in turn requires precise prediction of hydrological parameters such as precipitation, discharge, and evaporation.

Kisi & Cimen (2012) showed that using a hybrid model increases the accuracy of daily precipitation prediction. Shafaei *et al.* (2016) found that the wavelet-neural network and wavelet-SARIMA hybrid models perform better than the neural network and SARIMA models in predicting precipitation. Hamidi *et al.* (2014) used the SVM and artificial neural network (ANN) models to predict monthly precipitation and found that the SVM is more efficient than the ANN. Shenify *et al.* (2015) showed that the combination of the wavelet and SVM model performs better than the ANN and genetic algorithms (GA) in estimating monthly precipitation.

In addition, 57% of the water that falls on land as precipitation evaporates directly, and much of the annual precipitation in arid and semi-arid climates, which predominate in Iran, returns immediately to the atmosphere. Evaporation is therefore one of the most important atmospheric factors in water resource planning and management in arid and semi-arid regions. Estimating evaporation is highly significant for planning and managing agricultural water resources, determining cropping patterns, and managing reservoirs (Bazrafshan *et al.* 2017). Shiri & Kisi (2011) used genetic programming (GP), fuzzy neural network (NF), and ANN models to predict daily evaporation. Their results showed that the GP model is more capable than the other two models. Goyala *et al.* (2014) found that the LS-SVR and fuzzy logic models perform very well in predicting daily evaporation. In another study, Pammar & Deka (2015) used the SVM and W-SVM models to predict daily evaporation, and their results indicated that the W-SVM predicts more accurately than the SVM.

Discharge is another basic parameter: beyond its importance for design and planning, it can pose a major security challenge for the country. In recent decades, flow forecasting, as one of the most important challenges in water resources management, has compelled researchers to present and apply different computerized models in various studies (Noori *et al.* 2011). Among the studies on the prediction and modeling of discharge, the following can be mentioned. Danandeh Mehr *et al.* (2013) showed that the linear genetic programming model performs better than the wavelet neural network model in monthly discharge prediction. Adnan *et al.* (2017a, 2017b) used ARMA and ARIMA models to predict monthly flow, and their results indicated that the ARIMA model performs better than the ARMA model. Ravansalar *et al.* (2017) compared linear genetic programming combined with wavelet transform (WLGP), linear genetic programming, ANN, and wavelet neural network models for predicting monthly discharge. The results showed that the WLGP model significantly increases the accuracy of prediction. Adnan *et al.* (2017a, 2017b) found that the SVM model was more powerful than the ANN in predicting monthly flow.

A review of the literature shows that researchers have examined and recommended different models for predicting time series. However, the results of the recommended models have not yet been compared with one another. Nor has any study evaluated the accuracy of these models in predicting the three significant hydrological parameters of precipitation, evaporation, and discharge together. The purpose of this study was therefore to evaluate the performance of the SVM, W-SVM, ARMAX, and ARIMA models in predicting monthly precipitation, evaporation, and discharge, and to identify the most suitable model for each parameter, so that the results can be used by managers and planners in the water sector.

## METHODS

### The study area

Hamedan province covers an area of 20,172 square kilometers, which accounts for 2.1% of the total area of the country. It is located between 33°59′ and 35°49′ northern latitude and between 47°34′ and 49°34′ eastern longitude from the Greenwich meridian. The stations studied include the hydrometric stations of Taghsime Aab, Sulan, and Yalfan, the rain-gauge stations of Ekbatan dam, Maryanaj, and Aghajan Bolaghi, and the evaporation-gauge stations of Ekbatan dam, Ghahavand, and Kushk Abad, all located in the Ghare Chay Basin. Figure 1 shows the location of these stations and Table 1 lists their characteristics. Different stations were used for precipitation, discharge, and evaporation because no single station recorded all three parameters. The time series of all the parameters are plotted in Figure 2.

Table 1. Characteristics of the studied stations

| Row | Station type | River name | Station name | Latitude | Longitude | Height |
|---|---|---|---|---|---|---|
| 1 | Hydrometric | Abas abad | Taghsimeab | 34-45-58 | 48-57-15 | 2,088 |
| 2 | Hydrometric | Maryanaj | Sulan | 34-49-58 | 48-25-03 | 1,979 |
| 3 | Hydrometric | Aabshineh | Yalfan | 34-43-47 | 48-36-41 | 1,999 |
| 4 | Rain-gauge | – | Ekbatan dam | 36-45-34 | 48-36-11 | 1,957 |
| 5 | Rain-gauge | – | Aghajan Bolaghi | 34-50-53 | 48-03-07 | 1,802 |
| 6 | Rain-gauge | – | Maryanaj | 34-49-41 | 48-27-28 | 1,841 |
| 7 | Evaporation-gauge | – | Ekbatan dam | 36-45-34 | 48-36-11 | 1,957 |
| 8 | Evaporation-gauge | – | Ghahavand | 34-51-42 | 48-59-55 | 1,554 |
| 9 | Evaporation-gauge | – | Kushk Abad | 35-02-09 | 48-33-47 | 1,702 |

### The SVM model

In SVM regression, the relationship between the dependent variable (*y*) and the independent values is expressed in the form of a certain function (*f*) plus an additional value, and the main purpose is to find the form of the function *f* so as to correctly predict cases that the SVM has not yet experienced (Noori *et al.* 2009). In fact, the SVM is an efficient learning system that uses the principle of structural risk minimization to reach an optimal answer (Cristianini & Shawe-Taylor 2000). Two prominent features, excellent generalization capability and compatibility with sparse and scarce data, help the SVM to provide useful predictions (Behzad *et al.* 2010). The SVM has two types of regression, type I and type II, denoted SVM-*v* and SVM-*ɛ*, respectively; notably, the latter type is more applicable in regression problems. This model minimizes the following error function (Equation (1)) subject to the constraints in Equation (2) (Hamel 2009):

$$\frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \xi_i + C \sum_{i=1}^{N} \xi_i^{*} \tag{1}$$

$$\begin{aligned} w^{T} \phi(x_i) + b - y_i &\le \varepsilon + \xi_i^{*} \\ y_i - w^{T} \phi(x_i) - b &\le \varepsilon + \xi_i \\ \xi_i,\ \xi_i^{*} &\ge 0, \quad i = 1, \dots, N \end{aligned} \tag{2}$$

Here, *C* is the capacity constant, *w* is the vector of coefficients, *b* is a constant, and $\xi_i$ and $\xi_i^{*}$ represent parameters for handling non-separable data (inputs). The index *i* labels the *N* training cases. The kernel $\phi$ is used to transform data from the input (independent) space to the feature space. This model can solve nonlinear problems by changing the dimensions of the problem by means of kernel functions. Choosing the right kernel depends on the volume of the training data and the dimensions of the attribute vector. In practice, there are four types of kernels (Yu *et al.* 2006); the names and mathematical formulas of these kernels are presented in Table 2. In these relations, *γ* and *C* are the kernel-related parameters and *d* is the polynomial degree.

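Table 2. Kernel functions

| Kernel | Formula |
|---|---|
| Linear | $K(x_i, x_j) = x_i^{T} x_j$ |
| Polynomial | $K(x_i, x_j) = (\gamma\, x_i^{T} x_j + C)^{d}$ |
| Hyperbolic tangent | $K(x_i, x_j) = \tanh(\gamma\, x_i^{T} x_j + C)$ |
| RBF | $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^{2})$ |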
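
The four kernels named in Table 2 can be sketched in plain Python. This is an illustration only: it assumes the standard textbook forms of the kernels, and the function names and the `gamma`, `coef`, and `degree` arguments are hypothetical stand-ins for *γ*, the kernel constant *C*, and *d*. It is not the authors' MATLAB code.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef=0.0, degree=3):
    # (gamma * <x, y> + coef) ** degree
    return (gamma * dot(x, y) + coef) ** degree


def tanh_kernel(x, y, gamma=1.0, coef=0.0):
    # Hyperbolic tangent (sigmoid) kernel
    return math.tanh(gamma * dot(x, y) + coef)


def rbf_kernel(x, y, gamma=1.0):
    # exp(-gamma * squared Euclidean distance)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Each function takes two input vectors, for example two rows of lagged monthly values, and returns a single similarity value.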

### Wavelet transform

The wavelet transform is one of the most efficient mathematical transformations in signal analysis, and it extracts information that cannot easily be obtained from raw signals. A time series is composed of two parts: approximation and detail. The approximation represents the overall trend of the signal and the detail represents minor changes in it. Studies show that detecting these two components can greatly help prediction, but prediction models may not be able to distinguish them on their own. By decomposing the signal at different levels, the wavelet transform separates the approximation and detail components and passes them to the prediction models as separate signals, thereby exposing parts of the original signal that the models could not easily detect.

… *et al.* 2005). Using the two operators of translation and scale, the mother wavelet is resized and shifted along the signal under study, which is expressed as follows (Kisi 2009):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t - b}{a}\right), \quad a, b \in R,\ a \neq 0$$

where *a* is the scale or frequency parameter, *b* is the translation or time parameter, *R* is the set of real numbers, and $\psi(t)$ is the mother wavelet. Each wavelet has three characteristics, also called the admissibility conditions: a limited number of oscillations, a fast return to zero in both the positive and negative directions, and a zero mean. In general, the wavelet transform has a continuous (*cwt*) and a discrete (*dwt*) form. The continuous wavelet transform is expressed by the following equation (Polikar 1996):

$$CWT_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt$$

The transformed signal is a function of the two variables *s* and *τ*: *τ* represents translation, *s* represents scale (the inverse of frequency), and the sign * denotes the complex conjugate. In the continuous wavelet transform, the translation and scale parameters vary continuously, so the amount of information grows. Therefore, to compute the wavelet transform on digital computers, the discrete wavelet transform is used. It provides sufficient information to analyze the signal and simplifies the implementation by assigning discrete values to the *a* and *b* parameters (Roshangar *et al.* 2015). In the analysis of time series, the discrete wavelet transform is more suitable than the continuous transform because the data it produces contain no redundant components. In practical applications, most of the temporal signals available to hydrologists are discrete. The discrete wavelet transform is defined as follows (Merry 2005):

$$\psi_{j,k}(t) = \frac{1}{\sqrt{s_0^{\,j}}}\, \psi\!\left(\frac{t - k\, \tau_0\, s_0^{\,j}}{s_0^{\,j}}\right)$$

where *j* and *k* are integers, $s_0 > 1$ is a fixed dilation step, and $\tau_0$ is the translation factor.

In discrete transformation, the initial information (the main signal) is divided into two categories of approximation and details. The approximation category contains low-frequency signals and shows the overall trend, and the details category contains high-frequency signals and expresses limited variations in data (Cannas *et al.* 2006).
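
The split into approximation and details can be illustrated with the Haar wavelet, the simplest wavelet. The sketch below is a minimal pure-Python illustration with hypothetical function names, not the toolbox routine used in the study. It assumes a signal whose length is divisible by 2 at every level.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums keep the low-frequency trend
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences keep the high-frequency variation
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: (final approximation, list of details per level)."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_inverse_step(approx, detail):
    """Rebuild the signal from one level of approximation and detail."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

For example, `haar_decompose(monthly_values, 2)` returns the level-2 approximation plus the level-1 and level-2 details, and `haar_inverse_step` shows that no information is lost at each level.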

### ARIMA models

Applying the difference operator to the ARMA (*p, q*) model has led to a new series of statistical models known as non-seasonal ARIMA models, ARIMA (*p, d, q*). Applying the seasonal difference operator with seasonal period *ω* and fitting ARMA (*p, q*) models resulted in the seasonal ARIMA models, SARIMA(*p,d,q*)(*P,D,Q*)$_{\omega}$, and the combination of seasonal and non-seasonal models is called the multiplicative ARIMA model. The simple ARIMA model for the time series $x_t$ is obtained by fitting the ARMA model to its differenced series ($u_t$). In this case, ARIMA will be as follows (Karamouz & Araghinejad 2005):

$$\phi(L)\, u_t = \theta(L)\, \varepsilon_t, \qquad u_t = (1 - L)^{d} x_t$$

where $u_t$ is the result of the *d*-th difference of the main series $x_t$, and $\phi(L)$ and $\theta(L)$ are the autoregressive and moving average polynomials of orders *p* and *q*. In general, the ARIMA model is a discrete-time linear equation of the form (Mahan *et al.* 2015):

$$\left(1 - \sum_{k=1}^{p} \phi_k L^{k}\right) (1 - L)^{d} x_t = \left(1 + \sum_{k=1}^{q} \theta_k L^{k}\right) \varepsilon_t$$

where $\varepsilon_t$ is a white noise process and *L* is the time lag operator, $L x_t = x_{t-1}$.

### The ARMAX model

The ARMAX model can be written as:

$$A(q)\, y(t) = B(q)\, u(t - nk) + C(q)\, e(t)$$

where *y(t)* is the output variable, *u(t-nk)* is the input variable, *nk* is the delay between the input and output, *e(t)* is the system disturbance, and *A(q)*, *B(q)*, and *C(q)* are polynomials with respect to the operator *q*.

Unlike other time series methods, which use only the past values of the predicted parameter, this method also uses other parameters that are correlated with the studied parameter. This is an essential advantage over other time series methods (Omidi *et al.* 2014).
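
As a rough sketch of how an ARMAX(1,1) model combines the past output, an exogenous input, and the past disturbance, the function below computes one-step-ahead predictions. It assumes the first-order form y(t) + a1·y(t-1) = b1·u(t-nk) + e(t) + c1·e(t-1). The function name and the coefficients `a1`, `b1`, and `c1` are hypothetical and would normally be estimated by the fitting software.

```python
def armax11_predict(y, u, a1, b1, c1, nk=1):
    """One-step-ahead ARMAX(1,1) predictions.

    Assumed model: y(t) + a1*y(t-1) = b1*u(t-nk) + e(t) + c1*e(t-1).
    Returns (predictions, residuals). Entries before the first
    predictable index are None (predictions) and 0.0 (residuals).
    """
    n = len(y)
    start = max(1, nk)
    predictions = [None] * n
    residuals = [0.0] * n
    for t in range(start, n):
        # The unknown current disturbance e(t) is replaced by its mean, zero
        predictions[t] = -a1 * y[t - 1] + b1 * u[t - nk] + c1 * residuals[t - 1]
        residuals[t] = y[t] - predictions[t]
    return predictions, residuals
```

Here `y` could be a monthly discharge series and `u` a correlated series such as precipitation. The residual at each step stands in for the disturbance *e(t)*.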

### Evaluation criteria

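The performance of the models was evaluated using the correlation coefficient (*r*), the root mean square error (*RMSE*), and the standard error (*SE*):

$$r = \frac{\sum_{i=1}^{n} (O_i - \bar{O})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^{2} \sum_{i=1}^{n} (P_i - \bar{P})^{2}}}$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (O_i - P_i)^{2}}$$

$$SE = \frac{RMSE}{\bar{O}}$$

In these expressions, $O_i$ is the observed value, $P_i$ is the predicted value, $\bar{O}$ is the mean of the observed values, $\bar{P}$ is the mean of the predicted values, and *n* is the number of data.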
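
The three criteria (*r*, *RMSE*, and *SE*) can be computed as in the following sketch. The function names are hypothetical. *SE* is taken here as *RMSE* divided by the mean of the observed values. This is an assumption, although it agrees with the constant RMSE-to-SE ratio that each station shows across the result tables.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(observed, predicted):
    """Correlation coefficient between observed and predicted values."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = sum((o - mo) ** 2 for o in observed)
    sp = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(so * sp)


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def standard_error(observed, predicted):
    """Assumed definition: RMSE normalized by the observed mean."""
    return rmse(observed, predicted) / mean(observed)
```

A higher *r* and a lower *RMSE* and *SE* indicate a better model. Because *SE* is dimensionless, it allows comparison between parameters measured in different units.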

## RESULTS AND DISCUSSION

To predict the three hydrological parameters of monthly precipitation, discharge, and evaporation, data from 1991 to 2015 were used. Of this period, 17 years (1991–2007) were used for training, 4 years (2008–2011) for calibration, and 4 years (2012–2015) for validation of the models. The following sections evaluate the performance of each model used in this study in predicting the hydrological parameters.
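
The chronological split described above can be sketched as follows. The record layout `(year, month, value)` and the function name are assumptions made for illustration.

```python
def split_by_year(records, train_end=2007, calib_end=2011):
    """Split time-ordered (year, month, value) records chronologically.

    Years up to train_end go to training, years up to calib_end go to
    calibration, and the remaining years go to validation.
    """
    train = [r for r in records if r[0] <= train_end]
    calib = [r for r in records if train_end < r[0] <= calib_end]
    valid = [r for r in records if r[0] > calib_end]
    return train, calib, valid
```

With 25 years of monthly data (1991–2015), this gives 204 training months, 48 calibration months, and 48 validation months. The records are never shuffled, so each subset contains only data that follow the previous subset in time.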

### Predicting by the SVM

Monthly precipitation, discharge, and evaporation were modeled in MATLAB in three steps: choosing the appropriate kernel function for model training, finding the best input pattern of time lags, and prediction. In the first step, the radial basis function (RBF), linear (lin), and polynomial (poly) kernels were used, since hydrologic studies are mainly based on the RBF (Dehghani *et al.* 2015) and, along with the RBF, the linear and polynomial kernels are the most commonly used kernel functions (Isazadeh *et al.* 2016). In the second step, the monthly data of precipitation, discharge, and evaporation with lags of up to 12 months were used as training data in the combinations listed in Table 3. The parameters were then predicted with each of the three kernels (RBF, linear, and polynomial) over the period reserved for calibration. The input patterns were assessed using *r*, *RMSE*, and *SE*, and the most suitable kernel function was identified at each station and for each input pattern. The results for the selected kernel and pattern at each station are summarized in Table 4. In the third step, the hydrological parameters were predicted using the SVM model with the input pattern and kernel function selected in the previous step. The statistical indices for the validation step are presented in Table 4, and the observed and predicted values are plotted in Figure 3.

Table 3. Input patterns of the models on the monthly scale

| Input pattern number | The model input pattern on monthly scale |
|---|---|
| 1 | Q(t) = f{Q(t-1)} |
| 2 | Q(t) = f{Q(t-1), Q(t-2)} |
| 3 | Q(t) = f{Q(t-1), … , Q(t-3)} |
| 4 | Q(t) = f{Q(t-1), … , Q(t-4)} |
| 5 | Q(t) = f{Q(t-1), … , Q(t-5)} |
| 6 | Q(t) = f{Q(t-1), … , Q(t-6)} |
| 7 | Q(t) = f{Q(t-1), … , Q(t-7)} |
| 8 | Q(t) = f{Q(t-1), … , Q(t-8)} |
| 9 | Q(t) = f{Q(t-1), … , Q(t-9)} |
| 10 | Q(t) = f{Q(t-1), … , Q(t-10)} |
| 11 | Q(t) = f{Q(t-1), … , Q(t-11)} |
| 12 | Q(t) = f{Q(t-1), … , Q(t-12)} |
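
The input patterns in Table 3 amount to building a matrix of lagged values for each target month. A minimal sketch, with a hypothetical function name, is given below. Pattern number *k* corresponds to `n_lags = k`.

```python
def make_lagged_patterns(series, n_lags):
    """Build (inputs, targets) for the pattern Q(t) = f{Q(t-1), ..., Q(t-n_lags)}.

    Each input row holds the n_lags values that precede the target,
    ordered from the most recent lag (t-1) to the oldest (t-n_lags).
    """
    if not 1 <= n_lags < len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return inputs, targets
```

For instance, pattern 12 turns each month into a row of its 12 preceding monthly values, so the first 12 months of a series serve only as inputs.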

Table 4. Selected input pattern and kernel and results of the SVM model in the calibration and validation steps

| Station type | Station name | Pattern number | Kernel | r (calibration) | RMSE (calibration) | SE (calibration) | r (validation) | RMSE (validation) | SE (validation) |
|---|---|---|---|---|---|---|---|---|---|
| Rain-gauge | Ekbatan dam | 12 | RBF | 0.617 | 35.798 | 1.012 | 0.601 | 26.878 | 0.898 |
| Rain-gauge | Maryanaj | 12 | RBF | 0.584 | 37.291 | 0.910 | 0.640 | 28.576 | 0.767 |
| Rain-gauge | Aghajan Bolaghi | 12 | RBF | 0.534 | 39.557 | 1.090 | 0.637 | 31.980 | 0.927 |
| Hydrometric | Taghsimeab | 12 | POLY | 0.773 | 0.261 | 0.698 | 0.764 | 0.215 | 0.657 |
| Hydrometric | Sulan | 12 | POLY | 0.622 | 0.291 | 1.231 | 0.787 | 0.257 | 1.286 |
| Hydrometric | Yalfan | 12 | RBF | 0.800 | 0.999 | 0.904 | 0.775 | 0.899 | 0.923 |
| Evaporation-gauge | Ekbatan dam | 12 | POLY | 0.980 | 22.053 | 0.169 | 0.991 | 16.412 | 0.116 |
| Evaporation-gauge | Ghahavand | 9 | RBF | 0.973 | 38.283 | 0.229 | 0.964 | 51.002 | 0.268 |
| Evaporation-gauge | Kushk Abad | 12 | RBF | 0.982 | 28.044 | 0.179 | 0.981 | 36.706 | 0.216 |

### Predicting by the W-SVM

The hydrological parameters were modeled with the W-SVM in MATLAB in five steps: choosing the appropriate kernel function for the model, finding the best input pattern of time lags, selecting the most suitable decomposition level, choosing the most suitable wavelet, and prediction. In the first and second steps, the kernel functions and input patterns selected in the standalone SVM section were used at each of the studied stations. In the third step, to select the appropriate decomposition level, the simplest wavelet, the Haar wavelet, was used in the wavelet transform, and the effect of decomposition levels 1 to 7 was investigated. In other words, the input pattern selected in step 2 was first divided into approximation and detail parts by the Haar wavelet at decomposition levels 1 to 7, and these parts were then given to the SVM for prediction. In the fourth step, four important wavelet groups were used in the modeling: the Haar, Daubechies, and symlet wavelets, because of their extensive use in water sciences (Cannas *et al.* 2006), and the Meyer wavelet, because it has the properties of orthogonality, biorthogonality, and all-inclusive support and is thus able to provide all the properties of wavelets for signal processing and decomposition (Rostami *et al.* 2012). It should be noted that the Daubechies (db (*n*)) and symlet (sym (*n*)) wavelet groups have orders of 2 to 45 (*n* = 2: 45), but only orders 2 to 6 of each group were used here because they have been used more widely in research so far. The decomposition level and wavelet selected in the third and fourth steps at the studied stations are presented in Table 5.

Finally, in the last step, monthly precipitation, discharge, and evaporation at the studied stations were predicted over the period reserved for validation, using the model fitted in the previous step. The statistical indices for the validation step are presented in Table 5, and the observed and predicted values are plotted in Figure 4.

Table 5. Selected wavelet and decomposition level and results of the W-SVM model in the calibration and validation steps

| Station type | Station name | Wavelet | Decomposition level | r (calibration) | RMSE (calibration) | SE (calibration) | r (validation) | RMSE (validation) | SE (validation) |
|---|---|---|---|---|---|---|---|---|---|
| Rain-gauge | Ekbatan dam | Dmey | 2 | 0.92 | 16.76 | 0.47 | 1.00 | 2.06 | 0.06 |
| Rain-gauge | Maryanaj | Dmey | 2 | 0.91 | 19.23 | 0.47 | 1.00 | 2.20 | 0.05 |
| Rain-gauge | Aghajan Bolaghi | Dmey | 2 | 0.91 | 18.98 | 0.52 | 1.00 | 2.36 | 0.07 |
| Hydrometric | Taghsimeab | Dmey | 2 | 0.97 | 0.10 | 0.27 | 1.00 | 0.01 | 0.03 |
| Hydrometric | Sulan | Dmey | 3 | 0.94 | 0.12 | 0.51 | 1.00 | 0.01 | 0.05 |
| Hydrometric | Yalfan | Dmey | 2 | 0.97 | 0.49 | 0.45 | 1.00 | 0.06 | 0.06 |
| Evaporation-gauge | Ekbatan dam | Dmey | 4 | 0.99 | 14.85 | 0.11 | 1.00 | 3.75 | 0.03 |
| Evaporation-gauge | Ghahavand | Dmey | 3 | 0.98 | 19.96 | 0.18 | 1.00 | 4.15 | 0.03 |
| Evaporation-gauge | Kushk Abad | Dmey | 2 | 0.99 | 21.94 | 0.14 | 1.00 | 1.78 | 0.01 |

The results of the calibration and validation steps indicated that the W-SVM predicted precipitation, discharge, and evaporation with high accuracy. Given the random nature of precipitation in time and space (Toufani *et al.* 2011) and the uncertainty of this parameter, such a small error and such a high correlation coefficient were unexpected and seemed unreasonable. We therefore re-examined the prediction process of this model. The re-examination clearly revealed that the cause was the model's use of future months' data in the signal analysis. In effect, the whole signal (all data in the study period) had first been decomposed with the chosen mother wavelet and decomposition level, and only afterwards had the training, calibration, and validation sections been separated. In this case the system is biased and the predictions are very close to reality. At the time of a real prediction, however, the observations of the coming years are not available, so the desired signal cannot be extracted from them. To overcome this problem, the prediction algorithm was revised so that the precipitation, discharge, or evaporation of the month to be predicted is not used at all in the signal decomposition. The results then changed completely: as shown in Table 6, using the wavelet not only failed to improve the SVM predictions but also significantly reduced their accuracy. Thus, despite the efforts made, the SVM combined with wavelet transform merely repeated the previous month's value at each stage of the prediction process to predict the discharge of the next month. This error was due to the model's inaccuracy in the training step, which is clearly visible in Figure 5.
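
The leakage described above can be avoided by decomposing, for each target month, only the values that precede it. The sketch below illustrates the idea with a Haar decomposition over a fixed window of past values. It is an assumption-laden illustration, not the authors' revised algorithm: the function names, the window length, and the choice of the last coefficient of each band as a feature are all hypothetical.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT on an even-length signal."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def causal_haar_features(series, t, window=8, levels=2):
    """Wavelet features for predicting series[t] from series[t-window:t] only.

    Nothing at index t or later enters the decomposition, so the
    features carry no information from the month being predicted.
    """
    if t < window:
        raise ValueError("not enough history before index t")
    if window % (2 ** levels):
        raise ValueError("window must be divisible by 2**levels")
    approx = list(series[t - window:t])  # strictly past values
    features = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(detail[-1])  # most recent detail coefficient
    features.append(approx[-1])      # most recent approximation coefficient
    return features
```

Changing the value at index `t` or any later index leaves the features unchanged. A decomposition of the full series before splitting does not have this property.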

Table 6. Results of the W-SVM model after revising the prediction algorithm

| Station type | Station name | Wavelet | Decomposition level | r | RMSE | SE |
|---|---|---|---|---|---|---|
| Rain-gauge | Ekbatan dam | Dmey | 2 | 0.19 | 42.78 | 1.43 |
| Rain-gauge | Maryanaj | Dmey | 2 | 0.23 | 45.44 | 1.22 |
| Rain-gauge | Aghajan Bolaghi | Dmey | 2 | 0.22 | 51.58 | 1.50 |
| Hydrometric | Taghsimeab | Dmey | 2 | 0.69 | 0.26 | 0.79 |
| Hydrometric | Sulan | Dmey | 3 | 0.58 | 0.35 | 1.76 |
| Hydrometric | Yalfan | Dmey | 2 | 0.66 | 1.18 | 1.21 |
| Evaporation-gauge | Ekbatan dam | Dmey | 4 | 0.85 | 67.37 | 0.48 |
| Evaporation-gauge | Ghahavand | Dmey | 3 | 0.84 | 63.62 | 0.51 |
| Evaporation-gauge | Kushk Abad | Dmey | 2 | 0.85 | 66.58 | 0.48 |

### Predicting by the ARMAX model

The hydrological parameters were modeled with the ARMAX model using the NumXL add-in for Excel, in three steps: finding the best input pattern of time lags, finding the best model structure, and prediction. The first step was similar to the input-pattern selection step of the SVM model. In this step, the parameters of the ARMAX model (*p, q*) were both fixed at 1. The ARMAX(1,1) model was then trained with each of the input patterns listed in Table 3, and after monthly precipitation, discharge, and evaporation had been predicted over the period reserved for calibration, the input patterns were evaluated using *r*, *RMSE*, and *SE*. Because the ARMAX(1,1) model served as the default for all patterns in the first step, other values of the ARMAX parameters were evaluated for the selected pattern in the second step. However, because the program itself calibrates the model's parameters after each structure is selected, no significant difference was found in the prediction results. The input pattern and structure selected in the first and second steps at the studied stations are presented in Table 7. Finally, monthly values of precipitation, discharge, and evaporation at the studied stations were predicted over the period reserved for validation, using the model fitted in the previous step. The statistical indices for the validation step are given in Table 7, and the observed and predicted values are plotted in Figure 6.

Table 7. Selected input pattern and structure and results of the ARMAX model in the calibration and validation steps

| Station type | Station name | Pattern number | ARMAX(p,q) | r (calibration) | RMSE (calibration) | SE (calibration) | r (validation) | RMSE (validation) | SE (validation) |
|---|---|---|---|---|---|---|---|---|---|
| Rain-gauge | Ekbatan dam | 12 | ARMAX(1,1) | 0.61 | 35.95 | 1.02 | 0.59 | 27.24 | 0.91 |
| Rain-gauge | Maryanaj | 12 | ARMAX(1,1) | 0.50 | 39.69 | 0.97 | 0.56 | 30.86 | 0.83 |
| Rain-gauge | Aghajan Bolaghi | 12 | ARMAX(1,1) | 0.50 | 40.33 | 1.11 | 0.55 | 28.57 | 1.00 |
| Hydrometric | Taghsimeab | 12 | ARMAX(1,1) | 0.84 | 0.25 | 0.66 | 0.70 | 0.26 | 0.79 |
| Hydrometric | Sulan | 12 | ARMAX(1,1) | 0.85 | 0.91 | 0.83 | 0.69 | 1.04 | 1.07 |
| Hydrometric | Yalfan | 12 | ARMAX(1,1) | 0.74 | 0.26 | 1.09 | 0.52 | 0.34 | 1.71 |
| Evaporation-gauge | Ekbatan dam | 12 | ARMAX(1,1) | 0.98 | 24.82 | 0.19 | 0.99 | 21.82 | 0.15 |
| Evaporation-gauge | Ghahavand | 9 | ARMAX(1,1) | 0.96 | 48.57 | 0.29 | 0.96 | 52.75 | 0.28 |
| Evaporation-gauge | Kushk Abad | 12 | ARMAX(1,1) | 0.98 | 29.53 | 0.19 | 0.97 | 37.61 | 0.22 |

### Predicting by the ARIMA model

The ARIMA stochastic model was run in Minitab 17. In general, building a time series model is a multi-stage process that includes examining the seasonality of the data, testing stationarity in the variance, testing stationarity in the mean, fitting the pattern, verifying the pattern, and prediction. The steps followed to build the ARIMA model for predicting precipitation, discharge, and evaporation at the studied stations are described below. In the first step, to investigate the seasonal nature of the data, autocorrelation (Figure 7) and partial autocorrelation (Figure 8) charts of the hydrological parameters were drawn for the studied stations.
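
The autocorrelation chart used to inspect seasonality plots the sample autocorrelation at each lag. A minimal sketch of that computation is given below. The function name is hypothetical, and the common biased estimator, which divides by the lag-0 sum of squares, is assumed.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    m = sum(series) / n
    c0 = sum((x - m) ** 2 for x in series)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    acf = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - m) * (series[t + k] - m) for t in range(n - k))
        acf.append(ck / c0)
    return acf
```

For monthly data with an annual cycle, peaks near lags 12, 24, and so on suggest that a seasonal model with period 12 is appropriate.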

Once the seasonal behavior of the data had been identified, the ARIMA or SARIMA model was used accordingly. In the second step, the Box-Cox transform was used to check whether the time series of these parameters were stationary in the variance, and any series that was not stationary was made stationary in the variance with this transformation. In the third step, the orders of the ARIMA model were first estimated from the autocorrelation and partial autocorrelation charts and then refined by trial and error to reach the highest correlation coefficient and the lowest root mean square error and standard error in the calibration step.
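
The two stationarity treatments involved here, the Box-Cox transform for the variance and differencing for the mean, can be sketched as follows. The function names and the `lam` parameter are hypothetical, and in practice the software estimates the Box-Cox exponent. The Box-Cox transform requires strictly positive data, so a series with zero monthly values would first need a positive offset.

```python
import math


def box_cox(series, lam):
    """Box-Cox transform of strictly positive data with exponent lam."""
    if any(x <= 0 for x in series):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(x) for x in series]
    return [(x ** lam - 1.0) / lam for x in series]


def difference(series, lag=1):
    """Difference at the given lag; lag=12 gives seasonal differencing of monthly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]
```

Applying `difference(values, lag=12)` once corresponds to *D* = 1 in a SARIMA(*p,d,q*)(*P,D,Q*) model with period 12, and `difference(values)` corresponds to *d* = 1.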

In these expressions, *M* is the number of model parameters, $\sigma_{\varepsilon}^{2}$ is the variance of the model residuals, and *n* is the number of data points.
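The expressions themselves are not reproduced in this section. For reference, a commonly used form of the Akaike information criterion (AIC) that matches the quantities defined above is given below. This is an assumption about the form used, with $\sigma_{\varepsilon}^{2}$ denoting the residual variance:

$$\mathrm{AIC}(M) = n \ln\left(\sigma_{\varepsilon}^{2}\right) + 2M$$

Under this form, the first term rewards a close fit and the second penalizes additional parameters, so among candidate models the one with the smallest AIC is preferred. The value can be negative when the residual variance is less than one.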

The results are presented in Table 8. In the fourth step, to check the adequacy of the selected model, autocorrelation and partial autocorrelation charts were drawn for the model residuals. These confirmed the independence and randomness of the residuals and, in other words, the correctness of the model selected in the previous step. In the last step, monthly values of precipitation, discharge, and evaporation at the studied stations were predicted with the fitted model over the period set aside for validation. The statistical indices for the validation step are presented in Table 8, and the observed values and the values predicted by the ARIMA model are plotted in Figure 9.
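The validation indices can be sketched as follows. The definition of *SE* as *RMSE* divided by the mean of the observed values is an assumption here (it is consistent with the ratios of the tabulated values but is not restated in this section), and the observed and predicted values are made-up numbers for illustration only.

```python
import math

def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def standard_error(obs, pred):
    """Assumed definition: RMSE normalized by the mean of the observations."""
    return rmse(obs, pred) / (sum(obs) / len(obs))

# Made-up monthly values, not station data.
observed = [12.0, 30.0, 45.0, 20.0, 5.0, 8.0]
predicted = [15.0, 27.0, 40.0, 24.0, 6.0, 8.0]

print(f"r    = {pearson_r(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"SE   = {standard_error(observed, predicted):.3f}")
```

Because *SE* is dimensionless under this definition, it allows stations and parameters with different units and magnitudes to be compared, which *RMSE* alone does not.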

| Station type | Station name | SARIMA (p,d,q)(P,D,Q)_{12} | AIC | r (calibration) | RMSE (calibration) | SE (calibration) | r (validation) | RMSE (validation) | SE (validation) |
|---|---|---|---|---|---|---|---|---|---|
| Rain-gauge | Ekbatan dam | SARIMA (1,0,1)(3,0,4)_{12} | −856.95 | 0.70 | 24.94 | 0.83 | 0.66 | 25.50 | 0.85 |
| Rain-gauge | Maryanaj | SARIMA (1,0,0)(3,1,3)_{12} | −777.36 | 0.66 | 31.35 | 0.84 | 0.70 | 27.29 | 0.73 |
| Rain-gauge | Aghajan Bolaghi | SARIMA (1,0,1)(3,1,3)_{12} | −781.53 | 0.63 | 29.94 | 0.93 | 0.65 | 31.91 | 0.93 |
| Hydrometric | Taghsimeab | SARIMA (2,1,3)(1,0,3)_{12} | −1,303.69 | 0.87 | 0.32 | 0.60 | 0.78 | 0.23 | 0.70 |
| Hydrometric | Sulan | SARIMA (1,1,2)(1,1,3)_{12} | −1,152.08 | 0.85 | 0.27 | 0.78 | 0.61 | 0.33 | 1.63 |
| Hydrometric | Yalfan | SARIMA (1,1,2)(1,0,1)_{12} | −1,238.18 | 0.89 | 0.79 | 0.62 | 0.76 | 0.98 | 1.00 |
| Evaporation-gauge | Ekbatan dam | SARIMA (3,0,2)(4,1,0)_{12} | −990.49 | 0.99 | 36.66 | 0.27 | 0.99 | 20.68 | 0.15 |
| Evaporation-gauge | Ghahavand | SARIMA (1,0,0)(0,1,4)_{12} | −985.05 | 0.95 | 59.28 | 0.37 | 0.95 | 65.36 | 0.34 |
| Evaporation-gauge | Kushk Abad | SARIMA (0,0,3)(0,1,4)_{12} | −955.69 | 0.97 | 64.40 | 0.46 | 0.97 | 44.31 | 0.26 |

To select the best model for predicting each of the hydrological parameters (precipitation, discharge, and evaporation), the monthly results for each parameter at each station are summarized in Tables 9–11. Based on the statistical indices *r*, *RMSE*, and *SE* in Table 9, the ARIMA model predicted monthly precipitation more accurately than the other models, and according to this table it was more accurate at all stations. A comparison of the statistical indices in Table 10 shows that the SVM model was more accurate in predicting the monthly discharge at all hydrometric stations. Likewise, by the evaluation criteria in Table 11, the SVM model predicted monthly evaporation more accurately than the other models. This model is therefore recommended for predicting monthly discharge and monthly evaporation.
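One simple way to turn station-level indices into an overall ranking is to average each index across stations. The sketch below does this for the validation-step *r* and *SE* values of the monthly precipitation comparison (Table 9). Averaging across stations is an assumption made here for illustration and is not necessarily the procedure used to produce the ranking reported in this study.

```python
# Validation-step (r, SE) pairs for monthly precipitation, taken from Table 9,
# in the station order Ekbatan dam, Maryanaj, Aghajan Bolaghi.
precip = {
    "SVM":   [(0.601, 0.898), (0.640, 0.767), (0.637, 0.927)],
    "W-SVM": [(0.194, 1.429), (0.225, 1.220), (0.217, 1.495)],
    "ARMAX": [(0.586, 0.910), (0.560, 0.829), (0.554, 0.999)],
    "ARIMA": [(0.664, 0.852), (0.701, 0.733), (0.648, 0.925)],
}

def mean(values):
    return sum(values) / len(values)

# Station-averaged indices per model (illustrative aggregation).
mean_r = {m: mean([r for r, _ in rows]) for m, rows in precip.items()}
mean_se = {m: mean([se for _, se in rows]) for m, rows in precip.items()}

# Higher r is better; lower SE is better.
rank_by_r = sorted(precip, key=lambda m: mean_r[m], reverse=True)
rank_by_se = sorted(precip, key=lambda m: mean_se[m])

print("by mean r: ", rank_by_r)
print("by mean SE:", rank_by_se)
```

With these values, both orderings give ARIMA, SVM, ARMAX, W-SVM, which agrees with the precipitation ranking stated in this study.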

| Model | Ekbatan dam r | Ekbatan dam RMSE | Ekbatan dam SE | Maryanaj r | Maryanaj RMSE | Maryanaj SE | Aghajan Bolaghi r | Aghajan Bolaghi RMSE | Aghajan Bolaghi SE |
|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.601 | 26.878 | 0.898 | 0.640 | 28.576 | 0.767 | 0.637 | 31.980 | 0.927 |
| W-SVM | 0.194 | 42.775 | 1.429 | 0.225 | 45.435 | 1.220 | 0.217 | 51.578 | 1.495 |
| ARMAX | 0.586 | 27.243 | 0.910 | 0.560 | 30.859 | 0.829 | 0.554 | 28.572 | 0.999 |
| ARIMA | 0.664 | 25.495 | 0.852 | 0.701 | 27.289 | 0.733 | 0.648 | 31.910 | 0.925 |

| Model | Taghsimeab r | Taghsimeab RMSE | Taghsimeab SE | Sulan r | Sulan RMSE | Sulan SE | Yalfan r | Yalfan RMSE | Yalfan SE |
|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.764 | 0.215 | 0.657 | 0.787 | 0.257 | 1.286 | 0.775 | 0.899 | 0.923 |
| W-SVM | 0.693 | 0.258 | 0.790 | 0.581 | 0.352 | 1.762 | 0.656 | 1.177 | 1.209 |
| ARMAX | 0.704 | 0.258 | 0.790 | 0.517 | 0.341 | 1.706 | 0.694 | 0.341 | 1.706 |
| ARIMA | 0.780 | 0.229 | 0.700 | 0.614 | 0.326 | 1.628 | 0.755 | 0.975 | 1.002 |

| Model | Ekbatan dam r | Ekbatan dam RMSE | Ekbatan dam SE | Ghahavand r | Ghahavand RMSE | Ghahavand SE | Kushk Abad r | Kushk Abad RMSE | Kushk Abad SE |
|---|---|---|---|---|---|---|---|---|---|
| SVM | 0.991 | 16.412 | 0.116 | 0.964 | 51.002 | 0.268 | 0.981 | 36.706 | 0.216 |
| W-SVM | 0.851 | 67.366 | 0.476 | 0.843 | 63.618 | 0.506 | 0.850 | 66.582 | 0.482 |
| ARMAX | 0.986 | 21.824 | 0.154 | 0.964 | 52.750 | 0.277 | 0.972 | 37.608 | 0.221 |
| ARIMA | 0.985 | 20.682 | 0.146 | 0.953 | 65.360 | 0.344 | 0.971 | 44.310 | 0.260 |

## CONCLUSION

This research investigated the accuracy and efficiency of the ARIMA, SVM, ARMAX, and W-SVM models in predicting three parameters: monthly precipitation, discharge, and evaporation. The following results were obtained:

For monthly precipitation prediction, the ARIMA, SVM, ARMAX, and W-SVM models ranked from first to fourth, respectively; for monthly discharge and evaporation prediction, the SVM, ARIMA, ARMAX, and W-SVM models ranked from first to fourth, respectively.

In general, all the prediction models predicted evaporation far more accurately than discharge and precipitation, which can be explained by the uniformity of the evaporation process at all studied stations. The results also indicate that discharge was predicted more accurately than precipitation.

Although previous research has reported acceptable results when the wavelet transform is used as a data pre-processor for predicting precipitation, discharge, and evaporation, in this study the transform did not improve the real-world modeling of these parameters by SVM. The reason is explained in detail in the Results and discussion section.

This appears to have been a problem in many of the studies on this topic, in which the signal was decomposed over the entire statistical period at once, including both the calibration and validation periods. The negligible prediction errors and extremely high correlations between observed and predicted values reported in those studies support this claim.
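The effect of decomposing the whole record at once can be illustrated with a one-level Haar transform written out by hand. This is a simplified stand-in for wavelet pre-processing, not the decomposition used in this study, and the series and split point are made up. When the full record is transformed, the coefficient next to the calibration/validation boundary is computed partly from a validation-period value.

```python
import math

def haar_level1(x):
    """One-level Haar transform of an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    detail = [(a - b) / math.sqrt(2) for a, b in pairs]
    return approx, detail

series = [4.0, 6.0, 5.0, 9.0, 7.0, 3.0, 8.0, 2.0]  # made-up monthly values
split = 5  # first 5 values: calibration; remaining 3: validation

# Decomposing the whole record: the third pair mixes series[4] (calibration)
# with series[5] (validation).
approx_full, _ = haar_level1(series)

# Same decomposition after changing only one validation-period value.
altered = series[:split] + [30.0] + series[split + 1:]
approx_altered, _ = haar_level1(altered)

# The coefficient covering the last calibration month changes, so a model
# calibrated on it would have been given validation-period information.
leaked = approx_full[2] != approx_altered[2]
print("boundary coefficient depends on validation data:", leaked)
```

Coefficients built only from calibration-period values are unaffected by the change, which is why transforming the calibration period separately avoids this dependence.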

Comparing the results, although the ARIMA model predicts precipitation more accurately than the SVM model, the SVM model has fewer adjustable parameters than the ARIMA model. It can therefore produce predictions more easily and in less time, and for this reason it is preferred to the other methods.

## CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.