Abstract
Because of the negative impacts of drought, accurate forecasting of drought indices is important. This study focused on short- to long-term Standardized Precipitation Index (SPI) forecasting at sites with different climates using newly integrated hybrid pre-post-processing techniques. Four sites in northwest Iran were selected, and the SPI series with time scales of 3, 9, and 24 months were forecasted for the period 1978–2017. To improve modeling efficiency, wavelet transform (WT) and ensemble empirical mode decomposition (EEMD) were used as pre-processing methods. The temporal features of the SPI series were first decomposed via WT, and the resulting sub-series were then further broken down into intrinsic mode functions using EEMD. Simple linear averaging and nonlinear neural ensemble post-processing methods were also applied to combine the outputs of the hybrid models. The results showed that data pre-processing enhanced the models' capability by up to 40%, and the integrated pre-post-processing models improved the models' efficiency by approximately 50%. The range of the root mean square error values decreased from 0.337–1.03 (raw data) to 0.195–0.714 (decomposed data). The results demonstrated the capability of the applied methods in modeling the SPI series. Data pre-processing was more effective than data post-processing in increasing the models' accuracy.
HIGHLIGHTS
The advantages of pre-processing, meta-model, and post-processing techniques were merged for short- to long-term drought forecasting.
To obtain features with higher stationarity, the sub-series were further broken down via the EEMD method.
Simple linear averaging (SLAM) and nonlinear neural ensemble (NNEM) models were used for data post-processing.
Integrated hybrid techniques outperformed the single meta-model approaches.
INTRODUCTION
Drought is considered the most complex phenomenon among all extreme climate events (Wilhite 2000). According to Ghulam et al. (2007), drought is a chronic, potential natural disaster characterized by a prolonged, abnormal water shortage. However, determining drought onset, duration, and recovery is often difficult because of differences in hydro-meteorological variables, socioeconomic issues, and the complex nature of water demands in different areas of the world (Hayes et al. 1999). Drought occurrences cause serious problems for different parts of society, such as agriculture, energy generation, recreation, and ecosystems (Smith & Katz 2013). The conventional method for drought monitoring is to use a drought index. Among the various drought indices, the Standardized Precipitation Index (SPI) is one of the most commonly used. One benefit of the SPI is that it enables droughts to be described on multiple time scales (Cacciamani et al. 2007). The index is easily interpreted and is based on a simple moving-average process (Tsakiris & Vangelis 2004). In addition, the SPI characteristics are consistent from site to site and its calculation is based only on precipitation data (Komasi et al. 2018).
According to Morid et al. (2007), continuous drought monitoring provides the information necessary for drought preparedness. The accuracy of this information relies on the efficiency of the applied models. So far, numerous models have been developed for drought assessment, such as time series, regression, artificial intelligence, and physical models. Linear ARIMA and direct multi-step neural network (DMSNN) models were applied by Mishra et al. (2007) for drought detection. Madadgar & Moradkhani (2013) used a Bayesian-based model for predicting hydrologic droughts in the upper Colorado River Basin. Mehr et al. (2014) applied a wavelet-linear genetic programming (WLGP) approach to forecast long lead-time drought in the state of Texas. Ma et al. (2016) used the variable infiltration capacity (VIC) hydrologic model-based Palmer drought scheme and compared the efficiency of multiple Palmer indices (PIs) for drought detection. Surendran et al. (2019) studied droughts in semi-arid and arid parts of India with the DrinC model by considering various drought indices and bivariate drought frequency analysis. These models led to promising results in drought forecasting; however, since accurate drought forecasting can be an effective tool for mitigating some of the more adverse consequences of drought, it is necessary to use other methods with higher efficiency, such as data-driven methods. Data-driven models are suitable forecasting tools because of their rapid development times and their minimal information requirements compared with physically based models.
In recent years, meta-model techniques such as artificial neural networks (ANNs), neuro-fuzzy models (NF), genetic programming (GP), gene expression programming (GEP), multivariate adaptive regression splines (MARS), support vector machine (SVM), and Gaussian process regression (GPR) have been applied to investigate complex hydraulic and hydrologic phenomena (Hipni et al. 2013), such as real-time hydrologic forecasting (Yu et al. 2004; Yu & Liong 2007), prediction of groundwater quality (Yang et al. 2017), atmospheric temperature modeling (Azamathulla et al. 2018), and forecasting of monthly and seasonal streamflow (Zhu et al. 2018). In addition, hybrid models based on signal decomposition can be effective in increasing the efficiency of time series prediction methods (Pachori et al. 2015). Wavelet analysis is one of the common methods for signal decomposition. Wavelet transform (WT), as a signal pre-processing method, provides useful information in the temporal and frequency domains for non-stationary signals. Besides the WT, the empirical mode decomposition (EMD) method has recently been used for signal decomposition. This method is suitable for nonlinear and non-stationary time series (Huang et al. 1998). Unlike wavelet decomposition, EMD extracts the oscillatory mode components of the data without determining the basis functions or the level of decomposition a priori (Labate et al. 2013).
According to Partalas et al. (2008), Cloke & Pappenberger (2009), and Nourani et al. (2018), no single model is superior to the others in all cases, and the performance of different models may vary according to the conditions of each parameter of interest. Therefore, combining the outputs of different models through an ensemble method may lead to more accurate results. Integrating different models using an ensemble model as a post-processing method represents different aspects of the underlying patterns more accurately (Zhang 2003). Chen et al. (2012) introduced the random forest (RF) statistical method for predicting drought events together with an appropriate estimate of uncertainty measures. They showed that the confidence intervals derived from the RF model generally had good coverage. Park & Kim (2019) used satellite imagery and topography data for severe drought prediction based on the RF model. Li et al. (2020) used penalized linear regression (PLR) and ensemble methods for Standardized Precipitation Evapotranspiration Index (SPEI) forecasting in the northeast of China. The results showed that both methods provided desirable prediction results.
In this study, three steps were considered for Standardized Precipitation Index (SPI) forecasting at four sites with different climates in the northwest of Iran. In the first step, the capability of several meta-models (i.e., SVM, GPR, ANN, and GEP) in drought forecasting was assessed. The SPI can be calculated at different time scales (e.g., 1, 3, 6, …). Different time scales reflect the impact of drought on the availability of different types of water resources. SPIs with shorter time scales can provide early warning of drought and help assess drought severity. In this study, the capability of the applied methods was assessed for short- to long-term drought modeling. Therefore, SPIs with time scales of 3, 9, and 24 months (SPI 3, SPI 9, and SPI 24) were used for short-term, mid-term, and long-term analyses, respectively. The SPI 3 reflects short- and medium-term moisture conditions and provides a seasonal estimation of precipitation. The SPI 9 provides an indication of inter-seasonal precipitation patterns over a medium time scale. This time period begins to bridge short-term seasonal droughts and longer-term droughts that may become hydrological, or multi-year, in nature. The SPI 24 is usually tied to streamflows, reservoir levels, and even groundwater levels at longer time scales.
In the second step, the impact of data pre-processing on the models' efficiency was investigated. The temporal features of the time series were decomposed via discrete wavelet transform (DWT) and further broken down via ensemble empirical mode decomposition (EEMD) to obtain features with higher stationarity. Decomposing the time series into various periodicity scales with WT and then further decomposing the sub-series using EEMD can lead to more stationary sub-series. The energy of the sub-series was then calculated, and the sub-series with higher energy were used as inputs to the meta-model approaches to produce a multi-scale model for forecasting the SPI time series.
In the third step, the impacts of linear and nonlinear ensemble methods as post-processing approaches on the models' efficiency were assessed. To this end, a simple linear averaging model (SLAM) was used as the linear ensemble method and a nonlinear neural ensemble model (NNEM) was used as the nonlinear ensemble method. Among the SVM, GPR, ANN, and GEP models, the ANN was used for nonlinear ensembling. In summary, the study focuses on improving the forecasting efficiency of short- to long-term SPIs in regions with different climates by ensembling the results of the meta-models with linear and nonlinear post-processing approaches and by further decomposing the time series via EEMD.
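As a rough illustration of the linear post-processing step, the sketch below averages the forecasts of several models at each time step. It assumes that SLAM is an unweighted arithmetic mean of the model outputs; the function name `simple_linear_average` and the sample forecast values are hypothetical and are not taken from the study. The nonlinear NNEM variant would instead pass the same per-model forecasts to a trained ANN.

```python
def simple_linear_average(forecasts):
    """Average several equally long forecast series element by element."""
    if not forecasts:
        raise ValueError("at least one forecast series is required")
    length = len(forecasts[0])
    if any(len(series) != length for series in forecasts):
        raise ValueError("all forecast series must have the same length")
    n_models = len(forecasts)
    # zip(*forecasts) groups the values that the models predict for the same month.
    return [sum(step) / n_models for step in zip(*forecasts)]


# Hypothetical SPI forecasts from four models over three months.
model_outputs = [
    [-0.50, 0.10, 1.20],
    [-0.70, 0.30, 1.00],
    [-0.60, 0.20, 1.10],
    [-0.60, 0.20, 1.10],
]
ensemble = simple_linear_average(model_outputs)
```

Each ensemble value is simply the mean of the four model forecasts for that month, so no additional training is needed for this variant.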
MATERIALS AND METHODS
Study area
In the current study, the northwest of Iran, with an area of approximately 100,500 km², was selected for predicting the SPI time series. The mean annual precipitation in the area varies between 198 and 845 mm. The greatest precipitation depth is observed over the mountainous areas, particularly the eastern region, which receives more rainfall because of the Caspian Sea effect. The rain gauge stations available in the study area have different statistical periods and some of them have missing data. For the modeling process, stations with at least 30–40 years of data were selected. For the cities of Makoo, Urmia, and Ardabil, data from only one synoptic station were used, while for Tabriz the average values of three synoptic stations with the same statistical period were used. These stations are among the official sites maintained by the Iranian Meteorological Organization, and all necessary data can be accessed via http://www.weather.ir. The selected stations mostly have 40-year records and represent a very good spatial distribution over the selected regions. After the precipitation data for the period 1978–2017 were obtained from the Meteorological Organization of Iran, they were checked for accuracy and homogeneity, and their homogeneity was confirmed. The SPI can be analyzed at different temporal scales (e.g., 1, 3, 6, …) according to the user's need to monitor different types of drought, including meteorological, agricultural, and hydrological drought. In this study, drought scales ranging from short to long term were used for SPI modeling: SPIs with time scales of 3, 9, and 24 months were used for short-term, mid-term, and long-term analyses, respectively. To compare the performance of the applied pre-post-processing models, the data were divided into three sets: training, validation, and testing.
The first 70% of the data was used for training the models and the last 30% was used for validating and testing them (15% for validation and 15% for testing). The training set is used to fit the model on the basis of a minimization criterion, and the validation set is used as a stopping criterion for training to avoid overfitting. The testing set is used to evaluate the generated model and assess its generalization capability (Kitsikoudis et al. 2015). Figure 1 shows the climatic classification of the selected stations according to the Köppen-Geiger criteria, together with their monthly average precipitation.
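The chronological 70/15/15 split described above can be sketched as follows. The function name `chronological_split` is hypothetical, and the series length of 480 months is an assumption based on a complete monthly record for 1978–2017; the actual number of usable samples depends on the SPI time scale and the lagged inputs.

```python
def chronological_split(series, train_frac=0.70, val_frac=0.15):
    """Split a time-ordered series into training, validation, and testing parts.

    The order of the samples is preserved, so the validation and testing sets
    always follow the training set in time.
    """
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return series[:train_end], series[train_end:val_end], series[val_end:]


# Assumed record length: 40 years x 12 months.
months = list(range(480))
train, val, test = chronological_split(months)
```

Keeping the split chronological, rather than shuffling the samples, means that the testing set contains only months that come after the training and validation data.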
The standardized precipitation index
The impacts of a water deficit are a complex function of water source and water use. The time scale over which precipitation deficits accumulate is extremely important and separates different types of drought. As described by McKee et al. (1993), some of the practical issues that are important in any analysis of drought are time scale, probability (the expected frequency of an event), and precipitation deficit. To address these issues, they developed the SPI, in which the standardization is based on probability. The SPI quantifies observed precipitation as a standardized departure from a selected probability distribution function that models the raw precipitation data. The raw precipitation data are typically fitted to a gamma or a Pearson type III distribution and then transformed to a normal distribution. The SPI values can be interpreted as the number of standard deviations by which the observed anomaly deviates from the long-term mean.
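The transformation described above can be sketched in a few lines. The version below is a simplified stand-in and not the procedure used in the study: it accumulates precipitation over a k-month window and converts each total to a standard normal quantile, but it replaces the fitted gamma or Pearson type III distribution with an empirical probability (Gringorten plotting position) and pools all calendar months together. The function names and the sample precipitation values are hypothetical.

```python
from statistics import NormalDist


def rolling_totals(precip, k):
    """Sum precipitation over each window of k consecutive months."""
    return [sum(precip[i - k + 1:i + 1]) for i in range(k - 1, len(precip))]


def spi_empirical(precip, k):
    """Approximate a k-month SPI from empirical non-exceedance probabilities."""
    totals = rolling_totals(precip, k)
    n = len(totals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    spi = []
    for total in totals:
        # Rank of this total among all totals (ties take the highest rank).
        rank = sum(1 for other in totals if other <= total)
        # Gringorten plotting position keeps the probability strictly in (0, 1).
        prob = (rank - 0.44) / (n + 0.12)
        spi.append(std_normal.inv_cdf(prob))
    return spi


monthly_precip = [10, 20, 30, 40, 50, 60]  # hypothetical values in mm
spi3 = spi_empirical(monthly_precip, 3)
```

Totals below the median map to negative values and totals above it to positive values, which mirrors the dry and wet interpretation of the SPI.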
The SPI has statistical consistency and gives a representation of abnormal wetness and dryness. It was designed to quantify the precipitation deficit for multiple time scales, and it is capable of showing both short- and long-term drought effects across variable time scales of precipitation anomalies. The use of different time scales allows the effects of a precipitation deficit on different water resources components (groundwater, reservoir storage, soil moisture, streamflow) to be investigated (Mirabbasi et al. 2013). For example, soil moisture conditions respond to precipitation anomalies on a relatively short time scale, whereas groundwater, streamflow, and reservoir storage reflect longer-term precipitation anomalies. The SPI values can be positive or negative: positive values indicate wet conditions and negative values indicate dry conditions (for more details, see Tsakiris & Vangelis (2004)). The descriptive classes of the SPI are listed in Table 1.
SPI values | Class | SPI values | Class |
---|---|---|---|
> 2 | Extremely wet | −1 to −1.49 | Moderately dry |
1.5–1.99 | Very wet | −1.5 to −1.99 | Very dry |
1.0–1.49 | Moderately wet | < −2 | Extremely dry |
−0.99 to 0.99 | Near normal | | |
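The classes in Table 1 can be applied programmatically with a simple threshold lookup. The sketch below is illustrative only; the function name `classify_spi` is hypothetical. Because the table leaves small gaps between classes (for example between 1.49 and 1.5) and does not assign the exact values 2 and −2, the code assumes that each class starts at its lower bound and that values of exactly 2 and −2 belong to the extreme classes.

```python
def classify_spi(value):
    """Map an SPI value to its descriptive class from Table 1."""
    if value >= 2.0:       # assumption: exactly 2 counts as extremely wet
        return "Extremely wet"
    if value >= 1.5:
        return "Very wet"
    if value >= 1.0:
        return "Moderately wet"
    if value > -1.0:
        return "Near normal"
    if value > -1.5:
        return "Moderately dry"
    if value > -2.0:
        return "Very dry"
    return "Extremely dry"  # assumption: exactly -2 counts as extremely dry


labels = [classify_spi(v) for v in (0.3, -1.2, 1.7, -2.4)]
```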
Pre-processing approaches
One of the most popular approaches in time series processing is the WT (Farajzadeh & Alizadeh 2017). The WT uses a flexible window function (the mother wavelet) in signal processing, which can be changed over time according to the signal shape and compactness (Mehr et al. 2013). After applying the WT, the signal is decomposed into two components: an approximation (large-scale, low-frequency) component and a detail (small-scale, high-frequency) component. The other approach for time series processing is EMD. By applying the EMD method, each signal can be decomposed into a number of intrinsic mode functions (IMFs), which makes the method suitable for processing nonlinear and non-stationary signals. One of the advantages of this method is the ability to determine the instantaneous frequency of the signal. At each step of the decomposition, the highest-frequency component is separated first, and the process continues until only the component with the lowest frequency remains. EEMD was developed on the basis of EMD. Using a noise-assisted data analysis method, EEMD resolves the mode mixing issue of EMD (Wu & Huang 2009). The EEMD algorithm can be described as follows: (1) for a given signal $x(t)$, random white noise is added to the signal; (2) the noise-added signal is decomposed using EMD to obtain the IMF series; (3) steps 1 and 2 are repeated, each time with a new white-noise realization, until the prescribed number of trials is reached; (4) the ensemble IMFs $I_j(t)$ are obtained by averaging the corresponding IMFs over all trials; and (5) the original signal is formed as $x(t) = \sum_{j=1}^{n} I_j(t)$. To select the most effective IMFs for the modeling process, their energy values can be calculated and the IMFs with higher energy can be used as inputs (see Lei et al. (2009) for more details).
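The ensemble-averaging steps and the energy-based selection can be sketched as follows. This is a minimal illustration under several assumptions, not the implementation used in the study: the EMD routine itself is not implemented and is passed in as a caller-supplied function `emd`, the noise level `noise_std` is an absolute standard deviation chosen arbitrarily, energy is taken as the sum of squared samples, and `toy_emd` is only a stand-in decomposer that makes the example runnable. All names are hypothetical.

```python
import random


def eemd(signal, emd, trials=50, noise_std=0.1, seed=0):
    """Average the modes returned by `emd` over noisy copies of `signal`."""
    if trials < 1:
        raise ValueError("trials must be at least 1")
    rng = random.Random(seed)
    runs = []
    for _ in range(trials):
        # Step 1: add a fresh white-noise realization to the signal.
        noisy = [x + rng.gauss(0.0, noise_std) for x in signal]
        # Step 2: decompose the noisy copy into modes.
        runs.append(emd(noisy))
    # Trials may return different numbers of modes; keep those common to all.
    n_modes = min(len(run) for run in runs)
    # Step 4: average the j-th mode over all trials, sample by sample.
    return [
        [sum(run[j][t] for run in runs) / trials for t in range(len(signal))]
        for j in range(n_modes)
    ]


def energy(series):
    """Energy of a sub-series, taken here as the sum of squared samples."""
    return sum(v * v for v in series)


def top_energy_indices(components, keep):
    """Indices of the `keep` components with the highest energy."""
    ranked = sorted(range(len(components)),
                    key=lambda j: energy(components[j]), reverse=True)
    return sorted(ranked[:keep])


def toy_emd(samples):
    """Stand-in for a real EMD routine: zero-mean part plus constant mean."""
    mean = sum(samples) / len(samples)
    return [[v - mean for v in samples], [mean] * len(samples)]


signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]  # hypothetical values
modes = eemd(signal, toy_emd, trials=20, noise_std=0.05)
selected = top_energy_indices(modes, keep=1)
```

In practice `toy_emd` would be replaced by a real EMD routine, and the number of retained components would be chosen from the energy ranking of the resulting IMFs.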
Artificial intelligence techniques
Feed forward neural network (FFNN)
The aim of the ANN as a meta-model approach is to establish a nonlinear relationship between the input and output data series (Najafi et al. 2018). Among the ANN algorithms, the FFNN has been widely used in water resources engineering. The FFNN is the most common ANN algorithm and has three layers: input, hidden, and output (for more details see Tayfur (2012)).
Gene expression programming (GEP)
GEP is an extension of GP that was developed by Ferreira (2001). GEP evolves computer programs, which are complex tree structures that can learn and adapt by changing their size, shape, and composition (Bhattacharjya 2012). The computer programs of GEP are encoded in simple linear chromosomes of fixed length. As a heuristic search and optimization technique, GEP allows the evolution of more complex programs and phenomena. Such a model is capable of adapting itself to predict any variable of interest given sufficient inputs. One strength of the GEP approach is its multigenic nature, which allows the evolution of more complex programs composed of several subprograms. Modeling a phenomenon using the GEP algorithm requires the selection of five elements: the function set, terminal set, fitness function, control parameters, and stopping condition (for more details see Ferreira (2001)). To obtain the best results, the architecture of the chromosomes, including the number of chromosomes, head size, number of genes, and the set of genetic operators (recombination, mutation, transposition, and crossover), should be selected appropriately. According to Oltean & Grosan (2003), GEP also has the following weaknesses:
The number of genes in the GEP multigenic chromosome raises a problem. The success rate of GEP can increase with the number of genes in the chromosome; however, after a certain value, the success rate decreases if the number of genes in the chromosome is increased.
A large part of the chromosome is unused if the target expression is short and the head length is large.
Kernel-based methods
Kernel-based (KB) approaches are relatively new methods used for classification and regression (Roushangar et al. 2019). Two important kernel-based approaches are Gaussian process regression (GPR) and support vector machine (SVM), which rely on different kernel types: linear, polynomial, radial basis function (RBF), and sigmoid kernels in SVM, and polynomial, normalized polynomial, RBF, and Pearson kernels in GPR. Kernel-based approaches are based on statistical learning theory and can be used for modeling complex and nonlinear phenomena. The kernel type affects the training and classification precision; therefore, the most important step in kernel-based approaches is the appropriate selection of the kernel type. These methods are memory intensive and trickier to tune because of the importance of picking the right kernel (Babovic 2009). The aim of the KB approaches is to determine a function that deviates from the actual target values by at most a specified tolerance for all given training data (Smola 1996; Roushangar et al. 2020).
Post-processing approach
Performance criteria
A brief description of the proposed methodology in the study
Selecting appropriate input variables is the most important step in modeling with artificial intelligence methods. In this research, following Mokhtarzad et al. (2017), Xu et al. (2018), Zhang et al. (2019), Li et al. (2020), and Khan et al. (2020), previous values of the SPI time series, monthly precipitation (P), average temperature (T), and relative humidity (R) were used as inputs to forecast the SPI value of the next month. The WT and EEMD were applied to decompose the SPI time series. DWT is a very useful technique for many aspects of time series data processing (Ercelebi 2004). It is suitable for analyzing time series data because it is an effective method for time series data reduction. DWT detects sudden signal changes well because it transforms the original time series into two types of wavelet coefficients: approximation and detail. Approximation (low-frequency) coefficients capture the rough features that estimate the original data, while detail (high-frequency) coefficients capture the fine features that describe frequent movements of the data. DWT also supports multiresolution analysis: in addition to projecting a time series into approximation and detail coefficients, it decomposes these coefficients into various scales (Shahabi et al. 2000; Chouakri et al. 2005; Chaovalit et al. 2011).
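The split into approximation and detail coefficients at several scales can be illustrated with a short multilevel decomposition. The sketch below uses the Haar wavelet, the simplest orthogonal wavelet, purely for brevity; it is a swapped-in example and not the mother wavelet adopted in the study. The function names and sample values are hypothetical. Note also that the coefficient lists halve in length at each level, whereas sub-series used as model inputs are normally reconstructed to the length of the original signal.

```python
import math


def haar_step(samples):
    """One Haar DWT level: scaled pairwise sums and differences."""
    scale = 1.0 / math.sqrt(2.0)
    approx = [(samples[i] + samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    detail = [(samples[i] - samples[i + 1]) * scale
              for i in range(0, len(samples), 2)]
    return approx, detail


def haar_wavedec(samples, level):
    """Return (approximation, details); details[0] is the finest scale."""
    if len(samples) % (2 ** level) != 0:
        raise ValueError("series length must be a multiple of 2**level")
    approx = list(samples)
    details = []
    for _ in range(level):
        # Each level splits the current approximation again.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical values
approx, details = haar_wavedec(series, level=3)
```

A three-level decomposition yields one approximation and three detail series, the same count of sub-series that a level-3 decomposition produces with any other orthogonal wavelet.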
RESULTS AND DISCUSSION
SPI values with time scales of 3, 9, and 24 months were calculated for the four stations in northwest Iran for the period 1978–2017. Figure 3 shows the SPI series with a 3-month time scale for all stations. The yearly SPI was also calculated, solely to show drought duration. The results showed that different levels of drought occurred in the four regions in the periods 1983–1985, 1988–1991, 1995–2001, 2005–2010, 2011–2013, and 2017, but most of them were near normal or slight droughts (−1 < SPI ≤ 0). Severe droughts occurred in several years, such as 1990 in Tabriz and 2005 in Urmia. The results indicated that, in the selected area and for the considered period, slight, moderate, and extreme droughts increased from east to west. For example, based on Figure 3(b), the frequency of droughts at the Makoo station was the highest among all stations (approximately 7, 18, and 23% higher than at the Urmia, Tabriz, and Ardabil stations, respectively).
Meta-model approaches results
In this section, the efficiency of the meta-model methods in forecasting the SPI drought index was assessed using the original data (i.e., without any pre- or post-processing). Each artificial intelligence method has its own parameters, and the optimal values of these parameters should be determined to achieve the desired results. For example, designing the SVM and GPR approaches requires the selection of an appropriate kernel function. Various kernel functions can be used, depending on the nature of the studied phenomenon. However, most studies indicate that RBF kernels result in better predictions for different hydrological problems (Gill et al. 2006). Therefore, the RBF kernel function was used for developing the models in this research.
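For reference, the RBF kernel measures the similarity of two input vectors through their squared Euclidean distance. The sketch below uses the common form $k(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$; the width parameter `gamma` and its value are hypothetical, since this section does not report the kernel settings used in the study.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel between two equally long vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])             # identical inputs
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)  # squared distance of 2
```

The kernel equals 1 for identical inputs and decays towards 0 as the inputs move apart, with `gamma` controlling how quickly the similarity falls off.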
Because the network topology directly affects the computational complexity and generalization capability of an ANN, an appropriate structure should be selected. In this study, several networks were tested to determine the number of hidden layer nodes, with the number of neurons in the hidden layer set to 2, 3, 5, 7, and 9. The tangent sigmoid and pure linear functions were found to be appropriate activation functions for the hidden and output nodes, respectively. The models were trained using the back propagation approach, and training was stopped when an acceptable level of error was obtained. In the GEP model, the basic arithmetic operators (+, −, ×, /) and several mathematical functions ($x^2$, $x^3$, $\sqrt{x}$) were used as the function set. Different values for the number of chromosomes (25, 30, and 35), head size (7 and 8), and number of genes (3 and 4) were considered and evaluated. Several models were run with different values of these parameters and stopped when no significant changes in the correlation coefficients were observed. The model with 25 chromosomes, a head size of 7, and 3 genes yielded the best results. GEP chromosomes are usually composed of one or more genes. Each gene codes for a sub-expression tree (sub-ET), and the sub-ETs interact with one another to form a more complex multi-subunit ET. One of the simplest interactions is the linking of sub-ETs by a particular function. In this study, the addition operator was used for linking the sub-ETs.
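The network structure described above, with a tangent sigmoid hidden layer and a pure linear output node, corresponds to a simple forward pass. The sketch below is illustrative only: the function name `ffnn_forward`, the two-neuron hidden layer, and all weight values are hypothetical, and training by back propagation is not shown.

```python
import math


def ffnn_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer network with a single output."""
    # Hidden layer: weighted sum of the inputs followed by tanh.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: linear combination of the hidden activations.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical weights for two inputs and two hidden neurons.
w_hidden = [[1.0, 0.0], [0.0, 1.0]]
b_hidden = [0.0, 0.0]
w_out = [1.0, 1.0]
b_out = 0.5
prediction = ffnn_forward([1.0, -1.0], w_hidden, b_hidden, w_out, b_out)
```

Testing networks with 2, 3, 5, 7, and 9 hidden neurons amounts to changing the number of rows in `w_hidden` (and the matching lengths of `b_hidden` and `w_out`) and comparing the validation error of each trained network.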
The results obtained from the single models are listed in Table 2. As can be seen, the models showed broadly similar results at the selected stations. Among them, the GPR model yielded the best predictions, the FFNN model was the second most accurate, and the GEP model was the least accurate. Considering the statistical indicators, the forecasting results deteriorated as the SPI time scale decreased, and the FFNN, SVM, GPR, and GEP models did not achieve acceptable efficiency in modeling the SPI 3 in the study area. The applied models were more successful in modeling the long-term SPI (i.e., SPI 24). In general, the prediction accuracy increased with increasing SPI time scale. This is because the SPI 24 reflects precipitation normalized over the last 24 months, so it is not as sensitive to variations in the data from one month to the next as SPIs with shorter time scales. Figure 4 shows the scatter plots of observed versus predicted SPI 3 values for the GPR models.
Model | Validation DC | Validation RMSE | Validation AIC | Testing DC | Testing RMSE | Testing AIC | Validation DC | Validation RMSE | Validation AIC | Testing DC | Testing RMSE | Testing AIC |
---|---|---|---|---|---|---|---|---|---|---|---|---|
SPI 3 | Tabriz | | | | | | Urmia | | | | | |
FFNN | 0.460 | 0.854 | −18.340 | 0.430 | 0.949 | −9.026 | 0.542 | 0.684 | −27.247 | 0.511 | 0.722 | −23.651 |
SVM | 0.448 | 0.871 | −17.923 | 0.431 | 0.941 | −10.844 | 0.520 | 0.699 | −24.891 | 0.498 | 0.762 | −20.250 |
GPR | 0.517 | 0.801 | −23.548 | 0.498 | 0.820 | −14.307 | 0.609 | 0.631 | −38.439 | 0.554 | 0.672 | −29.126 |
GEP | 0.431 | 0.942 | −11.580 | 0.418 | 0.976 | −5.0680 | 0.477 | 0.779 | −19.178 | 0.468 | 0.896 | −13.154 |
SPI 3 | Ardabil | | | | | | Makoo | | | | | |
FFNN | 0.449 | 0.899 | −16.210 | 0.450 | 0.911 | −7.223 | 0.526 | 0.928 | −28.124 | 0.516 | 0.944 | −16.071 |
SVM | 0.438 | 0.931 | −11.840 | 0.413 | 0.951 | −5.276 | 0.514 | 0.946 | −22.140 | 0.528 | 0.925 | −12.651 |
GPR | 0.508 | 0.864 | −18.799 | 0.481 | 0.872 | −8.377 | 0.608 | 0.873 | −33.617 | 0.598 | 0.901 | −19.325 |
GEP | 0.428 | 0.948 | −9.0451 | 0.409 | 0.994 | −3.031 | 0.531 | 0.922 | −29.171 | 0.501 | 1.032 | −10.887 |
SPI 9 | Tabriz | | | | | | Urmia | | | | | |
FFNN | 0.702 | 0.722 | −21.274 | 0.702 | 0.797 | −13.185 | 0.731 | 0.570 | −31.607 | 0.715 | 0.609 | −31.929 |
SVM | 0.684 | 0.731 | −20.791 | 0.687 | 0.789 | −18.257 | 0.713 | 0.613 | −28.873 | 0.672 | 0.651 | −27.338 |
GPR | 0.789 | 0.677 | −29.316 | 0.756 | 0.689 | −24.314 | 0.852 | 0.501 | −44.589 | 0.765 | 0.564 | −39.320 |
GEP | 0.683 | 0.739 | −13.433 | 0.655 | 0.824 | −10.842 | 0.687 | 0.609 | −22.246 | 0.631 | 0.807 | −17.758 |
SPI 9 | Ardabil | | | | | | Makoo | | | | | |
FFNN | 0.778 | 0.756 | −25.749 | 0.721 | 0.761 | −10.185 | 0.718 | 0.782 | −35.999 | 0.686 | 0.794 | −22.660 |
SVM | 0.761 | 0.785 | −19.155 | 0.701 | 0.801 | −7.4390 | 0.711 | 0.792 | −28.339 | 0.659 | 0.776 | −17.839 |
GPR | 0.852 | 0.684 | −29.063 | 0.758 | 0.728 | −12.812 | 0.822 | 0.728 | −43.030 | 0.733 | 0.749 | −27.249 |
GEP | 0.720 | 0.796 | −16.578 | 0.618 | 0.841 | −5.2730 | 0.699 | 0.774 | −27.339 | 0.649 | 0.872 | −15.351 |
SPI 24 | Tabriz | | | | | | Urmia | | | | | |
FFNN | 0.800 | 0.513 | −25.742 | 0.772 | 0.601 | −20.458 | 0.848 | 0.402 | −36.664 | 0.822 | 0.433 | −34.483 |
SVM | 0.779 | 0.517 | −24.157 | 0.756 | 0.586 | −19.763 | 0.833 | 0.454 | −33.493 | 0.814 | 0.483 | −29.525 |
GPR | 0.899 | 0.476 | −33.052 | 0.832 | 0.491 | −30.154 | 0.917 | 0.337 | −48.211 | 0.901 | 0.351 | −42.466 |
GEP | 0.778 | 0.511 | −16.254 | 0.720 | 0.627 | −16.422 | 0.808 | 0.438 | −30.214 | 0.799 | 0.612 | −21.309 |
SPI 24 | Ardabil | | | | | | Makoo | | | | | |
FFNN | 0.840 | 0.526 | −32.803 | 0.808 | 0.537 | −24.311 | 0.874 | 0.529 | −42.478 | 0.831 | 0.513 | −29.684 |
SVM | 0.822 | 0.577 | −28.308 | 0.785 | 0.585 | −22.015 | 0.823 | 0.533 | −33.440 | 0.809 | 0.527 | −23.368 |
GPR | 0.884 | 0.480 | −38.244 | 0.849 | 0.516 | −30.315 | 0.911 | 0.485 | −50.776 | 0.883 | 0.512 | −35.696 |
GEP | 0.805 | 0.585 | −23.281 | 0.729 | 0.628 | −11.051 | 0.839 | 0.525 | −32.260 | 0.817 | 0.616 | −20.109 |
Investigating the impact of data pre-post-processing on models' efficiency
In this section, the effect of pre- and post-processing of the time series on the models' accuracy was investigated. First, the time series were decomposed using the WT method. To decompose a time series by WT, a mother wavelet that closely resembles the signal should be selected. In this study, the Daubechies (db2 and db4) and symlet (sym2 and sym4) mother wavelets were tested, and the db4 mother wavelet led to better outcomes. Therefore, the db4 mother wavelet and decomposition level 3 were used for time series decomposition, which yielded four sub-series (one approximation and three details) for each time series. In the second step, the first two detail sub-series (i.e., details 1 and 2) were further decomposed via EEMD. The principle of EEMD is the decomposition of a signal into several IMFs and one residual signal, whose sum reproduces the original signal. The IMFs are formed by successively subtracting the basic function from the signal, and this process continues until the residual signal remains almost constant. In this study, the detail 1 and detail 2 sub-series were decomposed into 9 and 7 IMFs, respectively, each with one residual signal. Further decomposition using EEMD yields sub-series that are more stationary and less noisy.
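The two-stage decomposition described above can be summarized in a short worked form. The notation here is an assumption introduced only for illustration: x(t) is an SPI series, A_3 is the level-3 approximation, D_1 to D_3 are the detail sub-series reconstructed to the length of the original series, and r_1, r_2 are the EEMD residuals. The IMF counts (9 and 7) are those reported in this study.

```latex
\begin{align}
  % Level-3 wavelet decomposition with db4: one approximation, three details
  x(t)   &= A_3(t) + D_1(t) + D_2(t) + D_3(t) \\
  % EEMD of the first detail sub-series: 9 IMFs and one residual
  D_1(t) &= \sum_{i=1}^{9} \mathrm{IMF}_{1,i}(t) + r_1(t) \\
  % EEMD of the second detail sub-series: 7 IMFs and one residual
  D_2(t) &= \sum_{i=1}^{7} \mathrm{IMF}_{2,i}(t) + r_2(t)
\end{align}
```

Under these assumptions, the candidate model inputs are A_3, D_3, and the IMFs and residuals of D_1 and D_2.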
Since the number of input series increased after decomposition, the energy of each sub-series was calculated and the sub-series with higher energy were selected as inputs. The selected sub-series were then used as inputs to the meta-model methods to predict the SPI. Figure 5 shows the sub-series of the SPI 3 series decomposed by db4-EEMD for the Tabriz station. According to this figure, IMFs 3 and 6 from detail 1 and IMFs 3 and 4 from detail 2 were selected as appropriate IMFs because their energy values were higher than those of the other IMFs. The results of the integrated pre-processing models are listed in Table 3 and shown in Figure 6. According to the results presented in Tables 2 and 3, it can be concluded that data pre-processing significantly improved the accuracy of the results and that the integrated models were more accurate than the single meta-model approaches. In other words, the use of WT and the further decomposition of the detail series improved the outcomes. For example, at the Urmia station, the RMSE values of the FFNN, SVM, GPR, and GEP models for the SPI 3 test series were 0.722, 0.762, 0.672, and 0.896, respectively; the RMSE values of the WT-EEMD-FFNN, WT-EEMD-SVM, WT-EEMD-GPR, and WT-EEMD-GEP models decreased to 0.547, 0.556, 0.448, and 0.601, respectively. With the integrated models, the SPI series were modeled with higher accuracy, and the applied methods were successful in short- to long-term drought forecasting in the selected area with different climates. Among the integrated models, the WT-EEMD-GPR method was the most efficient. In general, data pre-processing increased the models' accuracy by 30% to 40% in the validation sets and by 30% to 45% in the testing sets. Figure 6 also shows the scatter plots of observed versus predicted SPI series of the WT-EEMD-GPR model for the Tabriz station. With the integrated model, the maximum and minimum points of the SPI data were well modeled.
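The energy-based input selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the energy of a sub-series is the sum of its squared samples (the text does not state the exact definition), and the function names and toy values are hypothetical.

```python
def energy(series):
    """Energy of a sub-series, assumed here to be the sum of squared samples."""
    return sum(value * value for value in series)


def select_by_energy(subseries, k):
    """Return the names of the k sub-series with the highest energy, highest first."""
    ranked = sorted(subseries, key=lambda name: energy(subseries[name]), reverse=True)
    return ranked[:k]


# Toy sub-series (hypothetical values, not data from the study).
imfs = {
    "imf1": [0.1, -0.1, 0.1, -0.1],
    "imf2": [0.5, -0.4, 0.6, -0.5],
    "imf3": [1.0, -0.9, 1.1, -1.0],
    "residual": [0.2, 0.2, 0.2, 0.2],
}

selected = select_by_energy(imfs, 2)
print(selected)
```

In this sketch the number of retained sub-series `k` is chosen by the user; the study reports the selected IMFs but not a fixed threshold.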
Model | Validation | | | Testing | | | Validation | | | Testing | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| DC | RMSE | AIC | DC | RMSE | AIC | DC | RMSE | AIC | DC | RMSE | AIC |
SPI 3 | Tabriz | Urmia | ||||||||||
WT-EEMD-FFNN | 0.838 | 0.584 | −40.854 | 0.752 | 0.624 | −28.884 | 0.859 | 0.441 | −80.192 | 0.832 | 0.547 | −63.683 |
WT-EEMD-SVM | 0.837 | 0.588 | −38.561 | 0.751 | 0.620 | −34.700 | 0.858 | 0.442 | −79.651 | 0.831 | 0.556 | −60.800 |
WT-EEMD-GPR | 0.895 | 0.542 | −50.999 | 0.871 | 0.573 | −45.782 | 0.879 | 0.406 | −98.005 | 0.855 | 0.448 | −77.204 |
WT-EEMD-GEP | 0.848 | 0.597 | −42.898 | 0.725 | 0.671 | −19.217 | 0.855 | 0.519 | −76.368 | 0.784 | 0.601 | −44.093 |
Ardabil | Makoo | |||||||||||
WT-EEMD-FFNN | 0.817 | 0.640 | −51.872 | 0.778 | 0.646 | −23.115 | 0.862 | 0.609 | −84.917 | 0.816 | 0.624 | −50.427 |
WT-EEMD-SVM | 0.816 | 0.640 | −37.888 | 0.777 | 0.679 | −16.883 | 0.851 | 0.646 | −79.140 | 0.808 | 0.651 | −45.485 |
WT-EEMD-GPR | 0.856 | 0.561 | −56.398 | 0.840 | 0.590 | −26.807 | 0.865 | 0.582 | −89.842 | 0.824 | 0.607 | −65.841 |
WT-EEMD-GEP | 0.792 | 0.673 | −28.944 | 0.749 | 0.691 | −9.698 | 0.814 | 0.673 | −51.286 | 0.756 | 0.714 | −34.838 |
SPI 9 | Tabriz | Urmia | ||||||||||
WT-EEMD-FFNN | 0.880 | 0.422 | −44.761 | 0.846 | 0.503 | −31.195 | 0.923 | 0.380 | −101.38 | 0.864 | 0.448 | −88.403 |
WT-EEMD-SVM | 0.872 | 0.431 | −45.561 | 0.854 | 0.499 | −39.058 | 0.878 | 0.389 | −99.606 | 0.860 | 0.457 | −81.984 |
WT-EEMD-GPR | 0.939 | 0.378 | −64.743 | 0.927 | 0.408 | −50.445 | 0.950 | 0.337 | −124.58 | 0.902 | 0.385 | −100.66 |
WT-EEMD-GEP | 0.856 | 0.440 | −38.313 | 0.838 | 0.536 | −25.321 | 0.858 | 0.423 | −75.170 | 0.829 | 0.510 | −51.460 |
Ardabil | Makoo | |||||||||||
WT-EEMD-FFNN | 0.892 | 0.527 | −54.457 | 0.873 | 0.579 | −26.074 | 0.869 | 0.564 | −91.277 | 0.868 | 0.569 | −88.403 |
WT-EEMD-SVM | 0.873 | 0.549 | −39.585 | 0.845 | 0.612 | −19.044 | 0.862 | 0.601 | −85.281 | 0.856 | 0.615 | −81.667 |
WT-EEMD-GPR | 0.935 | 0.482 | −59.676 | 0.909 | 0.510 | −30.238 | 0.917 | 0.491 | −101.71 | 0.882 | 0.508 | −95.757 |
WT-EEMD-GEP | 0.855 | 0.606 | −33.712 | 0.795 | 0.645 | −14.939 | 0.834 | 0.655 | −71.802 | 0.799 | 0.687 | −40.298 |
SPI 24 | Tabriz | Urmia | ||||||||||
WT-EEMD-FFNN | 0.913 | 0.335 | −89.254 | 0.891 | 0.396 | −59.850 | 0.930 | 0.282 | −105.24 | 0.909 | 0.291 | −110.25 |
WT-EEMD-SVM | 0.890 | 0.357 | −63.241 | 0.882 | 0.388 | −58.489 | 0.944 | 0.303 | −120.87 | 0.903 | 0.331 | −100.47 |
WT-EEMD-GPR | 0.959 | 0.298 | −96.493 | 0.943 | 0.322 | −76.846 | 0.959 | 0.227 | −138.09 | 0.934 | 0.265 | −115.89 |
WT-EEMD-GEP | 0.888 | 0.364 | −52.550 | 0.862 | 0.403 | −37.790 | 0.953 | 0.289 | −125.24 | 0.901 | 0.344 | −89.190 |
Ardabil | Makoo | |||||||||||
WT-EEMD-FFNN | 0.905 | 0.373 | −89.254 | 0.888 | 0.415 | −64.643 | 0.930 | 0.379 | −132.34 | 0.913 | 0.430 | −111.24 |
WT-EEMD-SVM | 0.897 | 0.400 | −64.048 | 0.889 | 0.442 | −47.216 | 0.925 | 0.386 | −128.47 | 0.904 | 0.453 | −108.87 |
WT-EEMD-GPR | 0.956 | 0.347 | −84.208 | 0.922 | 0.388 | −74.968 | 0.941 | 0.356 | −145.89 | 0.921 | 0.386 | −117.10 |
WT-EEMD-GEP | 0.886 | 0.418 | −35.363 | 0.880 | 0.442 | −42.503 | 0.895 | 0.409 | −101.19 | 0.885 | 0.493 | −93.248 |
Integrated pre-post-processing meta-model approaches results
The pre- and post-processing methods were then combined to evaluate their simultaneous impact on the accuracy of the outputs. The obtained results are listed in Table 4 and shown in Figure 7. According to the results, the combined method yielded more accurate results than the single, pre-processing, and post-processing approaches. Both the simple linear and the nonlinear ensemble methods increased the efficiency of the applied models, by about 5% to 10% for the validation sets and 5% to 14% for the testing sets, compared with the pre-processing models. Based on the results listed in Tables 3 and 4, the improvement gained from post-processing was smaller than that gained from pre-processing. Therefore, it can be stated that the pre-processing methods were more effective than the post-processing methods in enhancing the accuracy of the predictions. It should be noted that determining the extreme values of the drought index is the most important issue in drought management, since these values represent the further potential of the drought. Therefore, when different models are used, their performance in estimating the minimum and maximum values of the time series should be taken into account. The results showed that the extreme values in the SPI series were estimated more accurately by the pre-post-processing methods. The results also showed that, in forecasting the SPI series, the nonlinear ensemble method performed better than the linear method. This suggests that the efficiency of the linear method depends on the performance of the individual models: when the individual artificial intelligence models perform weakly, the results of the linear ensemble are not desirable, whereas a nonlinear combination leads to more appropriate results.
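The dependence of simple linear averaging on the individual models can be illustrated with a small sketch. This is a hedged example under stated assumptions: SLAM is taken to be the unweighted mean of the individual model forecasts at each time step, the function names are hypothetical, and the toy values are not data from the study. The nonlinear neural ensemble (NNEM) is not reproduced here.

```python
import math


def simple_linear_average(predictions):
    """Unweighted mean of several models' forecasts at each time step."""
    n_models = len(predictions)
    return [sum(step) / n_models for step in zip(*predictions)]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted series."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy series (hypothetical values): two models with opposite biases.
observed = [-1.0, 0.0, 1.0, 2.0]
model_a = [-0.8, 0.2, 1.2, 2.2]    # overestimates by 0.2
model_b = [-1.2, -0.2, 0.8, 1.8]   # underestimates by 0.2

ensemble = simple_linear_average([model_a, model_b])
print(rmse(observed, model_a), rmse(observed, model_b), rmse(observed, ensemble))
```

Here the errors of the two models cancel, so the averaged forecast is closer to the observations than either model. If all individual models were biased in the same direction, the average would inherit that bias, which is consistent with the observation above that the linear ensemble is limited by the performance of its members.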
Model | Validation | | | Testing | | | Validation | | | Testing | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| DC | RMSE | AIC | DC | RMSE | AIC | DC | RMSE | AIC | DC | RMSE | AIC |
SPI 3 | Tabriz | Urmia | ||||||||||
WT-EEMD-NNEM | 0.931 | 0.473 | −59.159 | 0.915 | 0.511 | −53.107 | 0.949 | 0.357 | −113.68 | 0.932 | 0.371 | −109.32 |
WT-EEMD-SLAM | 0.904 | 0.482 | −56.099 | 0.897 | 0.531 | −50.360 | 0.923 | 0.390 | −107.80 | 0.899 | 0.412 | −100.05 |
Ardabil | Makoo | |||||||||||
WT-EEMD-NNEM | 0.902 | 0.466 | −65.422 | 0.891 | 0.511 | −57.234 | 0.934 | 0.495 | −104.21 | 0.914 | 0.516 | −97.214 |
WT-EEMD-SLAM | 0.890 | 0.518 | −62.038 | 0.875 | 0.524 | −49.254 | 0.908 | 0.535 | −98.82 | 0.899 | 0.558 | −87.418 |
SPI 9 | Tabriz | Urmia | ||||||||||
WT-EEMD-NNEM | 0.967 | 0.325 | −71.217 | 0.946 | 0.359 | −70.124 | 0.988 | 0.297 | −145.75 | 0.956 | 0.339 | −128.14 |
WT-EEMD-SLAM | 0.948 | 0.344 | −69.922 | 0.936 | 0.375 | −65.231 | 0.974 | 0.307 | −139.53 | 0.936 | 0.350 | −119.07 |
Ardabil | Makoo | |||||||||||
WT-EEMD-NNEM | 0.963 | 0.415 | −65.644 | 0.927 | 0.452 | −55.218 | 0.954 | 0.422 | −111.88 | 0.935 | 0.442 | −99.218 |
WT-EEMD-SLAM | 0.944 | 0.439 | −64.450 | 0.918 | 0.473 | −50.911 | 0.940 | 0.437 | −109.84 | 0.916 | 0.467 | −95.057 |
SPI 24 | Tabriz | Urmia | ||||||||||
WT-EEMD-NNEM | 0.978 | 0.256 | −106.14 | 0.952 | 0.283 | −97.216 | 0.988 | 0.195 | −151.89 | 0.953 | 0.231 | −133.10 |
WT-EEMD-SLAM | 0.967 | 0.271 | −104.21 | 0.947 | 0.296 | −90.004 | 0.973 | 0.202 | −149.13 | 0.941 | 0.244 | −130.11 |
Ardabil | Makoo | |||||||||||
WT-EEMD-NNEM | 0.985 | 0.298 | −92.629 | 0.973 | 0.334 | −88.217 | 0.971 | 0.306 | −160.47 | 0.951 | 0.324 | −155.24 |
WT-EEMD-SLAM | 0.975 | 0.316 | −90.945 | 0.968 | 0.349 | −81.114 | 0.951 | 0.320 | −157.56 | 0.933 | 0.347 | −150.10 |
Figure 8 compares the variation range of the RMSE of the verification data for the applied methods. The distribution range of the RMSE is narrower for the WT-EEMD-NNEM model than for the other models, which indicates that its forecasting performance is superior.
CONCLUSIONS
Drought is a natural disaster that has negative impacts on most parts of the environment; therefore, accurate drought forecasting is important for reducing these impacts. This study assessed the capability of time series pre-post-processing methods for forecasting the SPI series with time scales of 3, 9, and 24 months in regions with different climates. In the first step, the time series without any processing were fed to single meta-model methods. Then, the time series were decomposed into several sub-series using WT, and further decomposition was performed via EEMD. Finally, the results obtained from the pre-processing methods were combined via linear and nonlinear post-processing methods in order to enhance the forecasting efficiency. According to the results, the single artificial intelligence methods led to poor predictions in short-term drought (SPI 3) modeling but were successful in modeling the long-term SPI (i.e., SPI 24). Using both pre-processing and post-processing methods increased the modeling accuracy, and the integrated models were more successful in short-, mid-, and long-term drought analysis in sites with different climates. The applied pre-processing methods enhanced the performance of the single artificial intelligence methods by approximately 30% to 45%. Moreover, the results revealed that the pre-processing methods performed more successfully than the SLAM and NNEM post-processing methods. The simple linear averaging and nonlinear neural ensemble techniques improved the efficiency of the applied models by about 5% to 14% compared with the pre-processing models, and the NNEM performed better than the SLAM. For single models, the distribution range of the RMSE was 0.337–1.03, which decreased to 0.195–0.714 for the integrated models. Also, the maximum and minimum values of the SPI series were well forecasted using the integrated models.
Therefore, the integration of hybrid pre-post-processing techniques could be useful for short- to long-term drought modeling.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories (https://climatology.ir).