## Abstract

Drought prediction plays an important guiding role in drought risk management. The standardized precipitation index (SPI) is a popular meteorological drought indicator to measure the degree of drought. The SPI time series is non-stationary, whereas the conventional artificial neural network (ANN) model has limitations to predict non-stationary time series. To overcome this limitation, it is essential to investigate input data preprocessing to improve the ANN model. In this paper, a hybrid model coupled with singular spectrum analysis (SSA) and backpropagation ANN is proposed (SSA-BP-ANN). The advantage of this model is that the SSA of finite-length SPI sequences does not require the adoption of boundary extensions to suppress boundary effects, while the most predictable components of the SPI can be efficiently extracted and incorporated into the model. The proposed SSA-BP-ANN model is tested in case studies at three meteorological stations in Northern Shannxi Province, China. The results show that the SSA-BP-ANN model can produce more accurate predictions than the BP-ANN model. In addition, the performance improvement of SSA on the BP-ANN model is slightly better than wavelet decomposition and empirical mode decomposition. This new hybrid prediction model has great potential for promoting drought early warning in arid regions.

## HIGHLIGHTS

Using SSA as data preprocessing tool for non-stationary time-series SPI could significantly improve the prediction performance of the BP-ANN model.

SSA can extract more effective information from noisy time series and bring it into the prediction model.

The SSA-BP-ANN model seems a promising method for drought early warning in arid regions.

## INTRODUCTION

From a temporal perspective, humans have been struggling with drought for the last thousand years (Dai 2011). From a spatial perspective, drought is one of the most common natural disasters globally, characterized by high frequency, broad impact, long duration and high economic losses (Zhang *et al.* 2015b). Therefore, the research on drought has attracted much attention worldwide (Haile *et al.* 2019; Balti *et al.* 2020; Kaur & Sood 2020).

Drought generally can be divided into four categories: meteorological, agricultural, hydrological and socio-economic droughts (Kimwatu *et al.* 2021; Kolachian & Saghafian 2021; Wu *et al.* 2021; Zhang *et al.* 2021). A chronic lack of precipitation causes an imbalance in the hydrological cycle, the leading cause of the drought. Thus, many scholars have focused their attention on monitoring and preventing meteorological droughts. To quantify the severity of meteorological drought, researchers have developed many meteorological drought indicators (Waseem *et al.* 2015; Zhang *et al.* 2015a; Rehana & Naidu 2021), such as precipitation anomaly percentage (PAP) (Tahroudi *et al.* 2020), standardized precipitation index (SPI) (Mckee *et al.* 1993), standardized precipitation evapotranspiration index (SPEI) (Beguería *et al.* 2014) considering evaporation and palmer drought severity index (PDSI) (Palmer 1965). To reduce the drought influence, many different models have been carried out to predict those drought indicators (Mouatadid *et al.* 2018; Bouaziz *et al.* 2021).

SPI is one of the most commonly used meteorological drought indicators (Hayes *et al.* 2011; Bazrafshan *et al.* 2014; Qaisrani *et al.* 2021). Its calculation applies monthly rainfall data, including 1, 3, 6, 9, 12 and 24-month time scales, which facilitates the determination of drought levels and drought cycles for specific targets (Zhang *et al.* 2017). It is especially recommended when sufficient (at least 30 years) and continuous monthly rainfall data are available (Keyantash 2018). For example, Zhang *et al.* (2019a) used ARMA (autoregressive and moving average) and ARMA-GARCH (generalized autoregressive conditional heteroskedasticity) models to simulate and predict the SPI-9 drought index in Shandong Province, China, from 1965 to 2015. Moreira *et al.* (2018) validated the log-linear model and drought indices (SPI and SPEI) in modeling and predicting the drought class transitions over a sector of Eastern Europe. Durdu (2010) applied the linear model ARIMA (autoregressive integrated moving average) to predict the SPI in the Buyuk Menderes river basin, showing that the model can provide accurate results for short-term drought prediction.

The SPI time series is a typically non-linear, non-smooth series, and the existing linear time-series forecasting models are inadequate to detect the properties contained in such complex hydrological time series (Wei *et al.* 2012). Therefore, machine learning algorithms have been introduced into hydrological systems to efficiently construct predictive models, and the most widely used algorithm is artificial neural network (ANN) algorithms (Zhang *et al.* 2019b; Liu *et al.* 2020). For instance, Kousari *et al.* (2017) explored the potential of ANN in forecasting drought by using the SPI in the Fars Province of Iran. Mokhtarzad *et al.* (2017) compared the ability of ANN, adaptive neuro-fuzzy interface system (ANFIS) and support vector machine (SVM) models to predict the SPI, and the results showed that all three models have excellent predictive performance. Hassanzadeh *et al.* (2020) proved that the ANN model combined with hydrological and meteorological variables can significantly improve the ability of Iranian Tabriz to predict the SPI at different time scales. Achour *et al.*(2020) proposed a feedforward three-layer ANN model to predict the drought 2 months ahead using the SPI series at different time scales (3, 6, 9 and 12 months) and successfully helped the north-west region of Algeria cope with drought in advance.

Although ANN models have been widely applied, their capability to predict non-stationary time series is limited. The wavelet analysis (WA) shows a good effect on the preprocessing of non-stationary time series (Anshuka *et al.* 2019). Therefore, wavelet transform-based preprocessing has been largely applied with ANN models in drought prediction. For example, Belayneh *et al.* (2014) used wavelet transform to do data preprocessing on the model input, to build WA-ANN and WA-SVR (support vector regression) models for long-term SPI prediction, and the results show that the WA-ANN model has the best performance. Djerbouai & Souag-Gamane (2016) developed an ANN model based on wavelet decomposition (WD) preprocessing, and compared with the traditional random model, this new hybrid model has a higher predictive ability. Belayneh *et al.* (2016) found that wavelet transform data preprocessing tools are significantly helpful for improving the performance of machine learning prediction models. Soh *et al.* (2018) proposed two hybrid machine learning prediction models using WD as a data preprocessing (Wavelet-ARIMA-ANN and Wavelet-ANFIS) and applied them to SPEI predictions at different time scales in the Langat River Basin to obtain accurate prediction results. Khan *et al.* (2020) verified that the prediction performance of the WA-ARIMA-ANN hybrid model for the SPI is better than the single model ANN and WA-ANN.

WD preprocessing of model input data can effectively improve the predictive performance of ANN models. The decomposition of a non-stationary time series by the wavelet transform is based on the convolution of the sequence and the filter. When convolving a finite-length sequence, the information outside the sequence boundary is unknown, resulting in boundary effects on the convolution result, and additional boundary extension as required. For the sake of this, the main objective of this research is going to test the potential of singular spectrum analysis (SSA) as a data preprocessing tool for improving model prediction performance. The advantage of SSA is that it does not require any prior information and assumptions. Therefore, SSA does not require boundary extension when decomposing a finite time sequence. By combining SSA and the ANN with a backpropagation algorithm, an improved model (namely SSA-BP-ANN) is proposed to predict the SPI in this study. The proposed model will then be tested in an arid region of northern Shaanxi, China. The meteorological drought index selects the SPI of the 12-month time scale (SPI-12).

The following Section 2 introduces the general situation of the study area and the source of rainfall data. Section 3 describes the applied method, and Section 4 shows the SSA-BP-ANN model development process. Furthermore, Section 5 evaluates the performance of the model, and some main concluding remarks are presented in Section 6.

## STUDY AREA AND DATA

The study area is the entire area under the jurisdiction of Yan'an City and Yulin City in the Loess Plateau region of northern Shaanxi Province (35°02′–39°35′N, 107°15′–110°15′E), with a total area of 79,981.9 km^{2}. The average altitude in the area is about 1,200 m, the annual average precipitation is 400–500 mm, the annual average temperature is 7–12 °C and the annual average evaporation is 1,200 mm. Figure 1 shows the general view of the study area.

The data used in this paper are obtained from the China Meteorological Data Network (available at http://data.cma.cn). The monthly rainfall of Hengshan, Suide and Yulin meteorological stations in Northern Shaanxi from 1955 to 2019 is selected as the data recognized and widely used in meteorological research. Table 1 summarizes the statistical characteristics of monthly rainfall at the three stations from 1955 to 2019.

Station . | Longitude (°) . | Latitude (°) . | Average (mm) . | Maximum (mm) . | Minimum (mm) . | Standard deviation (mm) . |
---|---|---|---|---|---|---|

Hengshan | 109.23 | 37.93 | 31.38 | 208.7 | 0 | 40.78 |

Suide | 110.22 | 37.50 | 37.07 | 365.8 | 0 | 48.75 |

Yulin | 109.78 | 38.27 | 33.98 | 329.9 | 0 | 47.53 |

Station . | Longitude (°) . | Latitude (°) . | Average (mm) . | Maximum (mm) . | Minimum (mm) . | Standard deviation (mm) . |
---|---|---|---|---|---|---|

Hengshan | 109.23 | 37.93 | 31.38 | 208.7 | 0 | 40.78 |

Suide | 110.22 | 37.50 | 37.07 | 365.8 | 0 | 48.75 |

Yulin | 109.78 | 38.27 | 33.98 | 329.9 | 0 | 47.53 |

## METHODOLOGY

### Standardized precipitation index

The SPI is widely used in meteorological drought assessment and prediction research because of its several advantages. (1) The SPI is recommended to the public by the World Meteorological Organization (Hayes *et al.* 2011); (2) the SPI only uses a long series of rainfall data, which makes drought assessment and prediction feasible in some areas where hydrological data such as evaporation are difficult to measure and (3) the SPI can evaluate and predict drought from multiple time scales (Pham *et al.* 2021).

*et al.*2020). Secondly, calculate the cumulative probability of the observed rainfall data. Finally, the inverse normal Gaussian function with a mean value of 0 and a variance of 1 is applied to the cumulative probability to obtain the SPI (Vidyarthi & Jain 2020). The probability density function defined by the gamma distribution is shown in the following equation:where and represent the shape and the scale parameters, they can be obtained by maximum likelihood estimation, as shown in the following equations:where

*m*and

*n*are the numbers of zero precipitation and sample size, respectively. The value of SPI can be calculated according to the following equations (Lloyd-Hughes & Saunders 2002):where

*x*is the monthly rainfall,

*r*

_{0}= 2.515517,

*r*

_{1}= 0.802853,

*r*

_{2}= 0.010328,

*s*

_{0}= 1.432788,

*s*

_{1}= 0.189269 and

*s*

_{2}= 0.001308 (Kisi

*et al.*2019). The drought threshold classification of SPI is shown in Table 2.

Classification . | SPI . |
---|---|

Extremely wet (EW) | SPI>2.0 |

Very wet (VW) | 1.50<SPI≤2 |

Moderately wet (MW) | 1.0<PIz≤1.50 |

Near normal (NN) | −1.0<SPI≤1.0 |

Moderate drought (MD) | −1.50<SPI≤−1.0 |

Severe drought (SD) | −2.0<SPI≤−1.50 |

Extreme drought (ED) | SPI≤−2.0 |

Classification . | SPI . |
---|---|

Extremely wet (EW) | SPI>2.0 |

Very wet (VW) | 1.50<SPI≤2 |

Moderately wet (MW) | 1.0<PIz≤1.50 |

Near normal (NN) | −1.0<SPI≤1.0 |

Moderate drought (MD) | −1.50<SPI≤−1.0 |

Severe drought (SD) | −2.0<SPI≤−1.50 |

Extreme drought (ED) | SPI≤−2.0 |

Referring to past research on meteorological drought in northern Shaanxi, China, the time scale of SPI adopted in this paper is 12 months (SPI-12) (Wang *et al.* 2017, 2020).

### Backpropagation artificial neural networks

The BP-ANN is a concept proposed by scientists Rumelhart *et al.* (1986) in the 1980s. It is a multi-layer feedforward neural network trained by an error backpropagation algorithm, which generally includes an input layer, hidden layer and output layer. The neurons in each layer of this neural network only accept the input of the neurons in the previous layer, and the subsequent layers have no signal feedback to the previous layer. Each layer performs a certain transformation on the input data and then uses the output result as the input of the next layer until the final output result (Zhao *et al.* 2010).

Training the BP-ANN model is mainly divided into three steps: (1) constructing the network model, including determining the input/output data form, network level and transfer function form; (2) cyclic calculation, forward propagation calculation error and backpropagation adjustment parameters and (3) return to the trained neural network model. This paper uses a three-layer BP-ANN model to predict time series, and the network structure is shown in Figure 2.

The applied rainfall data in this study is from 1955 to 2019; thus, the SPI-12 value of 780 months can be calculated to form the entire original SPI-12 time series, which can be recorded as the set *X =*{*X*_{1}, *X*_{2}, …, *X*_{799}, *X*_{780}}. In Figure 2, *X _{t}* (

*t*is an integer and ) represents the SPI-12 value of the

*t*th month of the original SPI-12 time series,

*X*

_{t}_{−1}represents the SPI-12 value 1 month before the

*t*th month and

*X*represents the SPI-12 value

_{t−n}*n*months before the

*t*th month. The value of

*n*will be determined by the partial autocorrelation function (PACF) in Section 4.

### Singular spectrum analysis

SSA is a time-series analysis method widely used in climatology, surveying, oceanography and other fields. Its advantage lies in extracting trends, periods and noise signals from noisy time series without any prior information and gathering the most predictable components into several time series (Baydaroğlu *et al.* 2017). Therefore, it is particularly suitable for analyzing time-series data with periodic oscillations. The specific process of SSA mainly includes four steps (Yu *et al.* 2017; Mi *et al.* 2019).

The elementary matrix in the above formula represents the *i*th SVD component, and the SVD component's energy contribution to the original signal decreases as the order increases.

- 3.
*Grouping*. Dividing the subscript interval of the elementary matrix into P unconnected subsets , denoted as , so the composite matrix . Calculating each composite matrix of , then Equation (16) can be obtained as follows:

Among them, the selection process of is the grouping.

*k*in the formula is the diagonal mean value of the matrix .

### SSA-BP-ANN prediction model

The SSA-BP-ANN prediction model is a new hybrid prediction model, and its specific prediction process is shown in Figure 3. The whole process can be roughly divided into four steps: (1) the division of training set and testing set for the calculated SPI-12 time series, where the training set accounts for 70%; (2) the SSA method is used to decompose the training set, the parameter (window length, *W*_{L}) of the method is determined in the decomposition process and then the testing set is decomposed according to the same *W*_{L}; (3) the decomposed training set is fed into the BP-ANN model for model training and (4) the decomposed testing set is fed into the trained BP-ANN model to predict SPI-12 time series, and the evaluation indices are used to evaluate the performance of the model.

### Model performance evaluation

*et al.*2007). In this study, the evaluation indices include the mean absolute error (MAE), root-mean-square error (RMSE), coefficient of determination (

*R*

^{2}) and Nash–Sutcliffe efficiency coefficient (NSE), as shown in the following equations:where

*N*is the number of samples, is the predicted value, is the observed value and is the mean of the observed value (Fung

*et al.*2019).

In general, the performance indicators MAE (MAE0) and RMSE (RMSE0) are used to reflect the actual situation of the model prediction error. The smaller their values, the better the accuracy of the prediction model in describing the experimental data. The *R*^{2} (0*R*^{2}1) and NSE (NSE1) are used to measure the fitting accuracy of the prediction model. The larger their values, the better the performance of the model and the higher the fitting accuracy.

These indices are not only used to evaluate the performance of the SSA-BP-ANN model, but also used to compare the differences in model performance improvement of different data processing methods.

## SSA-BP-ANN MODEL DEVELOPMENT

### SPI-12 calculation

As the prediction object, the first step is to calculate the time series of SPI-12 based on the observed rainfall data. Figure 4 shows the results of SPI-12 time series for different meteorological stations. This processing follows: (1) applying the *χ*^{2} and Kolgomorov–Smirnov (K–S) tests to detect whether the monthly rainfall series follows the gamma distribution; (2) using Matlab code to calculate the original SPI-12 time series (code available at https://www.csdn.net/) and (3) dividing the total data into two sets: about 70% as the training set (550) and about 30% as the testing set (230).

### Singular spectrum decomposition of the training set and testing set

The object of singular spectral decomposition (Bonizzi *et al.* 2014) is the training set and testing set data. The purpose of this is to reduce the influence of noise, extract more predictable information and do data preprocessing for the input of subsequent models. Two parameters need to be set when using the SSA method, namely *W*_{L} and the number of groups (*m*). If the value of the *W*_{L} is small (e.g., *W*_{L}≤20), *m* is meaningless, because the grouping may hide some useful information (Zuo *et al.* 2020). In this work, *W*_{L} is set to be 12 because the period of SPI-12 is 12 months. Therefore, the original SPI-12 time series will be decomposed and ordered into 12 subsequences. After determining the appropriate *W*_{L}, the training set is first decomposed to verify the reliability of *W*_{L}. If *W*_{L} is reasonable, the same *W*_{L} is used to decompose the testing set. This approach can effectively avoid bringing the future information of the testing set into the training process of the model.

For the subsequences after singular spectral decomposition, we would like to name the first subsequence as the trend component, the last subsequence as the noise component and all the intermediate subsequences as the periodic component. From Figures 5–7, the original SPI-12 time series of three stations are all decomposed into 1 trend component, 10 periodic components and 1 noise component. The trend component has the largest amplitude, the lowest frequency and the longest wavelength, which roughly reflects the changing trend of the original SPI-12 time series. The amplitude of the periodic components 1–10 gradually decreases, the frequency gradually increases and the wavelength gradually decreases, reflecting the fluctuation of the original SPI-12 time series. Each subsequence shows the characteristics of the original sequence.

### BP-ANN model development

The BP-ANN model is developed in two steps. First, the forecasting strategy is determined for forecasting 1 month ahead SPI using the past SPI data, namely ‘decomposition-forecast-reconstruction’, and the process is shown in Figure 8.

From Figure 8, this strategy predicts each component separately and finally adds the prediction results of each component to obtain the predicted value of the original SPI-12 sequence. The input variables for the BP-ANN model were identified based on PACF values between the component of the training set at the time step *t* and the corresponding lagged components. The PACF of SPI-12 components in the training set of three stations is shown in Figures 9–11.

By drawing the PACF diagram of each component of SPI-12, find out the lag factor outside the confidence interval (blue line), which is the predictor. For example, for the trend component in Figure 9, the number of lags outside the confidence interval is 1, 2 and 3. Therefore, these three lags are selected as the input variables of the model, and the same number of lags is selected as the input variables of this component in the testing set. Based on the ‘decomposition-forecast-reconstruction’ strategy, the predictors of all components selected by the PACF are shown in Table 3. In Table 3, the SPI-12 time series of three stations are all decomposed into 12 components, and each component has the same length as the original time series. Therefore, each component can also be expressed in the form of a set *X** =*{*X*_{1}, *X*_{2}, …, *X*_{799}, *X*_{780}}. As described in Section 3, *X _{t}* (

*t*is an integer and ) represents the

*t*th value in the set

*X.*Using the trend component of the Suide station in Table 3 as an example, the model structure shown in Figure 2 and the value of

*n*determined in Figure 9 (

*n*= 3) are used. Then the input layer of the BP-ANN model has three input variables, , and , and the output layer is

*X*.

_{t}Stations . | SSA component . | Model input variables . | Model output . |
---|---|---|---|

Suide | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise | |||

Hengshan | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise | |||

Yulin | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise |

Stations . | SSA component . | Model input variables . | Model output . |
---|---|---|---|

Suide | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise | |||

Hengshan | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise | |||

Yulin | Trend | ||

Periodic-1 | |||

Periodic-2 | |||

Periodic-3 | |||

Periodic-4 | |||

Periodic-5 | |||

Periodic-6 | |||

Periodic-7 | |||

Periodic-8 | |||

Periodic-9 | |||

Periodic-10 | |||

Noise |

In the second step, using the training set data to train the BP-ANN model to get the best network structure, and bring in the testing set data to get the final prediction result. The BP-ANN model developed in this research is composed of three layers. At present, the input layer and output layer of each SPI-12 component have been determined. Next, the optimal number of neurons in the hidden layer is determined by using the trial and error method. To this end, the number of neurons in the hidden layer ranges from 1 to 15, and the most commonly used Levenberg–Marquardt (LM) algorithm is used to train the neural network (Hagan & Menhaj 1994).

In the development process of the model, the transfer function of the hidden layer is a tangent sigmoid function (tansig), and the transfer function of the output layer is a linear function (purelin). The model with the minimum performance standards MAE and RMSE is selected as the best network structure. The model parameters are set as follows: training times are 10,000, the learning rate is 0.01, the minimum error of training target is 0.01 and the maximum number of confirmation failures is 6. Therefore, the optimal network structure of all components of the three stations is shown in Table 4.

Stations . | SSA component . | Network architecture . | Training algorithm . | Hidden transfer function . | Output transfer function . |
---|---|---|---|---|---|

Suide | Trend | 3-9-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-9-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-3-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-4-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-4-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-3-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-10-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 6-3-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 6-1-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-10-1 | Trainlm | Tansig | Purelin | |

Noise | 5-5-1 | Trainlm | Tansig | Purelin | |

Hengshan | Trend | 3-8-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-8-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-10-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-1-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-1-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 6-5-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 6-6-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-7-1 | Trainlm | Tansig | Purelin | |

Noise | 6-10-1 | Trainlm | Tansig | Purelin | |

Yulin | Trend | 3-3-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-1-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-3-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-2-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-3-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 5-7-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-5-1 | Trainlm | Tansig | Purelin | |

Noise | 7-2-1 | Trainlm | Tansig | Purelin |

Stations . | SSA component . | Network architecture . | Training algorithm . | Hidden transfer function . | Output transfer function . |
---|---|---|---|---|---|

Suide | Trend | 3-9-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-9-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-3-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-4-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-4-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-3-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-10-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 6-3-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 6-1-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-10-1 | Trainlm | Tansig | Purelin | |

Noise | 5-5-1 | Trainlm | Tansig | Purelin | |

Hengshan | Trend | 3-8-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-8-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-10-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-6-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-1-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-1-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 6-5-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 6-6-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-7-1 | Trainlm | Tansig | Purelin | |

Noise | 6-10-1 | Trainlm | Tansig | Purelin | |

Yulin | Trend | 3-3-1 | Trainlm | Tansig | Purelin |

Periodic-1 | 3-1-1 | Trainlm | Tansig | Purelin | |

Periodic-2 | 3-3-1 | Trainlm | Tansig | Purelin | |

Periodic-3 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-4 | 5-2-1 | Trainlm | Tansig | Purelin | |

Periodic-5 | 5-3-1 | Trainlm | Tansig | Purelin | |

Periodic-6 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-7 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-8 | 5-7-1 | Trainlm | Tansig | Purelin | |

Periodic-9 | 5-9-1 | Trainlm | Tansig | Purelin | |

Periodic-10 | 6-5-1 | Trainlm | Tansig | Purelin | |

Noise | 7-2-1 | Trainlm | Tansig | Purelin |

## RESULTS AND DISCUSSIONS

### Assessment of SSA-BP-ANN model performance

The performance of both BP-ANN and SSA-BP-ANN models is assessed statistically by the evaluation indices as shown above. The corresponding results are present in Table 5 as values and in Figure 12 as a Taylor diagram.

Stations . | Models . | R^{2}
. | NSE . | MAE . | RMSE . |
---|---|---|---|---|---|

Suide | BP-ANN | 0.73 | 0.73 | 0.36 | 0.55 |

SSA-BP-ANN | 0.86 | 0.86 | 0.24 | 0.39 | |

Hengshan | BP-ANN | 0.78 | 0.78 | 0.29 | 0.43 |

SSA-BP-ANN | 0.88 | 0.87 | 0.21 | 0.33 | |

Yulin | BP-ANN | 0.77 | 0.73 | 0.32 | 0.48 |

SSA-BP-ANN | 0.88 | 0.86 | 0.21 | 0.35 |

Stations . | Models . | R^{2}
. | NSE . | MAE . | RMSE . |
---|---|---|---|---|---|

Suide | BP-ANN | 0.73 | 0.73 | 0.36 | 0.55 |

SSA-BP-ANN | 0.86 | 0.86 | 0.24 | 0.39 | |

Hengshan | BP-ANN | 0.78 | 0.78 | 0.29 | 0.43 |

SSA-BP-ANN | 0.88 | 0.87 | 0.21 | 0.33 | |

Yulin | BP-ANN | 0.77 | 0.73 | 0.32 | 0.48 |

SSA-BP-ANN | 0.88 | 0.86 | 0.21 | 0.35 |

From Table 5, the fitting accuracy of the SSA-BP-ANN model is significantly improved than that of the BP-ANN model, and the prediction errors are reduced to varying degrees. At the Suide station, the SSA-BP-ANN model produces *R*^{2} as 0.86, NSE as 0.86, both 18% higher compared to the BP-ANN model, and the MAE and the RMSE are 33 and 39% lower. At the Hengshan station, the SSA-BP-ANN model produces *R*^{2} as 0.88, NSE as 0.87, both 13 and 12% higher compared to the BP-ANN model, and the MAE and the RMSE are 28 and 23% lower. At the Yulin station, the SSA-BP-ANN model produces *R*^{2} as 0.88, NSE as 0.86, both 14 and 18% higher compared to the BP-ANN model, and the MAE and the RMSE are 34 and 27% lower.

Figure 12 displays the correlation coefficient, root mean square difference (RMSD) and standard deviation between the predicted and observed values in the testing set of three stations. According to Figure 12, the general performance of SSA-BP-ANN provides more accurate prediction results than the BP-ANN model. For example, the correlation coefficients of the SSA-BP-ANN model are all higher than the BP-ANN model, which are closer to 0.95. On the other hand, the RMSD values from the SSA-BP-ANN model are all lower than the BP-ANN model, which are closer to 0. These results above indicate the proposed approach to data preprocessing could significantly improve the performance of the model, which agrees with the conclusions from other studies (Khan *et al.* 2018; Mehdizadeh *et al.* 2020; Hinge & Sharma 2021).

Figure 13 shows the observed and predicted time series for the testing set at the three stations. In general, the fitting from the BP-ANN model is significantly worse than the SSA-BP-ANN model, especially when predicting the valley value of the time series. From Figure 13, nearly one-third of the predicted results time series is lower than the valley values of the observed values. On the other hand, the SSA-BP-ANN model is more accurate in predicting peaks and valleys, since the time-shift error is significantly lower than the BP-ANN model. This difference confirms the importance of the data preprocessing method for BP-ANN model training.

### Comparison between data processing methods

The SSA data processing method used in this paper is a new attempt in the prediction of drought by the ANN model. However, currently the most widely used data processing methods are WD and empirical mode decomposition (EMD), and we have also conducted experiments to compare these three methods. Their detailed principles can be obtained from Özger *et al.* (2020). Here, we directly present their performance by the evaluation indices applied in Table 6.

Stations . | Models . | R^{2}
. | NSE . | MAE . | RMSE . |
---|---|---|---|---|---|

Suide | EMD-BP-ANN | 0.87 | 0.88 | 0.13 | 0.37 |

WD-BP-ANN | 0.86 | 0.86 | 0.26 | 0.40 | |

Hengshan | EMD-BP-ANN | 0.87 | 0.88 | 0.23 | 0.33 |

WD -BP-ANN | 0.88 | 0.88 | 0.21 | 0.33 | |

Yulin | EMD-BP-ANN | 0.89 | 0.89 | 0.23 | 0.34 |

WD -BP-ANN | 0.88 | 0.88 | 0.21 | 0.35 |

Stations . | Models . | R^{2}
. | NSE . | MAE . | RMSE . |
---|---|---|---|---|---|

Suide | EMD-BP-ANN | 0.87 | 0.88 | 0.13 | 0.37 |

WD-BP-ANN | 0.86 | 0.86 | 0.26 | 0.40 | |

Hengshan | EMD-BP-ANN | 0.87 | 0.88 | 0.23 | 0.33 |

WD -BP-ANN | 0.88 | 0.88 | 0.21 | 0.33 | |

Yulin | EMD-BP-ANN | 0.89 | 0.89 | 0.23 | 0.34 |

WD -BP-ANN | 0.88 | 0.88 | 0.21 | 0.35 |

From Tables 5 and 6, the fitting accuracy of the BP-ANN model displays significant improvement after data processing, and the *R*^{2} and NSE of all hybrid models are above 0.85. Meanwhile, the three data processing methods show almost identical prediction performance in terms of various statistical indicators. This indicates that the SSA method has great potential in improving model performance.

Meanwhile, the Kruskal–Wallis (KW) test has been done on each forecast time series to evaluate the statistical significance of all models (Başakın *et al.* 2021), and the result is shown in Table 7.

Stations . | Models . | P-value
. | *H_{0}
. |
---|---|---|---|

Suide | BP-ANN | 0.7252 | Reject |

SSA-BP-ANN | 0.9134 | Reject | |

EMD-BP-ANN | 0.9034 | Reject | |

WD-BP-ANN | 0.9262 | Reject | |

Hengshan | BP-ANN | 0.4597 | Accept |

SSA-BP-ANN | 0.8978 | Reject | |

EMD-BP-ANN | 0.7247 | Reject | |

WD-BP-ANN | 0.9017 | Reject | |

Yulin | BP-ANN | 0.3021 | Accept |

SSA-BP-ANN | 0.8202 | Reject | |

EMD-BP-ANN | 0.8862 | Reject | |

WD-BP-ANN | 0.8109 | Reject |

Stations . | Models . | P-value
. | *H_{0}
. |
---|---|---|---|

Suide | BP-ANN | 0.7252 | Reject |

SSA-BP-ANN | 0.9134 | Reject | |

EMD-BP-ANN | 0.9034 | Reject | |

WD-BP-ANN | 0.9262 | Reject | |

Hengshan | BP-ANN | 0.4597 | Accept |

SSA-BP-ANN | 0.8978 | Reject | |

EMD-BP-ANN | 0.7247 | Reject | |

WD-BP-ANN | 0.9017 | Reject | |

Yulin | BP-ANN | 0.3021 | Accept |

SSA-BP-ANN | 0.8202 | Reject | |

EMD-BP-ANN | 0.8862 | Reject | |

WD-BP-ANN | 0.8109 | Reject |

*H_{0} represents a hypothesis based on the significant difference between mean predicted and observed values.

From Table 7, the H_{0} hypothesis was rejected for all three hybrid models in the KW test, indicating that there is no significant difference between the means of predicted and observed values of the model and that the model accuracy was high. On the other hand, there is a significant difference between the means of predicted and observed values of the BP-ANN model at the Hengshan and Yulin stations, and the model accuracy is low. Therefore, the three SSA, EMD and WD methods have improved the reliability and accuracy of model prediction results to a certain extent.

To compare these methods more intuitively, we also produced the Taylor diagram (Figure 14) and a box plot (Figure 15) to present the evaluation results.

From Figure 14, three hybrid models provide high prediction accuracy, and the differences between the different data processing methods are neglectable. The correlation coefficients of the three hybrid models are all higher than the BP-ANN model and close to 0.95.

From Figure 15, the three hybrid models are overall closer to 0 in terms of the absolute value of prediction error (AVE) than the BP-ANN model, and the variation range is slightly lower than the BP-ANN model. It again confirms the notion that the data processing method can improve the prediction performance of the model. Among them, the average value of the AVE of the SSA method is the smallest over three stations, indicating that the SSA method is slightly better than the EMD and WD methods in SPI prediction.

## CONCLUSION

In this study, a prediction model coupled with SSA and BP-ANN was established and applied to the prediction of SPI-12 1 month ahead at three stations in northern Shaanxi, China. The following conclusions can be drawn from the study:

The result validates that by preprocessing the singular spectrum decomposition of the input data of the BP-ANN model, the prediction performance of the model is greatly improved.

The SSA method can effectively extract the valuable information hidden in the SPI-12 time series, gather and reconstruct the most predictable components into several meaningful subsequences, thereby reducing the influence of noise.

The SSA-BP-ANN model is a promising new approach to predict the SPI in arid regions, such as northern Shaanxi, China.

The current research focused on the singular spectrum decomposition preprocessing of the data, which can significantly improve the performance of the BP-ANN prediction model. In future research, other data preprocessing methods will be used to screen out the best hybrid model.

## ACKNOWLEDGEMENTS

This study was funded by the Natural Science Basic Research Program of Shaanxi Province (Grant Nos 2019JLZ-16 and 2019JLZ-15) and the Science and Technology Program of Shaanxi Province (Grant Nos 2019slkj-13, 2020slkj-16 and 2018slkj-4). The authors thank the editor for their comments and suggestions.

## CONFLICT OF INTEREST

The authors have no relevant financial or non-financial interests to disclose.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## REFERENCES

*The Climate Data Guide: Standardized Precipitation Index (SPI)*. National Center for Atmospheric Research Staff (Eds). Available from: https://climatedataguide.ucar.edu/climate-data/standardized-pre-cipitation-index-spi (last modified 7 August 2018).