Reliable drought prediction plays a significant role in drought management. Machine learning models have become popular for drought prediction in recent years, but stand-alone models do not capture enough feature information, even though their general performance is acceptable. Scholars have therefore used signal decomposition algorithms as a data pre-processing tool and coupled them with stand-alone models to build ‘decomposition-prediction’ models with improved performance. Considering the limitations of using a single decomposition algorithm, this study proposes an ‘integration-prediction’ model construction method that deeply combines the results of multiple decomposition algorithms. The model is tested at three meteorological stations in Guanzhong, Shaanxi Province, China, where short-term meteorological drought is predicted using data from 1960 to 2019. The Standardized Precipitation Index on a 12-month time scale (SPI-12) is selected as the meteorological drought index. Compared with stand-alone models and ‘decomposition-prediction’ models, the ‘integration-prediction’ models show higher prediction accuracy, smaller prediction error and better stability. The new ‘integration-prediction’ model is of practical value for drought risk management in arid regions.

  • Machine learning models have great value in short-term meteorological drought prediction.

  • Signal decomposition algorithms used as a data pre-processing tool can significantly improve the prediction performance of machine learning models.

  • Deeply combining the results of multiple decomposition algorithms could achieve higher prediction accuracy.

  • The ‘integration-prediction’ model provides a new way for drought prediction in arid regions.

Climate change leads to a high incidence of natural disasters. In 2022 in particular, the world experienced extremely high temperatures of a kind rarely seen in decades. Drought disasters affected many countries and regions to varying degrees, including China, Europe and North America. Studies indicate that the severity and duration of drought are expected to increase as human activities and climate change intensify (Li et al. 2021). In essence, a drought disaster arises when reduced precipitation lowers crop yields and river levels, ultimately causing social and economic losses. After years of exploration, scholars have classified drought according to the objects it affects, and the four types recognized in the academic community are meteorological drought, agricultural drought, hydrological drought and socio-economic drought (Mishra & Singh 2010; Zhang & Jia 2013). Among them, meteorological drought is always the first to occur, so strengthening meteorological drought prediction is not only the best way to eliminate or reduce the negative effects of drought in advance, but also a core link in building efficient water security management.

Drought is a long-lasting process. Although it is difficult to define exactly when a drought begins and ends, researchers evaluate drought through meteorological and hydrological parameters such as precipitation, soil moisture content, temperature and runoff, and have developed many drought indices based on these parameters to quantify drought severity (Dai et al. 2020; Song et al. 2020). Drought indices make it possible to analyze the temporal and spatial changes of historical drought characteristics. In addition, the main task of meteorological drought prediction is to use mathematical models to predict the time series of drought indices over some future period, thus providing an important basis for decision makers to judge drought trends (Wang et al. 2020). There are many widely used meteorological drought indices, for instance, the ‘Rainfall Anomaly Index’ (RAI) (Van-Rooy 1965), ‘Standardized Precipitation Index’ (SPI) (McKee et al. 1993), ‘Comprehensive Index of Meteorological Drought’ (CI) (He et al. 2014), ‘Palmer Drought Severity Index’ (PDSI) (Palmer 1965) and ‘Standardized Precipitation Evapotranspiration Index’ (SPEI) (Vicente-Serrano et al. 2010). Yu et al. (2013) collected meteorological data from 16 meteorological stations in Yunnan Province, China from 1956 to 2010, and adopted CI to analyze the frequency, scope and severity of meteorological drought in the province. Fang et al. (2018) studied the occurrence and evolution of meteorological drought in Ningxia, China during 1960–2016, and analyzed the interannual characteristics of drought using SPEI at different time scales. Mehta & Yadav (2021) calculated the RAI and SPI from precipitation data for 1901 to 2002 in the Barmer District of Rajasthan State to assess the local meteorological drought characteristics.
The evolution of drought characteristics in a region over recent decades can be evaluated from historical time series of meteorological drought indices. However, this alone is not enough to meet the unknown challenges that strong climate change poses for the formulation of drought relief policies. In this sense, the importance of meteorological drought prediction is growing.

With the rapid development of artificial intelligence technology, drought prediction models have gradually shifted from physical models to data-driven machine learning (ML) models. Compared with physical models, ML models offer faster computation, lower resource consumption and reliable accuracy (Abbot & Marohasy 2014). In the existing literature, a large number of meteorological drought prediction models based on ML technology have been developed and have achieved good results. These models can be roughly divided into two categories: stand-alone models and hybrid models. A stand-alone model feeds the drought index time series directly into an ML model for training. For example, Deo et al. (2018) proposed a Support Vector Regression (SVR) model to predict SPEI at nine stations in Australia and demonstrated that SVR is very effective in predicting drought characteristics. Achour et al. (2020) explored the potential of an artificial neural network (ANN) model and SPI to predict drought in the plains of northwestern Algeria, with satisfactory results. Almikaeel et al. (2022) used the stand-alone models ANN and SVR to predict the hydrological drought index of the Gidra River and achieved results that exceeded expectations. These examples indicate that stand-alone models are feasible and have great potential in drought prediction. Scholars then proposed various improvements to stand-alone models, from which a second type, the hybrid model with better performance, evolved.

There are many ways to compose hybrid models, but they can be divided into two main groups according to the improvement idea. The first group improves the model itself, either by combining an optimization algorithm with a stand-alone model or by directly combining multiple models. For example, Malik et al. (2021) showed that SVR models based on the Particle Swarm Optimization (PSO) and Harris Hawks Optimization (HHO) algorithms outperformed the SVR model in predicting the Effective Drought Index (EDI) in Uttarakhand, India. Danandeh Mehr et al. (2022) combined the Long Short-Term Memory (LSTM) network with a Convolutional Neural Network (CNN) to form a hybrid model and predicted SPEI-3 and SPEI-6 in Ankara province, Turkey; the experimental results show that the CNN-LSTM model performs better than the single benchmark models. This kind of hybrid model successfully improves the prediction accuracy, but the size of the improvement is sometimes not ideal. The reason is that meteorological drought index time series are nonlinear, non-stationary and multi-scale. When stand-alone models make autoregressive predictions, the feature information that can be captured from the input time series is limited and not prominent, which reduces model performance to a certain extent (Djerbouai & Souag-Gamane 2016). To solve this problem, some scholars introduced signal decomposition algorithms into stand-alone models to build the second group, the ‘decomposition-prediction’ hybrid models (Belayneh et al. 2014). Common signal decomposition algorithms include Wavelet Decomposition (WD), Empirical Mode Decomposition (EMD), Ensemble Empirical Mode Decomposition (EEMD) and Variational Mode Decomposition (VMD) (Zuo et al. 2020). During modeling, decomposing the input time series extracts the more predictable parts of the sequence and feeds them into the model, reducing the adverse impact of noise (Adarsh & Janga Reddy 2019). For instance, Khan et al. (2020) developed a hybrid model combining WD, Autoregressive Integrated Moving Average (ARIMA) and ANN to predict future SPI in Malaysia's Langat River Basin, and the results show that the hybrid prediction model (WD-ARIMA-ANN) performs better than the stand-alone prediction model. Roushangar et al. (2021) used WD and EEMD to decompose the SPI time series fed into the ML model, which improved model performance by 40% when predicting drought in northwest Iran. Citakoglu & Coşkun (2022) introduced VMD, WD and EMD as data pre-processing tools to build a hybrid ML model, and predicted the short-term meteorological drought in northwest Turkey very accurately. In general, hybrid models clearly outperform stand-alone models, and hybrid models that use data pre-processing tools are among the best performing and most popular at present. However, different signal decomposition algorithms have their own advantages and disadvantages when extracting time series features. How to integrate the advantages of different algorithms to reduce the limitations of using a single algorithm is a scientific problem worth studying.

Therefore, considering the diversity of signal decomposition algorithms and their importance to short-term meteorological drought prediction models, this study proposes a new ‘integration-prediction’ model construction method that deeply combines the results of multiple decomposition algorithms. Taking the Guanzhong region of Shaanxi Province, China as the study area, we adopt the Gated Recurrent Unit (GRU) and the Light Gradient Boosting Machine (LGBM) as stand-alone prediction models, and combine them with different decomposition algorithms (EMD, EEMD and VMD) to construct both the ‘decomposition-prediction’ models commonly described in the literature and the ‘integration-prediction’ models proposed in this paper. The results of all models are compared and discussed extensively using several evaluation indicators. This study supplements current work on short-term meteorological drought prediction: it explores for the first time whether the ‘integration-prediction’ model can reduce the limitations of a single decomposition in drought prediction and further improve model performance.

The Guanzhong region (33°30′–35°40′N, 106°30′–110°30′E, average elevation 500 m) lies in the central part of Shaanxi Province, China, with an area of about 55,600 km². The terrain is high in the southwest and low in the northeast, and the central part is a flat, wide plain. The annual average temperature is 12.9 °C, and the annual average precipitation is 580 mm. Figure 1 shows a general view of the study area. The Guanzhong region is an important agricultural production area in China but often faces drought, so drought investigation and research there are essential. Such research helps to understand the climate and water resources of the region, enhance drought event management and provide a scientific basis for formulating drought measures.
Figure 1

Location of the study area and meteorological stations.

In this study, monthly precipitation data from the Pucheng, Fengxiang and Wugong meteorological stations from 1960 to 2019 are used. The data come from the National Meteorological Information Center (NMIC, available at http://data.cma.cn/). Their homogeneity, stationarity and consistency are strictly examined by the authorities before release, so the quality is reliable. The change in annual precipitation at the Pucheng, Fengxiang and Wugong meteorological stations from 1960 to 2019 is shown in Figure 2.
Figure 2

Annual precipitation change from 1960 to 2019 (a) Pucheng station, (b) Fengxiang station and (c) Wugong station.

As Figures 1 and 2 show, the three meteorological stations are evenly distributed across the Guanzhong region. Although they all belong to the warm temperate continental monsoon climate, annual average precipitation differs markedly among the stations: 511.3 mm at Pucheng, 636.8 mm at Fengxiang and 590.9 mm at Wugong. The annual average temperature also differs slightly: 13.8 °C at Pucheng, 11.8 °C at Fengxiang and 13.3 °C at Wugong.

Standardized Precipitation Index (SPI)

SPI quantifies precipitation shortage over multiple periods and is suitable for assessment on time scales of one month or longer. Its calculation requires only a long series of continuous monthly precipitation data (no less than 30 years), and the time scale can be set flexibly (1, 3, 6, 9, 12 and 24 months). It therefore indicates drought well in different regions and time periods, and is recognized and widely applied by drought and hydrology researchers. SPI uses the gamma probability distribution to represent the change of precipitation within a period. After transformation to the standard normal distribution, the cumulative frequency distribution of standardized precipitation is used to characterize dry and wet conditions. The probability density of precipitation x under the gamma distribution is shown in Equation (1):
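$$g(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta}, \quad x > 0 \tag{1}$$
where $\alpha$ and $\beta$ represent the shape and scale parameters of the gamma function, and $x$ represents the accumulated precipitation. The SPI calculation formula after standardized normal processing is shown in Equation (2):
$$\mathrm{SPI} = k\left(t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}\right), \quad t = \sqrt{\ln\frac{1}{F^2}} \tag{2}$$
where $k$ is the sign coefficient and $F$ is the cumulative probability of $x$ under the gamma distribution: when $F > 0.5$, $F$ is replaced by $1 - F$ and $k = 1$; when $F \le 0.5$, $k = -1$. In addition, $c_0 = 2.515517$, $c_1 = 0.802853$, $c_2 = 0.010328$, $d_1 = 1.432788$, $d_2 = 0.189269$ and $d_3 = 0.001308$. Detailed SPI calculation principles and drought classification tables are available in Kisi et al. (2019).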
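The standardization step of Equation (2) can be illustrated with a minimal Python sketch. It assumes the cumulative probability of the accumulated precipitation has already been obtained from a fitted gamma distribution (the fitting itself is not shown), and it uses the standard Abramowitz & Stegun rational approximation of the inverse normal distribution. The function names `spi_from_probability` and `rolling_sum` are illustrative and are not taken from the study's code.

```python
import math
from statistics import NormalDist

# Constants of the rational approximation to the inverse standard normal CDF
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def spi_from_probability(prob):
    """Map a cumulative precipitation probability (0 < prob < 1) to an SPI value."""
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    # Work with the tail probability and keep track of the sign coefficient k.
    if prob > 0.5:
        tail, k = 1.0 - prob, 1.0
    else:
        tail, k = prob, -1.0
    t = math.sqrt(math.log(1.0 / tail ** 2))
    numerator = C0 + C1 * t + C2 * t ** 2
    denominator = 1.0 + D1 * t + D2 * t ** 2 + D3 * t ** 3
    return k * (t - numerator / denominator)


def rolling_sum(values, window=12):
    """Accumulate monthly precipitation over a moving window (12 months for SPI-12)."""
    return [sum(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]


# Compare the approximation with the exact normal quantile for a dry month.
print(spi_from_probability(0.1), NormalDist().inv_cdf(0.1))
```

A probability below 0.5 yields a negative SPI (drier than the median), and a probability above 0.5 yields a positive one.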

Signal decomposition algorithms

Empirical Mode Decomposition (EMD)

EMD is an adaptive data processing algorithm proposed by Huang et al. (1998), which decomposes the original time series into several near-periodic Intrinsic Mode Functions (IMFs) and a trend item. An IMF is an oscillating function with varying amplitude and frequency that must meet two conditions: (1) the number of extreme points must equal the number of zero crossings, or differ from it by at most 1; (2) the mean of the upper and lower envelopes, formed by the local maxima and minima of the IMF, is 0 at any time. EMD is especially suitable for processing noisy nonlinear and non-stationary signals. Through EMD, the original time series can be expressed as the sum of several IMF components and one residual component, as shown in Equation (3):
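$$x(t) = \sum_{i=1}^{n} \mathrm{IMF}_i(t) + r_n(t) \tag{3}$$
where $x(t)$ is the original time series, $n$ is the number of IMF components, and the residual $r_n(t)$ is a constant or a monotone function.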
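The first IMF condition can be checked with a few lines of Python. The sketch below is only illustrative: it counts strict local extrema and sign changes of a sampled series and does not implement the sifting process or the envelope condition (2), which requires spline interpolation. All function names are hypothetical.

```python
import math


def count_zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def count_extrema(x):
    """Number of strict local maxima and minima in the interior of the series."""
    return sum(
        1
        for prev, cur, nxt in zip(x, x[1:], x[2:])
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt)
    )


def satisfies_imf_count_condition(x):
    """Condition (1): extrema and zero crossings differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# A zero-mean oscillation passes the check; the same oscillation shifted upwards does not.
oscillation = [math.sin(2 * math.pi * 3 * i / 200 + 0.5) for i in range(200)]
shifted = [value + 2.0 for value in oscillation]
print(satisfies_imf_count_condition(oscillation), satisfies_imf_count_condition(shifted))
```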

Ensemble Empirical Mode Decomposition (EEMD)

As research deepened, scholars found that the EMD algorithm is prone to mode mixing, which leads to multiple IMF components containing duplicate information. To address this shortcoming, Wu & Huang (2009) proposed the EEMD algorithm, which improves EMD by repeatedly adding auxiliary white noise to the original sequence. The specific steps are:

  • (1) A new time series $x_j(t) = x(t) + \varepsilon\,w_j(t)$ is generated by adding white noise $w_j(t)$ with an amplitude of $\varepsilon$ to the original time series $x(t)$.

  • (2) The EMD method is used to decompose $x_j(t)$, giving a new set of components $c_{i,j}(t)$ and a residual $r_j(t)$.

  • (3) The above two steps are repeated M times to get M groups of components and residuals. The EEMD decomposition results are obtained by taking the arithmetic average of the M groups of components and residuals respectively, as shown in Equation (4):

$$c_i(t) = \frac{1}{M}\sum_{j=1}^{M} c_{i,j}(t), \qquad r(t) = \frac{1}{M}\sum_{j=1}^{M} r_j(t) \tag{4}$$
where $c_{i,j}(t)$ and $r_j(t)$ represent the components and residual obtained by EMD decomposition after white noise is added for the jth time.

EEMD is a recursive algorithm with two parameters to be determined: the white noise amplitude $\varepsilon$ and the total number of times white noise is added, M. The amplitude $\varepsilon$ should not be too small; otherwise, it may not produce the change in extreme values that EMD requires. Additionally, increasing M reduces the effect of the added white noise to a negligible level. This paper sets M and $\varepsilon$ to 100 and 0.2, respectively.
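The ensemble-averaging idea of EEMD can be sketched in Python as follows. Two simplifications are assumptions of this sketch and not part of the study: `toy_decompose` is a hypothetical stand-in for EMD (a moving-average split into a detail part and a trend) so that the example stays self-contained, and the noise amplitude is taken relative to the standard deviation of the data. A real EMD may also return a different number of IMFs per trial, which this sketch does not handle.

```python
import math
import random
import statistics


def toy_decompose(x, window=5):
    """Hypothetical stand-in for EMD: split x into a detail part and a moving-average trend."""
    half = window // 2
    trend = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        trend.append(sum(x[lo:hi]) / (hi - lo))
    detail = [xi - ti for xi, ti in zip(x, trend)]
    return [detail, trend]


def ensemble_decompose(x, decompose, trials=100, noise_scale=0.2, seed=0):
    """Average the components of `trials` noise-perturbed decompositions."""
    rng = random.Random(seed)
    sigma = noise_scale * statistics.pstdev(x)  # assumed: amplitude relative to data spread
    totals = None
    for _ in range(trials):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        parts = decompose(noisy)
        if totals is None:
            totals = [[0.0] * len(x) for _ in parts]
        for total, part in zip(totals, parts):
            for i, value in enumerate(part):
                total[i] += value
    return [[value / trials for value in total] for total in totals]


# Example with the settings reported in the text (M = 100, amplitude 0.2).
signal = [math.sin(2 * math.pi * i / 24) for i in range(120)]
detail, trend = ensemble_decompose(signal, toy_decompose, trials=100, noise_scale=0.2)
```

Because the added noise averages out over the trials, the sum of the averaged components stays close to the original series.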

Variational Mode Decomposition (VMD)

VMD was first proposed by Dragomiretskiy & Zosso (2014). This decomposition algorithm has been widely used in signal processing because it is self-adaptive and completely non-recursive. With the number of decomposition modes fixed in advance, VMD adaptively matches the optimal center frequency and limited bandwidth of each mode during decomposition to achieve effective separation of the IMFs. The detailed steps of VMD decomposition are:
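  • (1) A variational problem is constructed. The constraint is that the sum of the components obtained by decomposition equals the original signal. The constrained expression is shown in Equation (5):

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t) \tag{5}$$
    where $f$ is the original signal, $K$ is the number of decomposition modes determined in advance, $u_k$ and $\omega_k$ are the kth mode component after decomposition and its corresponding central frequency, $t$ is the time variable, $j$ is the imaginary unit, $*$ is the convolution operation, $\delta(t)$ is the Dirac function, $e^{-j\omega_k t}$ is the exponential signal and $\partial_t$ is the partial derivative operation.

  • (2) The variational problem is solved. The Lagrange multiplier $\lambda$ and the quadratic penalty factor $\alpha$ are introduced, and the resulting augmented Lagrangian expression is shown in Equation (6):

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle \tag{6}$$
    where $\langle \cdot, \cdot \rangle$ is the inner product operation, and $\|\cdot\|_2^2$ is the square of the L2 norm.

  • (3) Each mode component and center frequency is found. The saddle points of the augmented Lagrangian function are searched, and the optimal results of $u_k$, $\omega_k$ and $\lambda$ are found alternately by iteration.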

Compared with EMD and EEMD, VMD has stronger support from mathematical theory, and can decompose complex non-stationary time series into several more stable subsequences while reducing mode mixing.
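For readers who want to see what the alternating search in step (3) looks like, the frequency-domain updates commonly given for VMD (following Dragomiretskiy & Zosso 2014) are reproduced below as a sketch. The hat denotes the Fourier transform, $n$ is the iteration index and $\tau$ is the dual ascent step size; $\tau$ and $n$ are symbols introduced here for illustration and are not defined elsewhere in this paper. In the first update, the sum over the other modes uses their most recent values.

```latex
\hat{u}_k^{\,n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}
       {1 + 2\alpha\,(\omega - \omega_k^{\,n})^2}

\omega_k^{\,n+1} =
  \frac{\int_0^{\infty} \omega\,\bigl|\hat{u}_k^{\,n+1}(\omega)\bigr|^2\,d\omega}
       {\int_0^{\infty} \bigl|\hat{u}_k^{\,n+1}(\omega)\bigr|^2\,d\omega}

\hat{\lambda}^{\,n+1}(\omega) =
  \hat{\lambda}^{\,n}(\omega) + \tau\Bigl(\hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{\,n+1}(\omega)\Bigr)
```

Each mode is thus a Wiener-filtered version of the residual signal around its current center frequency, and each center frequency is the power-weighted mean frequency of its mode.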

Stand-alone machine learning model

Gated Recurrent Unit (GRU)

GRU is a kind of Recurrent Neural Network (RNN) (Mei et al. 2022). Like LSTM, it was proposed to solve problems such as the failure of long-term memory in RNNs and vanishing gradients in backpropagation. The internal structure of the LSTM model is complex and its training is time-consuming. As a simplified variant of LSTM, GRU improves training efficiency by reducing the structural parameters of the gated neural network. The structure of a GRU neuron is shown in Figure 3. There are only two gates, the reset gate $r_t$ and the update gate $z_t$, yet high accuracy is still ensured. The mathematical operation of GRU is shown in Equation (7):
$$\begin{aligned} r_t &= \sigma\left(W_r\,[h_{t-1}, x_t]\right) \\ z_t &= \sigma\left(W_z\,[h_{t-1}, x_t]\right) \\ \tilde{h}_t &= \tanh\left(W\,[r_t \odot h_{t-1}, x_t]\right) \\ h_t &= (I - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t \end{aligned} \tag{7}$$
where $x_t$ is the input variable at time t, $h_{t-1}$ is the output of the previous hidden layer, $h_t$ is the output of the hidden layer of this unit, $\tilde{h}_t$ is the candidate state of the hidden layer, $[\cdot,\cdot]$ is the concatenation of vectors, $W_r$, $W_z$ and $W$ are trainable parameter matrices, $I$ is the identity matrix, $\odot$ is the Hadamard product, $\sigma$ is the sigmoid gate activation function and $\tanh$ is the activation function used when candidate memories are generated.
Figure 3

GRU neuron structure.
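A single GRU update can be written out in plain Python to make the gate arithmetic concrete. This is a minimal sketch under stated assumptions: bias terms are omitted, each weight matrix acts on the concatenation of the previous hidden state and the input, and the update gate weights the candidate state while its complement weights the previous state (some references use the opposite convention). The names `gru_step`, `matvec` and `W_h` are illustrative.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU update; each weight matrix acts on the concatenation [h_prev, x_t]."""
    concat = h_prev + x_t
    r_t = [sigmoid(v) for v in matvec(W_r, concat)]      # reset gate
    z_t = [sigmoid(v) for v in matvec(W_z, concat)]      # update gate
    gated = [r * h for r, h in zip(r_t, h_prev)] + x_t   # [r_t * h_prev, x_t]
    h_cand = [math.tanh(v) for v in matvec(W_h, gated)]  # candidate state
    # New state: (1 - z_t) * h_prev + z_t * h_cand, element-wise
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, h_cand)]
```

With all weights set to zero, both gates equal 0.5 and the candidate state is zero, so the new hidden state is simply half of the previous one; this gives a quick sanity check of the implementation.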

Light Gradient Boosting Machine (LGBM)

LGBM is an ensemble learning model based on the Gradient Boosting Decision Tree (GBDT) algorithm, proposed by Microsoft in 2017 (Ke et al. 2017). It addresses shortcomings of the GBDT algorithm, such as long training time and large memory consumption, while ensuring high accuracy. The main improvements of LGBM fall into four aspects: (1) Gradient-based One-Side Sampling (GOSS), which keeps the distribution of the sample data largely unchanged while giving more attention to data with large gradients when tree nodes are split; (2) Exclusive Feature Bundling (EFB), which bundles mutually exclusive features together to reduce the number of features and improve training speed; (3) a histogram-based decision tree algorithm, which speeds up tree growth during iteration without traversing all data multiple times, thus saving time and memory; (4) a leaf-wise growth strategy with a depth limit, as shown in Figure 4. Leaf-wise growth achieves higher accuracy, and the depth limit avoids the over-fitting caused by growing the decision tree too deep.
Figure 4

Leaf-wise growth strategy.
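The GOSS idea mentioned above can be illustrated with a short Python sketch. It follows the general description in Ke et al. (2017): samples with the largest absolute gradients are always kept, a random subset of the remaining samples is drawn, and the drawn samples are up-weighted. The function name and the default rates are illustrative assumptions, not the library's internal implementation or the settings of this study.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by Gradient-based One-Side Sampling."""
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top_k = int(top_rate * n)
    rand_k = int(other_rate * n)
    large = order[:top_k]  # always keep the large-gradient samples
    small = random.Random(seed).sample(order[top_k:], rand_k)  # subsample the rest
    # Up-weight the sampled small-gradient rows to compensate for the rows left out.
    amplify = (1.0 - top_rate) / other_rate
    indices = large + small
    weights = [1.0] * len(large) + [amplify] * len(small)
    return indices, weights
```

For 100 samples with the default rates, 20 large-gradient samples are kept with weight 1 and 10 of the remaining 80 are drawn with weight 8.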

Model Evaluation Index

Model performance is evaluated from two perspectives: prediction accuracy and prediction error. The root mean square error (RMSE), mean square error (MSE) and Nash–Sutcliffe efficiency (NSE) coefficient are used to test the overall prediction performance of the model. These three evaluation indices are widely used in drought prediction research, and their calculation formulas are shown in Equations (8)–(11).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2} \tag{8}$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2 \tag{9}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \tag{10}$$
(11)
where N is the total number of samples observed, $\hat{y}_i$ and $y_i$ are the predicted and observed values, and $\bar{y}$ is the mean of the observed values. NSE quantifies the prediction accuracy of the model; its value range is (−∞, 1], and the closer the value is to 1, the better the prediction performance. MSE and RMSE evaluate the average error of the predicted values; their value range is [0, +∞), and the closer the value is to 0, the better the prediction performance.
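The three indices are straightforward to compute. The following Python sketch uses their standard definitions; the function names are illustrative.

```python
import math


def mse(observed, predicted):
    """Mean square error between observed and predicted values."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(mse(observed, predicted))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance
```

A perfect prediction gives NSE = 1 and MSE = 0, while a model that always predicts the observed mean gives NSE = 0, which is why NSE is often read as skill relative to the mean.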

Prediction method framework

At the initial stage of data collection and pre-processing, the choice of SPI time scale has a decisive impact on the accuracy of drought prediction models. Many studies have indicated that the predictive ability of ML models differs significantly for SPI at different time scales, and that predictive performance improves greatly as the SPI time scale increases (Achite et al. 2022). This is because SPI values over long time scales are smoother and richer in feature information. Therefore, to ensure high prediction accuracy, the long time scale SPI-12 is selected as the short-term meteorological drought prediction index in this study. The framework of the prediction method is shown in Figure 5.
Figure 5

The framework of the prediction method.

As shown in Figure 5, the experimental setup of the meteorological drought prediction models is divided into three parts: (1) Stand-alone ML prediction models are established to illustrate the challenges of predicting the SPI-12 time series, that is, the original SPI-12 time series is fed directly into the ML models to complete the prediction. The original SPI-12 time series is calculated with Matlab code (available at https://www.csdn.net/); (2) EMD, EEMD and VMD are employed to decompose and denoise the complex original SPI-12 time series, extract hidden features from it, and build the ‘decomposition-prediction’ models (EMD-ML, EEMD-ML and VMD-ML) to show how data pre-processing improves performance. Taking Fengxiang station as an example, its original SPI-12 time series and decomposition results are shown in Figure 6. For each of the three decomposition algorithms, the original SPI-12 time series is divided into eight subsequences (IMF1–IMF8) from high frequency to low frequency. Each subsequence contains feature information of the original sequence, so the ML models can learn the periodic and regular features of the SPI-12 time series more accurately; (3) A single decomposition algorithm has limitations in extracting features from the original SPI-12 time series. Thus, in the third experiment, the 24 subsequences generated by the three decomposition algorithms are integrated into one set as the model input, and ‘integration-prediction’ models (INT-ML) are constructed to test whether this new method can further improve the prediction accuracy.
Figure 6

SPI-12 original time series and decomposition results of Fengxiang station.

A total of two stand-alone models and eight hybrid models are developed in this research; see Table 1 for details. The sample sets of all models are divided into a training set, a validation set and a testing set with a ratio of about 60%:20%:20%. The hybrid models' prediction strategy is to predict the subsequences separately and sum the predictions to get the final result.
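The data split and the sum-of-subsequences strategy can be sketched in Python as follows. The sketch assumes the split is made in time order, which the text does not state explicitly, and the helper names are hypothetical.

```python
def chronological_split(samples, train_ratio=0.6, val_ratio=0.2):
    """Split samples in time order into training, validation and testing sets."""
    n = len(samples)
    train_end = int(n * train_ratio)
    val_end = train_end + int(n * val_ratio)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


def sum_subsequence_predictions(sub_predictions):
    """Add per-subsequence forecasts element-wise to rebuild the SPI-12 forecast."""
    return [sum(values) for values in zip(*sub_predictions)]
```

For a decomposition-prediction model, `sub_predictions` would hold one list of forecasts per subsequence (for example eight lists for IMF1–IMF8), all covering the same time steps.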

Table 1

The experiment setup of meteorological drought prediction models

| Experiment setup | Model type | Model | Quantity |
|---|---|---|---|
| 1 | Stand-alone | GRU | 2 |
| | Stand-alone | LGBM | |
| 2 | Hybrid (decomposition) | EMD-GRU | 6 |
| | Hybrid (decomposition) | EEMD-GRU | |
| | Hybrid (decomposition) | VMD-GRU | |
| | Hybrid (decomposition) | EMD-LGBM | |
| | Hybrid (decomposition) | EEMD-LGBM | |
| | Hybrid (decomposition) | VMD-LGBM | |
| 3 | Hybrid (integration) | INT-GRU | 2 |
| | Hybrid (integration) | INT-LGBM | |

Model hyper-parameters optimization

In the process of ML, the internal configuration variables summarized by the model through training data are called model parameters. Beyond that, external variables of the model that are artificially set before ML begins are called model hyper-parameters. The model hyper-parameters are not obtained by system learning. It is necessary to specify a reasonable value range for it according to existing experience, and iteratively search for optimal value to achieve the best performance of the model. This process is called hyper-parameters optimization (Tran et al. 2020).

In this study, the hyper-parameters of the GRU and LGBM models are optimized with the Bayesian Optimization (BO) method under the Optuna framework (Akiba et al. 2019). BO is a sequential model-based optimization method, usually applied to black-box objective functions whose true distribution is unknown or which are very difficult to solve. Both the objective function and the loss function are MSE, and the total number of optimization iterations is 50. The hyper-parameters to be optimized for the GRU and LGBM models and their search ranges are shown in Table 2.

Table 2

The hyper-parameters to be optimized for the models

| Model | Hyper-parameter | Value range |
|---|---|---|
| GRU | number of hidden layer nodes | [16: 32: 256]ᵃ |
| GRU | number of hidden layers | [1: 1: 2] |
| GRU | dropout | [0: 0.02: 0.2] |
| LGBM | max_depth | [1: 1: 9] |
| LGBM | gamma | [1×10⁻⁸, 1]ᵇ |
| LGBM | learning_rate | [1×10⁻⁸, 1] |
| LGBM | alpha | [1×10⁻⁹, 1×10⁻³] |
| LGBM | lambda | [1×10⁻⁵, 1] |
| LGBM | booster | [‘gbtree’, ‘gblinear’, ‘dart’] |

ᵃ The value range of [16: 32: 256] means that the lower limit of parameter is 16, the search step is 32 and the upper limit of parameter is 256, which is the same for the first three lines.

ᵇ The value range of [1×10⁻⁸, 1] shows the lower and the upper limit of parameters, since the sampling method is logarithmic sampling.
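The two range notations in Table 2 can be made concrete with a small Python sketch. It uses plain random sampling as a stand-in and is not the Optuna sampler used in the study; it only shows how a stepped range and a logarithmically sampled range translate into candidate values. All function names and dictionary keys are hypothetical.

```python
import math
import random


def stepped_values(low, step, high):
    """All grid values low, low + step, ... that do not exceed high."""
    count = int(math.floor((high - low) / step + 1e-9)) + 1
    return [round(low + k * step, 10) for k in range(count)]


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_gru_config(rng):
    """One random draw from the GRU search space of Table 2."""
    return {
        "hidden_nodes": rng.choice(stepped_values(16, 32, 256)),
        "hidden_layers": rng.choice(stepped_values(1, 1, 2)),
        "dropout": rng.choice(stepped_values(0.0, 0.02, 0.2)),
    }


rng = random.Random(0)
print(sample_gru_config(rng), sample_log_uniform(rng, 1e-8, 1.0))
```

Note that with a lower limit of 16 and a step of 32, the largest reachable grid value under this reading is 240, since the next step would exceed the upper limit of 256. Logarithmic sampling spreads the draws evenly across orders of magnitude, which suits parameters such as the learning rate.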

Model input optimization

The input variables of the developed prediction models consist of SPI-12 at different lag times (t, t − 1, t − 2, t − 3…), and the output variable is set to a 1-month lead time (t + 1) according to the research purpose. The model's input variables are also an important parameter for sample generation before prediction, and the partial autocorrelation function (PACF) is usually used to determine the lag time length (Poornima & Pushpalatha 2019). However, PACF-based selection involves subjective judgement, which makes it difficult to achieve optimal model performance. In this paper, the lag time length is therefore also brought into the Optuna framework as one of the model hyper-parameters for iterative optimization, with a search range from a minimum length of 4 to a maximum length of 64 in steps of 4. The prediction accuracy of GRU and LGBM under different lag time lengths is shown in Figure 7.
Figure 7

The prediction accuracy of GRU and LGBM under different lag time lengths.

Figure 7 shows obvious differences in the optimal lag time length between the two models. The optimal lag time length of the GRU model is mainly concentrated around 50: the longer the input lag, the more information the GRU model can capture for prediction, and the higher the accuracy achieved. When the lag time length exceeds 50, prediction accuracy gradually decreases because of redundant information. The optimal lag time length of the LGBM model is mainly concentrated around 25. Beyond this length, the effective prediction information learned by the model decreases as the lag time length increases, which leads to a decline in prediction accuracy. The responses of GRU and LGBM to the input lag time length are therefore inconsistent. Using PACF alone to determine the model input does not guarantee that the choice suits all models, whereas iterative optimization can solve this problem.
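The lagged-input sample generation described in this section can be sketched in Python as follows. The helper name `make_lagged_samples` is illustrative; the candidate lag lengths follow the search range stated above (4 to 64 in steps of 4).

```python
def make_lagged_samples(series, lag, lead=1):
    """Build (inputs, targets): `lag` past values predict the value `lead` steps ahead."""
    inputs, targets = [], []
    for end in range(lag, len(series) - lead + 1):
        inputs.append(series[end - lag:end])    # values at t - lag + 1, ..., t
        targets.append(series[end + lead - 1])  # value at t + lead
    return inputs, targets


# Candidate lag lengths searched in the study: 4, 8, ..., 64
candidate_lags = list(range(4, 65, 4))
```

A longer lag gives each sample more history but leaves fewer samples for training, which is one reason the best lag length can differ between models.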

Development environment

Python 3.7 is used for programming in this study. The ML models are built with the sklearn and XGBoost libraries. The three signal decomposition algorithms use the third-party libraries PyEMD, PyEEMD and PyVMD, respectively. Furthermore, the decomposition mode number K of VMD is determined by enumeration, which shows that the decomposition effect is best and no mode mixing occurs when K = 8.

The performance evaluation metrics of all models on the testing sets are shown in Table 3, with the best metrics in bold. The observed and predicted time series on the testing sets of Pucheng, Fengxiang and Wugong stations are shown in Figures 8–10.
Table 3. Performance statistics of prediction models

Station    Model      Lead time  NSE    MSE    RMSE
Pucheng    GRU        t + 1      0.711  0.217  0.466
           EMD-GRU    t + 1      0.815  0.138  0.372
           EEMD-GRU   t + 1      0.846  0.116  0.341
           VMD-GRU    t + 1      0.801  0.149  0.386
           INT-GRU    t + 1      0.824  0.132  0.364
           LGBM       t + 1      0.767  0.175  0.418
           EMD-LGBM   t + 1      0.844  0.116  0.342
           EEMD-LGBM  t + 1      0.904  0.071  0.268
           VMD-LGBM   t + 1      0.819  0.135  0.368
           INT-LGBM   t + 1      0.919  0.06   0.246
Fengxiang  GRU        t + 1      0.833  0.186  0.432
           EMD-GRU    t + 1      0.887  0.125  0.354
           EEMD-GRU   t + 1      0.906  0.105  0.324
           VMD-GRU    t + 1      0.878  0.136  0.369
           INT-GRU    t + 1      0.912  0.098  0.314
           LGBM       t + 1      0.835  0.184  0.429
           EMD-LGBM   t + 1      0.872  0.142  0.378
           EEMD-LGBM  t + 1      0.942  0.064  0.254
           VMD-LGBM   t + 1      0.862  0.154  0.392
           INT-LGBM   t + 1      0.953  0.052  0.228
Wugong     GRU        t + 1      0.698  0.218  0.467
           EMD-GRU    t + 1      0.812  0.135  0.368
           EEMD-GRU   t + 1      0.825  0.125  0.354
           VMD-GRU    t + 1      0.781  0.157  0.397
           INT-GRU    t + 1      0.832  0.121  0.347
           LGBM       t + 1      0.732  0.193  0.439
           EMD-LGBM   t + 1      0.859  0.101  0.318
           EEMD-LGBM  t + 1      0.874  0.091  0.301
           VMD-LGBM   t + 1      0.771  0.165  0.406
           INT-LGBM   t + 1      0.901  0.072  0.268

Figure 8. Time series of predictions at Pucheng station (two model categories: GRU and LGBM).

Figure 9. Time series of predictions at Fengxiang station (two model categories: GRU and LGBM).

Figure 10. Time series of predictions at Wugong station (two model categories: GRU and LGBM).

Table 3 shows that the stand-alone models have the lowest prediction performance on the testing sets of all three stations, and that the GRU model performs worse than the LGBM model. The minimum NSE values of the GRU and LGBM models both appear at Wugong station, at 0.698 and 0.732, respectively.

The ‘decomposition-prediction’ models are constructed by coupling the EMD, EEMD and VMD methods with the stand-alone models, and are compared with the stand-alone models. Among them, the EEMD-based models show the best prediction performance, followed by the EMD-based models, while VMD yields the smallest improvement. The maximum NSE values of EEMD-GRU and EEMD-LGBM appear at Fengxiang station and both exceed 0.9, indicating that decomposition pre-processing can significantly improve prediction performance.

The INT-ML models proposed in this paper give satisfactory experimental results. In particular, INT-LGBM has both the best prediction performance and the smallest error statistics among all models at the three stations. At Pucheng station, the INT-LGBM model produces an NSE of 0.919, 20% higher than the LGBM model, while the MSE and RMSE are 66% and 41% lower, respectively. At Fengxiang station, the INT-LGBM model produces an NSE of 0.953, 14% higher than the LGBM model, while the MSE and RMSE are 72% and 47% lower. At Wugong station, the INT-LGBM model produces an NSE of 0.901, 23% higher than the LGBM model, while the MSE and RMSE are 63% and 39% lower. The INT-GRU model likewise gives the largest improvement over the GRU model at every station except Pucheng, where EEMD-GRU performs better.
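The percentages quoted above are relative changes with respect to the stand-alone LGBM model, and they can be reproduced from the Table 3 values with a few lines of Python. The sketch below is for checking only; the dictionary layout and the helper names `relative_change`, `summarize` and `rmse_matches_mse` are illustrative choices, not part of the study.

```python
import math

# (NSE, MSE, RMSE) for LGBM and INT-LGBM, copied from Table 3.
table3 = {
    "Pucheng":   {"LGBM": (0.767, 0.175, 0.418), "INT-LGBM": (0.919, 0.060, 0.246)},
    "Fengxiang": {"LGBM": (0.835, 0.184, 0.429), "INT-LGBM": (0.953, 0.052, 0.228)},
    "Wugong":    {"LGBM": (0.732, 0.193, 0.439), "INT-LGBM": (0.901, 0.072, 0.268)},
}

def relative_change(base, new):
    """Percentage change of `new` relative to `base`."""
    return 100.0 * (new - base) / base

def summarize(station):
    """Return (NSE gain %, MSE reduction %, RMSE reduction %), rounded to integers."""
    base = table3[station]["LGBM"]
    new = table3[station]["INT-LGBM"]
    nse_gain = relative_change(base[0], new[0])
    mse_drop = -relative_change(base[1], new[1])
    rmse_drop = -relative_change(base[2], new[2])
    return round(nse_gain), round(mse_drop), round(rmse_drop)

def rmse_matches_mse(mse, rmse, tol=0.002):
    """Check that the reported RMSE agrees with sqrt(MSE) up to rounding."""
    return abs(math.sqrt(mse) - rmse) < tol

for station in table3:
    print(station, summarize(station))
```

The same helper applies to any other pair of rows in Table 3, for example GRU against INT-GRU.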

In general, data pre-processing improves the prediction accuracy of the ML models to varying degrees, and the newly developed INT-ML models generally perform better than the models built on a single decomposition algorithm.

Figures 8–10 show that the stand-alone models fit the observations worst, especially at the valley values of the time series, where nearly half of the predictions have obvious errors with varying degrees of time shift. After decomposition pre-processing, the fit of the hybrid models improves greatly. The INT-LGBM model is the most accurate in predicting peak and valley values and shows the smallest time-shift error. These results demonstrate the importance of decomposition pre-processing for training ML models, with the ‘integration-prediction’ models learning the feature information in the SPI-12 time series most adequately.

To compare the models more intuitively, Taylor diagrams are produced to display model accuracy, as shown in Figures 11–13. In a Taylor diagram, each scatter point represents a model, the radial lines represent the correlation coefficient, the horizontal and vertical axes represent the standard deviation, and the dotted red lines represent the root mean square difference (RMSD). Figures 11–13 show that the INT-LGBM model ranks first in prediction performance for a 1-month lead time, with a correlation coefficient exceeding 0.95 at all three stations, and that it also performs well in RMSD and standard deviation. The correlation coefficients of GRU and LGBM are the lowest among all developed models, but they still exceed 0.8, and even 0.9 at Fengxiang station, meaning that the stand-alone models can also achieve successful prediction, although their accuracy is not as good as that of the hybrid models. The EEMD-type hybrid models outperform the EMD- and VMD-type hybrid models, as seen from their smaller RMSD between the observed and predicted time series.
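The three quantities shown in a Taylor diagram can be computed directly from the observed and predicted series. The sketch below assumes the usual Taylor-diagram convention in which the RMSD is centered (the mean of each series is removed first), so that the quantities satisfy RMSD² = σ_obs² + σ_pred² − 2 σ_obs σ_pred r. The function name `taylor_stats` and the synthetic data are illustrative only and are not taken from the study.

```python
import numpy as np

def taylor_stats(obs, pred):
    """Return (std of obs, std of pred, correlation, centered RMSD)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    std_obs = obs.std()    # population standard deviation (ddof=0)
    std_pred = pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    # Centered RMSD: remove each series' mean before differencing.
    diff = (pred - pred.mean()) - (obs - obs.mean())
    crmsd = np.sqrt(np.mean(diff ** 2))
    return std_obs, std_pred, r, crmsd

# Synthetic example: a biased, noisy copy of the "observations".
rng = np.random.default_rng(42)
obs_demo = rng.standard_normal(240)
pred_demo = 0.9 * obs_demo + 0.2 * rng.standard_normal(240) + 0.3
s_o, s_p, r_demo, d_demo = taylor_stats(obs_demo, pred_demo)

# Law-of-cosines relation that lets one point encode all three statistics.
assert np.isclose(d_demo ** 2, s_o ** 2 + s_p ** 2 - 2 * s_o * s_p * r_demo)
```

Because the RMSD is centered, a constant bias in the predictions (0.3 in the synthetic example) does not move a model's point on the diagram; bias has to be assessed separately, for instance from the error distribution.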

Figure 11. Taylor diagram of model performance at Pucheng station (two model categories: GRU and LGBM).

Figure 12. Taylor diagram of model performance at Fengxiang station (two model categories: GRU and LGBM).

Figure 13. Taylor diagram of model performance at Wugong station (two model categories: GRU and LGBM).

The absolute value of the model prediction error (AVE) is illustrated with a scatter violin diagram in Figure 14. The error range of the hybrid models is smaller than that of the stand-alone models, and their error distribution is more concentrated around 0. The maximum errors of the stand-alone GRU and LGBM models appear at Pucheng station and Wugong station, respectively. The INT-LGBM model performs best in both error range and error distribution, illustrating that it has more stable prediction performance.

Figure 14. Scatter violin diagram of the model's AVE (two model categories: GRU and LGBM).

Based on the above analysis of the experimental results, we conclude that the stand-alone GRU and LGBM models have good potential for predicting short-term meteorological drought in the Guanzhong region, and that LGBM has better overall prediction ability than GRU. This result further confirms that ML models have advantages over linear models in capturing the nonlinear relationships in drought index time series, and also indicates that different models have different learning abilities (Lima et al. 2013; Xu et al. 2020). In practical applications, at least two stand-alone models should therefore be established for selection (Khosravi et al. 2017). In addition, the prediction performance of the stand-alone models improves significantly after decomposition pre-processing of the input data, which is consistent with the conclusion of recent studies that ‘decomposition-prediction’ hybrid models obtain higher prediction accuracy (Başakın et al. 2021; Wang et al. 2021). However, hybrid models built on a single decomposition algorithm have certain limitations. To reduce the adverse effect of these limitations, a new ‘integration-prediction’ hybrid model construction method is proposed in this study. A comprehensive comparison with the developed stand-alone and ‘decomposition-prediction’ models shows that the ‘integration-prediction’ model performs best among all the models, because it fuses more valuable feature information and enhances the learning ability of the ML models, which further improves the simulation and provides a new approach for drought prediction research.

ML technology has been widely used in hydrological modeling, especially in predicting complex nonlinear and non-stationary drought index time series. In this research, a new ‘integration-prediction’ model is proposed, which deeply combines signal decomposition algorithms with the GRU and LGBM models and applies BO to tune the model hyper-parameters. The developed INT-GRU and INT-LGBM models are used to predict SPI-12 one month ahead at three stations in the Guanzhong region, and are compared with stand-alone models and ‘decomposition-prediction’ models. The following conclusions can be drawn:

  (1) Both ML models, GRU and LGBM, successfully predict the SPI-12 time series, and LGBM has better prediction ability than GRU. The experimental results also verify that signal decomposition algorithms used as data pre-processing tools can greatly improve the prediction performance of ML models; EEMD gives the largest improvement, followed by EMD and VMD.

  (2) When multiple decomposition algorithms are available, besides comparing them and choosing the best one, all decomposition results can be integrated to reduce the limitations of any single decomposition algorithm.

  (3) An ‘integration-prediction’ model construction method is proposed. Compared with existing stand-alone models and ‘decomposition-prediction’ models, the ‘integration-prediction’ model INT-LGBM produces higher prediction accuracy, smaller prediction error and better stability.

  (4) This study provides a new idea for drought index prediction in the Guanzhong region. In future work, we will select more arid regions to further test the general applicability of ‘integration-prediction’ models and to offer a reliable reference for decision makers.

This work is supported by the National Natural Science Foundation of China (No. 52209035) and Xi'an University of Technology (No. 256082016).

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abbot J. & Marohasy J. 2014 Input selection and optimisation for monthly rainfall forecasting in Queensland, Australia, using artificial neural networks. Atmospheric Research 138, 166–178. doi:10.1016/j.atmosres.2013.11.002.
Achite M., Banadkooki F. B., Ehteram M., Bouharira A., Ahmed A. N. & Elshafie A. 2022 Exploring Bayesian model averaging with multiple ANNs for meteorological drought forecasts. Stochastic Environmental Research and Risk Assessment 36, 1835–1860. doi:10.1007/s00477-021-02150-6.
Achour K., Meddi M., Zeroual A., Bouabdelli S., Maccioni P. & Moramarco T. 2020 Spatio-temporal analysis and forecasting of drought in the plains of northwestern Algeria using the standardized precipitation index. Journal of Earth System Science 129 (1), 1–22. doi:10.1007/s12040-019-1306-3.
Akiba T., Sano S., Yanase T., Ohta T. & Koyama M. 2019 Optuna: a next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2623–2631. doi:10.1145/3292500.3330701.
Almikaeel W., Čubanová L. & Šoltész A. 2022 Hydrological drought forecasting using machine learning – Gidra river case study. Water 14 (3), 387. doi:10.3390/w14030387.
Başakın E. E., Ekmekcioğlu Ö. & Özger M. 2021 Drought prediction using hybrid soft-computing methods for semi-arid region. Modeling Earth Systems and Environment 7, 2363–2371. doi:10.1007/s40808-020-01010-6.
Belayneh A., Adamowski J., Khalil B. & Ozga-Zielinski B. J. J. O. H. 2014 Long-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet neural network and wavelet support vector regression models. Journal of Hydrology 508, 418–429. doi:10.1016/j.jhydrol.2013.10.052.
Citakoglu H. & Coşkun Ö. 2022 Comparison of hybrid machine learning methods for the prediction of short-term meteorological droughts of Sakarya Meteorological station in Turkey. Environmental Science and Pollution Research 29, 75487–75511. doi:10.1007/s11356-022-21083-3.
Dai M., Huang S. Z., Huang Q., Leng G. Y., Guo Y., Wang L., Fang W., Li P. & Zheng X. D. 2020 Assessing agricultural drought risk and its dynamic evolution characteristics. Agricultural Water Management 231, 106003. doi:10.1016/j.agwat.2020.106003.
Danandeh Mehr A., Rikhtehgar Ghiasi A., Yaseen Z. M., Sorman A. U. & Abualigah L. 2022 A novel intelligent deep learning predictive model for meteorological drought forecasting. Journal of Ambient Intelligence and Humanized Computing 1–15. doi:10.1007/s12652-022-03701-7.
Deo R. C., Salcedo-Sanz S., Carro-Calvo L. & Saavedra-Moreno B. 2018 Drought prediction with standardized precipitation and evapotranspiration index and support vector regression models. Integrating Disaster Science and Management 151–174. doi:10.1016/b978-0-12-812056-9.00010-5.
Djerbouai S. & Souag-Gamane D. 2016 Drought forecasting using neural networks, wavelet neural networks, and stochastic models: case of the Algerois Basin in North Algeria. Water Resources Management 30, 2445–2464. doi:10.1007/s11269-016-1298-6.
Dragomiretskiy K. & Zosso D. 2014 Variational mode decomposition. IEEE Transactions on Signal Processing 62 (3), 531–544. doi:10.1109/TSP.2013.2288675.
He J., Yang X. H., Li J. Q., Jin J. L., Wei Y. M. & Chen X. J. 2014 Spatiotemporal variation of meteorological droughts based on the daily comprehensive drought index in the Haihe River basin, China. Natural Hazards 75 (2), 199–217. doi:10.1007/s11069-014-1158-8.
Huang N. E., Shen Z., Long S. R., Wu M. C., Shih H. H., Zheng Q., Yen N., Tung C. C. & Liu H. H. 1998 The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences 454 (1971), 903–995. doi:10.1098/rspa.1998.0193.
Ke G. L., Meng Q., Finley T., Wang T. F., Chen W., Ma W. D., Ye Q. W. & Liu T. Y. 2017 LightGBM: a highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 3146–3154.
Khan M. M. H., Muhammad N. S. & El-Shafie A. 2020 Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. Journal of Hydrology 590, 125380. doi:10.1016/j.jhydrol.2020.125380.
Khosravi I., Jouybari-Moghaddam Y. & Sarajian M. R. 2017 The comparison of NN, SVR, LSSVR and ANFIS at modeling meteorological and remotely sensed drought indices over the eastern district of Isfahan, Iran. Natural Hazards 87, 1507–1522. doi:10.1007/s11069-017-2827-1.
Kisi O., Gorgij A. D., Zounemat-Kermani M., Mahdavi-Meymand A. & Kim S. 2019 Drought forecasting using novel heuristic methods in a semi-arid environment. Journal of Hydrology 578, 124053. doi:10.1016/j.jhydrol.2019.124053.
Li H. W., Li Z., Chen Y. N., Xiang Y. Y., Liu Y. C., Kayumba P. M. & Li X. Y. 2021 Drylands face potential threat of robust drought in the CMIP6 SSPs scenarios. Environmental Research Letters 16 (11), 114004. doi:10.1088/1748-9326/ac2bce.
Lima A. R., Cannon A. J. & Hsieh W. W. 2013 Nonlinear regression in environmental sciences by support vector machines combined with evolutionary strategy. Computers & Geosciences 50, 136–144. doi:10.1016/j.cageo.2012.06.023.
Malik A., Tikhamarine Y., Sammen S. S., Abba S. I. & Shahid S. 2021 Prediction of meteorological drought by using hybrid support vector regression optimized with HHO versus PSO algorithms. Environmental Science and Pollution Research 28, 39139–39158. doi:10.1007/s11356-021-13445-0.
McKee T. B., Doesken N. J. & Kleist J. 1993 The relationship of drought frequency and duration to time scales. In Proceedings of the 8th Conference on Applied Climatology, Vol. 17, No. 22, pp. 179–183.
Mehta D. & Yadav S. M. 2021 An analysis of rainfall variability and drought over Barmer District of Rajasthan, Northwest India. Water Supply 21 (5), 2505–2517. doi:10.2166/ws.2021.053.
Mei P., Liu J. H., Liu C. Z. & Liu J. N. 2022 A deep learning model and its application to predict the monthly MCI drought index in the Yunnan Province of China. Atmosphere 13 (12), 1951. doi:10.3390/atmos13121951.
Mishra A. K. & Singh V. P. 2010 A review of drought concepts. Journal of Hydrology 391 (1–2), 202–216. doi:10.1016/j.jhydrol.2010.07.012.
Palmer W. C. 1965 Meteorological Drought, Vol. 30, No. 45. US Department of Commerce Weather Bureau, Washington, DC, p. 58.
Poornima S. & Pushpalatha M. 2019 Drought prediction based on SPI and SPEI with varying timescales using LSTM recurrent neural network. Soft Computing 23 (18), 8399–8412. doi:10.1007/s00500-019-04120-1.
Roushangar K., Ghasempour R. & Nourani V. 2021 The potential of integrated hybrid pre-post-processing techniques for short-to long-term drought forecasting. Journal of Hydroinformatics 23 (1), 117–135. doi:10.2166/hydro.2020.088.
Song Z. H., Xia J., She D. X., Zhang L. P., Hu C. & Zhao L. 2020 The development of a Nonstationary Standardized Precipitation Index using climate covariates: a case study in the middle and lower reaches of Yangtze River Basin, China. Journal of Hydrology 588, 125115. doi:10.1016/j.jhydrol.2020.125115.
Tran N., Schneider J. G., Weber I. & Qin A. K. 2020 Hyper-parameter optimization in classification: to-do or not-to-do. Pattern Recognition 103, 107245. doi:10.1016/j.patcog.2020.107245.
Van-Rooy M. P. 1965 A rainfall anomaly index (RAI), independent of the time and space. Notos 14, 43–48.
Vicente-Serrano S. M., Beguería S. & López-Moreno J. I. 2010 A multiscalar drought index sensitive to global warming: the standardized precipitation evapotranspiration index. Journal of Climate 23 (7), 1696–1718. doi:10.1175/2009JCLI2909.1.
Wang F., Wang Z. M., Yang H. B., Di D. Y., Zhao Y. & Liang Q. H. 2020 A new copula-based standardized precipitation evapotranspiration streamflow index for drought monitoring. Journal of Hydrology 585, 124793. doi:10.1016/j.jhydrol.2020.124793.
Wang X. J., Wang Y. P., Yuan P. X., Wang L. & Cheng D. L. 2021 An adaptive daily runoff forecast model using VMD-LSTM-PSO hybrid approach. Hydrological Sciences Journal 66 (9), 1488–1502. doi:10.1080/02626667.2021.1937631.
Wu Z. & Huang N. E. 2009 Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis 1 (01), 1–41. doi:10.1142/S1793536909000047.
Xu D. H., Zhang Q., Ding Y. & Huang H. P. 2020 Application of a hybrid ARIMA–SVR model based on the SPI for the forecast of drought – a case study in Henan Province, China. Journal of Applied Meteorology and Climatology 59 (7), 1239–1259. doi:10.1175/JAMC-D-19-0270.1.
Yu W. J., Shao M. Y., Ren M. L., Zhou H. J., Jiang Z. H. & Li D. L. 2013 Analysis on spatial and temporal characteristics drought of Yunnan Province. Acta Ecologica Sinica 33 (6), 317–324. doi:10.1016/j.chnaes.2013.09.004.
Zhang A. Z. & Jia G. S. 2013 Monitoring meteorological drought in semiarid regions using multi-sensor microwave remote sensing data. Remote Sensing of Environment 134, 12–23. doi:10.1016/j.rse.2013.02.023.
Zuo G. G., Luo J. G., Wang N., Lian Y. N. & He X. X. 2020 Two-stage variational mode decomposition and support vector regression for streamflow forecasting. Hydrology and Earth System Sciences 24 (11), 5491–5518. doi:10.5194/hess-24-5491-2020.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).