Drought is a serious natural disaster that causes huge losses to various regions of the world. To effectively cope with this disaster, we need to use drought indices to classify and compare the drought conditions of different regions. We can take appropriate measures according to the category of drought to mitigate the impact of drought. Recently, deep learning models have shown promising results in this domain. However, few of these models consider the relationships between different areas, which limits their ability to capture the complex spatio-temporal dynamics of droughts. In this study, we propose a novel multivariate spatio-temporal sensitive network (MSTSN) for drought prediction, which incorporates both geographical and temporal knowledge in the network and improves its predictive power. We obtained the standardized precipitation evapotranspiration index and meteorological data from the climatic research unit dataset, covering the period from 1961 to 2018. This is the first deep learning method that embeds geographical knowledge in drought prediction. We also provide a solid foundation for comparing our method with other deep learning baselines and evaluating their performance. Experiments show that our method consistently outperforms the existing state-of-the-art methods on various metrics, validating the effectiveness of geospatial and temporal information.

  • A novel multivariate spatio-temporal sensitive network (MSTSN) is proposed for drought prediction.

  • The MSTSN model captures the relationships between different areas using a graph neural network.

  • The MSTSN model extracts the long-term dependencies of data through a gated recurrent unit and enhances its effect by multi-head self-attention.

  • The MSTSN model predicts the spatial drought distribution closest to the actual distribution.

Droughts develop slowly and affect vast regions over time, causing severe social and economic damage. Droughts in China have become more frequent, severe, and widespread in the last six decades due to global climate change. From 1950 to 2015, the cumulative disaster area in Henan Province was 113,796,627 acres, and the average disaster area over the years was 1,724,191 acres. Because Henan Province is a main grain-producing area in China, drought there not only impacts the province's economic development but also affects national food security when it is severe. Therefore, we choose Henan Province as the focus of this study, aiming to improve drought prediction and help decision-makers make informed decisions, which is of great significance in reducing agricultural economic loss.

Previous studies have developed various indices to capture the different features of drought. Palmer (1965) developed the Palmer drought severity index (PDSI) using precipitation and temperature data, which is still in use today. McKee et al. (1993) developed the standardized precipitation index (SPI), and in 2009, the World Meteorological Organization recommended the index to countries around the world. Currently, the SPI is widely used globally. The standardized precipitation evapotranspiration index (SPEI) was developed by Vicente-Serrano et al. (2010). Compared to SPI, SPEI includes temperature information by introducing potential evapotranspiration. Moreover, its different time scales can characterize different types of drought. SPEI is extensively used in drought assessment and forecasting. According to the National Centers for Environmental Information (NCEI), drought classification converts a large volume of drought index values into categories representing drought severity. A drought category is simpler to interpret than a drought index value and helps stakeholders understand drought conditions. Therefore, decision-makers are more concerned with the category of drought than with the values of the drought indices (Bazrkar & Chu 2022).

Drought prediction research can be broadly categorized into physical models, statistical models, and deep learning models. Physical models simulate the physical processes of the land, ocean, and atmosphere. However, their limited accuracy in forecasting precipitation on a monthly or seasonal scale restricts their ability to predict drought (Deo & Şahin 2015). Statistical models use various influencing factors as predictors to analyze relationships between historical records. They include techniques such as regression (Li et al. 2020b), time series analysis (Han et al. 2010), and machine learning approaches (Komasi et al. 2018; Fung et al. 2020; Ma et al. 2022). These models are widely used because of their simple structure, small data requirements, and low computational costs. However, they fail to adapt to the non-stationary nature of drought estimation, and they are prone to overfitting because of the lagged terms in the time series data. Therefore, we should explore the potential of more advanced deep learning methods.

Deep learning models are increasingly used for drought forecasting. One of the most popular models is the recurrent neural network (RNN) (Le et al. 2017), which can handle sequential data. However, RNNs struggle to capture long-range dependencies in time series data, which are important for drought prediction. To overcome this limitation, long short-term memory (LSTM) (Poornima & Pushpalatha 2019) and gated recurrent unit (GRU) (Zhou et al. 2022; Yan et al. 2023) models have been developed, which can retain information over longer periods. However, these models may still face the problem of gradient vanishing or gradient explosion. The Transformer (Vaswani et al. 2017) addresses their limitations in handling long-term dependencies and capturing long sequences. Another type of model that can be useful for time series classification is the convolutional neural network (CNN) (Ham et al. 2019), which can extract features from different dimensions and reduce model complexity. Fully convolutional networks (FCNs) (Wang et al. 2017) are a variant of CNNs that achieve high performance in time series classification tasks. A recent model that combines LSTM and FCN with a Squeeze-and-Excitation Block is MLSTM-FCN (Karim et al. 2019), which can learn the relationships between different features at each time step. However, these models all ignore the spatial structure of large regions, which can affect drought patterns. Graph neural networks (GNNs) (Scarselli et al. 2008) are a class of models that can exploit the spatial structure within a region and have shown success in time series prediction tasks such as crop yield prediction (Fan et al. 2022), COVID forecasting (Kapoor et al. 2020), and traffic flow prediction (Lan et al. 2022). Common GNN models include the graph convolutional network (GCN) (Bhatti et al. 2023), the graph attention network (GAT) (Chen et al. 2023), and GraphSAGE (Liu et al. 2023).
To enhance the performance of these models, the attention mechanism (Li et al. 2020a) was introduced. It is a technique that enables a neural network to selectively attend to the most significant portions of the input and output data, while disregarding less relevant parts. Vaswani et al. (2017) introduced a compelling enhancement to this mechanism known as multi-head self-attention, which allows the network to further refine its focus by attending to multiple subspaces of the input and output sequences simultaneously. It has several benefits over single-head attention, such as capturing various dependencies and facilitating interpretability.

Drought is a natural disaster that threatens humans and ecosystems. It leads to vegetation loss, lake shrinkage, land subsidence, seawater intrusion, and biodiversity decline (Młyński et al. 2021). Henan Province is a major agricultural region in China and is vulnerable to drought. Accurate prediction of drought severity in different areas of Henan Province is essential for developing effective and sustainable adaptation strategies, which can prevent water and food scarcity and reduce economic losses (Adnan et al. 2023). This study uses the Climatic Research Unit (CRU TS v4.03) (Harris et al. 2020) dataset from 1961 to 2018, with a spatial resolution of 0.5° × 0.5°. Henan Province has 69 grid points, numbered from 1 to 69 in a top-to-bottom, left-to-right order. Existing methods treat each grid point as an independent region, which may not fully use the spatial structure of a larger region. Figure 1 shows that adjacent grid points have a strong correlation in drought, while non-adjacent grid points differ significantly, which violates the independence assumption. To improve the prediction of drought severity and address the limitations of previous methods that neglect the geographical knowledge of each region, this study proposes a novel multivariate spatio-temporal sensitive network (MSTSN) for drought prediction. This method has two modules: the spatial aware module (SAM) and the temporal enhanced module (TEM). The SAM uses a GNN to combine the features from neighboring grid points with each grid point's own features to boost the predictive power, while the TEM extracts temporal information from the aggregated features. By integrating geospatial and temporal information, our model captures the complex spatio-temporal dynamics of droughts more effectively than existing deep learning models. To our knowledge, our work is the first to incorporate geographical knowledge into drought prediction.
The main contributions and objectives of this paper are: (1) To propose a novel MSTSN for drought prediction, which can effectively capture the complex spatio-temporal dynamics of droughts and improve the prediction ability of drought severity; (2) To introduce geographical knowledge into drought prediction for the first time, by using GNN to fuse the features from neighboring grid points with its own features, fully utilizing the spatial structure information among regions; (3) To conduct a comprehensive experimental evaluation, comparing with four deep learning models, and visualizing the prediction results of each model, analyzing the evolution of drought in Henan Province from 2015 to 2018.
Figure 1

(a) SPEI-6 values of the first grid point and its adjacent grid points (2,5,6) from 2001 to 2014. (b) SPEI-6 values of the first grid point and the grid point farthest (69) from it from 2001 to 2014.

As shown in Figure 2, Henan Province is located in the central-east region of China, covering an area between 110°21′–116°39′E and 31°23′–36°22′N. The province lies in the warm temperate and subtropical zones and experiences a humid and semi-humid monsoon climate. The region is characterized by four distinct seasons, with a cold and dry winter, a windy and dusty spring, a hot and rainy summer, and a clear and sunny autumn. Annual sunshine duration ranges from 2,000 to 2,600 h. The mean annual temperature varies between 12 and 16 °C, while the highest and lowest recorded temperatures are 44.2 and −21.7 °C, respectively. The frost-free period lasts from 180 to 240 days, and the yearly rainfall is between 500 and 900 mm, with a gradual decline from the southeast to the northwest. The precipitation distribution in Henan is uneven due to the monsoon, with half of the annual rainfall concentrated in the summer season, often accompanied by heavy rains. Henan is traversed by four major rivers, namely the Yellow River, the Huaihe River, the Haihe River, and the Yangtze River.
Figure 2

Location of the study area.


The dataset used in this study is the Climatic Research Unit (CRU) dataset, which was developed by the University of East Anglia. The dataset features a spatial resolution of 0.5° × 0.5° and comprises 11 variables: cloud cover, frost day frequency, potential evapotranspiration, rainfall, diurnal temperature range, relative humidity, daily mean temperature, monthly average daily maximum temperature, vapor pressure, monthly average daily minimum temperature, and wet day frequency. This dataset has been utilized for a broad range of applications, such as climate variability, agronomic research (Renard & Tilman 2019), and paleo-climatic studies (Nagavciuc et al. 2019). We use ArcMap (Wadwekar & Kapshe 2023) to filter the dataset for Henan Province's meteorological data, based on its boundary vector file. The resulting dataset contains meteorological data for 69 grid points, which are illustrated by the red dots in Figure 2.

In this study, we use the standardized precipitation evapotranspiration index (SPEI) as the drought index for prediction. This index takes into account not only the statistical distribution of precipitation but also potential surface evapotranspiration, providing a more comprehensive reflection of regional drought conditions. The specific calculation process is described in Vicente-Serrano et al. (2010). SPEI values are available at multiple time scales, including 1, 3, 6, 9, 12, and 24 months, and different time scales characterize different types of drought. Shorter time scales are typically used to assess meteorological drought, medium time scales to evaluate agricultural drought, and longer time scales to describe hydrological drought. The CRU dataset provides SPEI values at different time scales globally. The drought categories, defined according to the grades of meteorological drought, are presented in Table 1.

Table 1

Drought category as per SPEI values

| Category description | SPEI classification |
|---|---|
| No drought | > −0.5 |
| Mild drought | [−1.0, −0.5] |
| Moderate drought | [−1.5, −1.0) |
| Severe drought | [−2.0, −1.5) |
| Extreme drought | < −2.0 |
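The thresholds in Table 1 can be applied to a single SPEI value as in the following minimal sketch. The function name and the integer labels 0 to 4 are assumptions made for illustration; the table itself only gives the category descriptions and intervals.

```python
def spei_to_category(spei: float) -> int:
    """Map an SPEI value to a drought category using the Table 1 intervals.

    The integer labels 0-4 are an illustrative assumption.
    """
    if spei > -0.5:
        return 0  # no drought: > -0.5
    if spei >= -1.0:
        return 1  # mild drought: [-1.0, -0.5]
    if spei >= -1.5:
        return 2  # moderate drought: [-1.5, -1.0)
    if spei >= -2.0:
        return 3  # severe drought: [-2.0, -1.5)
    return 4      # extreme drought: < -2.0


CATEGORY_NAMES = [
    "No drought",
    "Mild drought",
    "Moderate drought",
    "Severe drought",
    "Extreme drought",
]
```

Note that the boundary value −1.0 falls in the mild class and −1.5 and −2.0 fall in the moderate and severe classes, following the closed and half-open intervals of the table.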

In this section, we present a novel MSTSN for drought prediction, which consists of two modules: the SAM and the TEM, as illustrated in Figure 3. The SAM aggregates information from adjacent grid points to obtain a new feature matrix with spatial information. Subsequently, the new feature matrix is input into the TEM to extract temporal features. Finally, the extracted features are passed through a softmax layer to generate the final drought prediction results for each grid point. Our proposed approach offers a robust and efficient framework for modeling the complex spatio-temporal relationships underlying drought patterns, leading to improved prediction accuracy.
Figure 3

Structure of MSTSN.


In drought prediction, we denote each grid point's features by $x_{c,t}$ and its ground-truth drought category by $y_{c,t}$, where $c$ and $t$ represent the grid point and month, respectively. Each $x_{c,t}$ contains 11 weather features; detailed symbol explanations are shown in Table 2.

Table 2

List of symbols

| Symbols | Meanings |
|---|---|
| $x_{c,t}$ | the c-th grid point at time t |
| $A$ | symmetric adjacency matrix |
| $N$, $M$ | number of grid points, number of samples |
| $a_{c,t}$ | the aggregated embedding of c-th grid point at time t |
| $z_{c,t}$ | GNN embedding of c-th grid point at time t |
| $r_t$, $u_t$, $\tilde{h}_t$, $h_t$ | the values of the reset gate, the values of the update gate, the current candidate state, the current hidden state |
| $\sigma$ | the activation function |
| $W_i^Q$, $W_i^K$, $W_i^V$ | the learnable weight matrices for the i-th attention head |
| $Q$, $K$, $V$ | query matrix, key matrix, value matrix |
| $y_{c,t}$, $o_{c,t}$, $\hat{y}_{c,t}$ | the real drought category of the c-th grid point at time t, the final output of the TEM module, the predicted distribution of the c-th grid point at t-th month |
| GNN | Graph neural network |
| GRU | Gated recurrent unit |
| SPEI | Standardized precipitation evapotranspiration index |
| CRU | Climatic Research Unit |
| LSTM | Long short-term memory |
| FCN | Fully convolutional network |
| CNN | Convolutional neural network |
| ROC-AUC | Receiver operating characteristic – area under the curve |

Spatial aware module

As shown in Figure 1, geographically adjacent grid points have a strong correlation in drought. Intuitively, if some grid points have experienced severe drought, nearby grid points tend to be in a similar situation. Incorporating relevant information from the neighboring grid points can enhance the accuracy of the prediction if it is appropriately integrated. Previous studies have used convolutional neural networks (CNNs) to extract spatial features from structured images, but they are not suitable for our problem, where the grid locations form an unstructured graph with irregular node arrangements. Graph neural networks (GNNs) are a recent class of neural networks designed to handle the complex dependencies that exist within graph-structured data. GNNs offer greater flexibility and a broader representation space for encoding node and edge information from the graph, thereby facilitating more effective inference. Formally, a graph can be represented as $G=(V, E)$, where $V$ denotes the collection of nodes and $E$ represents the connections between them. In this drought prediction task, each node is a grid point. $E$ is represented as a symmetric adjacency matrix $A \in \{0,1\}^{N \times N}$, where $A_{ij}=1$ if grid points $i$ and $j$ are adjacent and $A_{ij}=0$ otherwise. $N$ is the total number of grid points. For each month $t$, there is an associated feature vector $x_{c,t}$ for each node $c$.
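A symmetric adjacency matrix of this kind can be built from grid coordinates as in the sketch below. The adjacency rule used here (two points are neighbors when they differ by at most one 0.5° step in both latitude and longitude, i.e. an 8-neighborhood) is an assumption for illustration, and the coordinates are hypothetical rather than actual CRU grid points.

```python
def build_adjacency(coords, step=0.5, tol=1e-6):
    """Return a symmetric 0/1 adjacency matrix for grid points.

    coords: list of (lat, lon) tuples on a regular grid with spacing `step`.
    Assumed rule: two distinct points are adjacent when they differ by at
    most one grid step in both latitude and longitude (8-neighbourhood).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dlat = abs(coords[i][0] - coords[j][0])
            dlon = abs(coords[i][1] - coords[j][1])
            if dlat <= step + tol and dlon <= step + tol:
                adj[i][j] = adj[j][i] = 1  # keep the matrix symmetric
    return adj


# Hypothetical 2 x 3 patch of grid points, numbered row by row.
coords = [
    (36.25, 113.25), (36.25, 113.75), (36.25, 114.25),
    (35.75, 113.25), (35.75, 113.75), (35.75, 114.25),
]
A = build_adjacency(coords)
```

Because most grid points have only a handful of neighbors, the resulting matrix is sparse, and an irregular province boundary simply produces nodes with fewer neighbors.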

GraphSAGE is a widely used GNN model that employs node feature information to learn node embeddings via neighborhood aggregation. Unlike other techniques that rely on matrix factorization and normalization, GraphSAGE simply aggregates features from a node's local neighborhood, resulting in lower computational requirements. The model is highly adaptable since it can use features from different numbers of hops or search depths, leading to better generalization. As the adjacency matrix is sparse due to the fact that a majority of grid points have only a few neighboring points, GraphSAGE is a suitable approach for drought prediction.

Formally, for the l-th layer of GraphSAGE,
$$
a_{c,t}^{(l)} = g_l\left(\left\{ z_{c',t}^{(l-1)},\ \forall c' \in \mathcal{N}(c) \right\}\right), \qquad
z_{c,t}^{(l)} = \sigma\left( W^{(l)} \cdot \left[ z_{c,t}^{(l-1)},\ a_{c,t}^{(l)} \right] \right) \tag{1}
$$
where $z_{c,t}^{(0)} = x_{c,t}$, and $l \in \{1, \ldots, L\}$. $\mathcal{N}(c)$ is the collection of neighboring grid points for $c$. The function that aggregates the l-th layer is represented by $g_l$, which can be a pooling, graph convolution, or mean function. Through experiments, we find that mean aggregation is an effective method in GraphSAGE. It averages the features of each node's neighbors to obtain the aggregated feature, which allows the node to capture the overall features of the surrounding nodes. Mean aggregation is simple but powerful in learning the spatial structure of the graph. $a_{c,t}^{(l)}$ is the aggregated embedding from the bordering grid points. We concatenate $a_{c,t}^{(l)}$ with the last layer's $z_{c,t}^{(l-1)}$ before the transformation using $W^{(l)}$. $\sigma$ is a non-linear function.
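The mean-aggregation step described above can be sketched in plain Python as follows. This is an illustrative single layer, not the authors' implementation: the function names are hypothetical, ReLU is assumed as the non-linear function, and the bias term is omitted.

```python
def relu(v):
    """Element-wise ReLU, assumed here as the non-linear function."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sage_mean_layer(Z, adj, W):
    """One GraphSAGE layer with mean aggregation.

    Z:   N node embeddings from the previous layer (each a list of d floats)
    adj: N x N symmetric 0/1 adjacency matrix
    W:   weight matrix of shape d_out x 2d, applied to [self, aggregated]
    """
    d = len(Z[0])
    out = []
    for c in range(len(Z)):
        nbrs = [j for j, a in enumerate(adj[c]) if a]
        if nbrs:
            # average each feature over the neighbouring grid points
            agg = [sum(Z[j][k] for j in nbrs) / len(nbrs) for k in range(d)]
        else:
            agg = [0.0] * d  # isolated node: nothing to aggregate
        # concatenate own embedding with the aggregate, transform, activate
        out.append(relu(matvec(W, Z[c] + agg)))
    return out


# Three nodes on a path 0 - 1 - 2 with one feature each.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Z0 = [[1.0], [2.0], [4.0]]
Z1 = sage_mean_layer(Z0, adj, W=[[1.0, 1.0]])
```

Stacking several such layers lets each grid point draw on information from neighbors that are more than one hop away.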

Temporal enhanced module

To analyze drought patterns and cycles, it is essential to extract temporal features from historical knowledge. We propose a TEM that captures the temporal dynamics of the output embedding from GNN. This module consists of two components: (1) GRU models long-term dependencies and trends over time and (2) multi-head self-attention mechanism enhances its effect by assigning a weight to each month.

Gated recurrent unit

To handle long-term dependencies and capture the relationship between time steps, we input the output embeddings of each month into GRU for temporal feature extraction. These output embeddings are from the last layer of GNN, which extracts the information from the whole local neighborhood.
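$$
h_{c,t} = \mathrm{GRU}\left( z_{c,t-\Delta t}, \ldots, z_{c,t-1}, z_{c,t} \right) \tag{2}
$$
where $z_{c,t-\Delta t}$ is the GNN embedding from month $t-\Delta t$.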
The GRU model is a variant of RNN. It mitigates the gradient vanishing and gradient exploding issues present in traditional RNNs by introducing reset gates and update gates. This enhances the model's ability to handle long sequences. Its structure is shown in Figure 4. Compared to LSTM, it has fewer parameters and lower computational requirements, making it faster and more suitable for processing large-scale sequential data. At each time step, it calculates the values of the reset gate and update gate based on the current input $x_t$ and the previous hidden state $h_{t-1}$. The reset gate determines which historical information should be retained, while the update gate determines the degree to which new input information affects the current state. Subsequently, based on the results of the gating, it calculates the candidate state $\tilde{h}_t$ for the current time step, which represents the possible hidden state for the current time step. Finally, it updates the hidden state based on the value of the update gate to obtain the hidden state $h_t$ that is passed to the next time step. Through this gating mechanism, the model can dynamically determine which historical information should be forgotten and which new information should be retained, and can effectively capture long-term dependencies in long sequences. Specifically, the calculation formula for the GRU is:
$$
\begin{aligned}
r_t &= \sigma\left( W_r x_t + U_r h_{t-1} + b_r \right) \\
u_t &= \sigma\left( W_u x_t + U_u h_{t-1} + b_u \right) \\
\tilde{h}_t &= \tanh\left( W_h x_t + U_h \left( r_t \odot h_{t-1} \right) + b_h \right) \\
h_t &= \left( 1 - u_t \right) \odot h_{t-1} + u_t \odot \tilde{h}_t
\end{aligned} \tag{3}
$$
where $x_t$ is the current input, $h_{t-1}$ is the previous hidden state, and $r_t$ and $u_t$ are the values of the reset gate and update gate, respectively. $\odot$ represents the element-wise product, $\sigma$ is the activation function, $\tilde{h}_t$ is the current candidate state, and $h_t$ is the current hidden state.
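The gating steps described above can be illustrated with a single GRU update in plain Python. This is a sketch of the standard GRU formulation under stated assumptions: the parameter names (`W_r`, `U_r`, `b_r`, and so on) are hypothetical, and the update gate is assumed to weight the candidate state, so a larger gate value lets more new information through.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(items) for items in zip(*vectors)]


def gru_step(x, h_prev, p):
    """One GRU update; p holds the (hypothetically named) weights and biases."""
    r = [sigmoid(a) for a in add(matvec(p["W_r"], x), matvec(p["U_r"], h_prev), p["b_r"])]
    u = [sigmoid(a) for a in add(matvec(p["W_u"], x), matvec(p["U_u"], h_prev), p["b_u"])]
    # the reset gate scales the previous state before it enters the candidate
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_tilde = [math.tanh(a) for a in add(matvec(p["W_h"], x), matvec(p["U_h"], gated), p["b_h"])]
    # the update gate blends the previous state with the candidate state
    return [(1.0 - ui) * hp + ui * ht for ui, hp, ht in zip(u, h_prev, h_tilde)]


def gru_run(xs, h0, p):
    """Feed a sequence of inputs through the cell and return the last state."""
    h = h0
    for x in xs:
        h = gru_step(x, h, p)
    return h
```

In the setting of this paper, `xs` would be the sequence of GNN embeddings of one grid point over the look-back months.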
Figure 4

Structure of GRU network.


Multi-head self-attention mechanism

In this task, the meteorological characteristics of the input months have varying degrees of influence on the prediction results. For instance, the impact of adjacent months' meteorological factors on the prediction results is likely more significant than that of non-adjacent months. Therefore, by incorporating a multi-head self-attention mechanism, the model can focus on the characteristics of important moments.

We concatenate the outputs of the GRU and input them to the multi-head self-attention layer along the variable dimension. The multi-head self-attention mechanism consists of several attention heads, each of which learns a separate representation of the input. Specifically, the input is transformed linearly into a query matrix, a key matrix, and a value matrix, which are then used to compute attention scores between each pair of positions in the sequence. The attention scores are normalized using a softmax function and used to weight the value matrix, which is then summed to obtain the output of the attention head.
formula
(4)
where hidden_size represents the hidden dimensions of GRU.

The output of each attention head is concatenated and transformed linearly again to obtain the final output of the multi-head self-attention mechanism. This allows the model to attend to different aspects of the input in parallel and capture different types of information.

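The multi-head self-attention mechanism can be expressed as follows:
$$
\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}\left( \mathrm{head}_1, \ldots, \mathrm{head}_h \right) W^O \tag{5}
$$
where $Q$, $K$, and $V$ represent the query matrix, key matrix, and value matrix, respectively, $h$ denotes the number of attention heads utilized in the multi-head self-attention mechanism, and $W^O$ is the output weight matrix. Each attention head is defined as:
$$
\mathrm{head}_i = \mathrm{Attention}\left( Q W_i^Q,\ K W_i^K,\ V W_i^V \right) \tag{6}
$$
where $W_i^Q$, $W_i^K$, and $W_i^V$ are the learnable weight matrices for the i-th attention head. The Attention function is defined as:
$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( \frac{Q K^{\top}}{\sqrt{d_k}} \right) V \tag{7}
$$
where $d_k$ is the dimension of the key matrix $K$.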
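The scaled dot-product attention and the concatenation of heads can be sketched as below, following the standard formulation of Vaswani et al. (2017). The function names are hypothetical, matrices are plain lists of rows, and the input `H` (one row per month) serves as query, key, and value, as is usual for self-attention.

```python
import math


def matmul(A, B):
    """Matrix product of A (n x m) and B (m x p), both lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d_k)
               for k_row in K] for q_row in Q]
    weights = [softmax(row) for row in scores]  # one weight per month pair
    return matmul(weights, V)


def multi_head_self_attention(H, heads, W_o):
    """heads: list of (W_q, W_k, W_v) projection matrices, one tuple per head."""
    outs = [attention(matmul(H, W_q), matmul(H, W_k), matmul(H, W_v))
            for W_q, W_k, W_v in heads]
    # concatenate the head outputs row by row, then apply the output projection
    concat = [[v for o in outs for v in o[t]] for t in range(len(H))]
    return matmul(concat, W_o)
```

Each row of `weights` sums to one, so every output row is a weighted average of the value rows; these are the per-month weights the text refers to.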
The feature matrix from the multi-head self-attention layer is input into a global average pooling (GAP) layer to obtain a compressed feature matrix, and the final output of the TEM module can be represented as:
$$
o_{c,t} = \mathrm{GAP}\left( \mathrm{MultiHead}(Q, K, V) \right) \tag{8}
$$

Drought prediction

The softmax layer takes the output of the TEM module as input and obtains a vector of probabilities for each class. The category with the highest predicted probability is the final prediction result. This can be expressed as follows:
$$
\hat{y}_{c,t} = \mathrm{softmax}\left( o_{c,t} \right) \tag{9}
$$
where $o_{c,t}$ is the output of the TEM module and $\hat{y}_{c,t}$ is the predicted distribution of the c-th grid point at the t-th month.
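The pooling and prediction steps can be sketched as follows. The sketch assumes that global average pooling averages the feature matrix over the time axis and that the pooled vector already has one entry per drought category; in practice a linear layer may sit between the two, which the text does not specify. All names are illustrative.

```python
import math


def global_average_pool(F):
    """Average a T x d feature matrix over its T rows, giving a length-d vector."""
    T = len(F)
    return [sum(row[k] for row in F) / T for k in range(len(F[0]))]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def predict_category(F):
    """Return (predicted category index, class probabilities)."""
    pooled = global_average_pool(F)
    probs = softmax(pooled)
    # the category with the highest probability is the prediction
    return probs.index(max(probs)), probs
```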
The cross-entropy loss is a popular choice for training multi-class classification models. The loss function measures the dissimilarity between the actual probability distribution and the predicted probability distribution. The loss is minimized when the predicted distribution is close to the actual distribution. The cross-entropy loss averaged over the samples can be computed as follows:
$$
\mathcal{L} = -\frac{1}{M} \sum_{j=1}^{M} \sum_{k} y_{j,t}^{(k)} \log \hat{y}_{j,t}^{(k)} \tag{10}
$$
where $M$ is the number of samples, $y_{j,t}$ is the true probability distribution of the j-th grid point at the t-th month, $\hat{y}_{j,t}$ is the corresponding predicted distribution, and $k$ indexes the drought categories.
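A minimal sketch of the cross-entropy computation, assuming one-hot true distributions and averaging over the samples; the small `eps` guard against log(0) is an implementation convenience, not something stated in the text.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean cross-entropy over samples.

    y_true: list of one-hot (or probability) rows, one per sample
    y_pred: list of predicted probability rows, same shape as y_true
    """
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        # only categories with non-zero true probability contribute
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)
```

The loss is zero when every sample's true category receives probability one, and it grows as probability mass moves to the wrong categories.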

Performance metrics

To evaluate the predictive performance of each model, we utilize four evaluation metrics: precision, recall, F1 score, and accuracy. Additionally, we employ a multi-class receiver operating characteristic–area under the curve (ROC-AUC) for each classifier, providing a better illustration of their performance.

For binary classification, we use a 2 × 2 confusion matrix to show the prediction results of a classifier, and Table 3 displays true positive (TP), true negative (TN), false negative (FN), and false positive (FP).

Table 3

The confusion matrix of a binary classifier

| | Predicted 1 | Predicted 0 |
|---|---|---|
| Actual 1 | True positive (TP) | False negative (FN) |
| Actual 0 | False positive (FP) | True negative (TN) |

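Accuracy, Precision, Recall, and F1 score are defined as follows:
$$
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{11}
$$
$$
\mathrm{Precision} = \frac{TP}{TP + FP} \tag{12}
$$
$$
\mathrm{Recall} = \frac{TP}{TP + FN} \tag{13}
$$
$$
F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{14}
$$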

Given that this study involves multiple categories, the performance metrics can be computed independently for each category. Specifically, samples from a specific category are treated as positive, while the remaining categories are considered negative. The final result is obtained by averaging the metrics across all categories. To account for the imbalanced distribution of the dataset, a macro approach is utilized to calculate the average, which does not take into account the proportion of each category in the dataset. This approach results in a greater penalty when the model performs poorly in minority categories. All results presented in this study are macro-averaged.
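The one-vs-rest, macro-averaged computation described above can be sketched as follows. The function name is hypothetical, and treating an undefined ratio (zero denominator) as 0.0 is an assumption; the text does not say how such cases are handled.

```python
def macro_metrics(y_true, y_pred, n_classes):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    precisions, recalls, f1s = [], [], []
    for k in range(n_classes):
        # treat category k as positive and all other categories as negative
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != k and p == k)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == k and p != k)
        prec = tp / (tp + fp) if tp + fp else 0.0  # assumed 0.0 when undefined
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        # unweighted mean: every category counts equally, however rare
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }
```

Because macro-F1 is the mean of the per-category F1 scores, it is generally not equal to the F1 score computed from the macro-averaged precision and recall.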

The performance of a classifier is also commonly evaluated using the ROC curve and the area under it (ROC-AUC). The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at various thresholds. A higher ROC-AUC value corresponds to better classification performance, as it indicates a superior trade-off between TPR and FPR. Therefore, ROC-AUC is an important metric for evaluating the performance of a classifier and can aid in selecting the best classifier and optimizing its parameters.
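For the multi-class case, the area under the curve can be computed per category in a one-vs-rest fashion. The sketch below uses the rank-based (Mann-Whitney) form of the area, which equals the area under the ROC curve; the function names are hypothetical, and returning `None` for a category with no positive or no negative samples is an assumption about how undefined cases are reported.

```python
def binary_auc(labels, scores):
    """Area under the ROC curve for 0/1 labels, via pairwise comparisons."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # undefined when one of the two classes is absent
    # fraction of (positive, negative) pairs ranked correctly; ties count half
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, probs, n_classes):
    """Per-category AUC: category k is positive, all others negative."""
    return [
        binary_auc([1 if t == k else 0 for t in y_true], [p[k] for p in probs])
        for k in range(n_classes)
    ]
```

A category that never occurs in the test data has no positive samples, so its curve and area cannot be computed.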

Data preprocessing

To normalize the data and reduce errors from data magnitude differences, we use min–max scaling to map the values to the [0, 1] range. Then, we divide the data into training and testing sets: 1961–2014 for training and 2015–2018 for testing. We randomly select 20% of the data from the training set as a validation set.
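The scaling and the chronological split can be sketched as follows. Fitting the minimum and maximum on the training years only is an assumption (the text does not say which data the scaling statistics come from), and the record layout with a `year` key is hypothetical.

```python
def fit_min_max(rows):
    """Return per-feature minima and maxima of a list of feature rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lo, hi):
    """Scale each feature to [0, 1]; constant features map to 0.0."""
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]


def split_by_year(records, last_train_year=2014):
    """Chronological split: years up to 2014 for training, later years for testing."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test
```

If the statistics are fitted on the training period only, scaled test values can fall slightly outside [0, 1] whenever the test period contains values more extreme than any seen during training.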

Parameter settings

To prevent overfitting, a dropout layer with a 20% rate is incorporated into the model. The Adam optimization algorithm (Kingma & Ba 2015) is utilized to iteratively update the weight parameter matrix, with a learning rate of 0.0001. We apply early stopping with a patience of 10, which stops training when the validation loss has not decreased for 10 consecutive epochs. After multiple tests, we found that a batch size of 32 is optimal for the model. We search the time step (the number of past months used as input) between 5 and 15 and choose the best value based on the prediction results of each model at each time scale. Through experiments, we set the time step for the 1-, 3-, 6-, 12-, and 24-month time scales to 5, 8, 10, 12, and 15, respectively; that is, we use the data of the past 5, 8, 10, 12, and 15 months to predict the drought category for the next month at each scale.
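The look-back windows and the early-stopping rule can be sketched as follows. Both helpers are illustrative: the names are hypothetical, and the stopping rule counts any epoch whose validation loss fails to improve on the best value so far, which is one common reading of a patience of 10.

```python
def make_windows(features, labels, timestep):
    """Pair each run of `timestep` consecutive months with the next month's label."""
    X, y = [], []
    for end in range(timestep, len(features)):
        X.append(features[end - timestep:end])  # months end-timestep .. end-1
        y.append(labels[end])                   # category of the following month
    return X, y


class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

For example, with a time step of 5, a series of 8 months yields 3 samples, each using 5 months of input to predict the category of the sixth.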

Results

To practically validate the prediction accuracy of MSTSN for drought category prediction, we compare MSTSN with classical deep learning models, including LSTM, fully convolutional network (FCN), MLSTM-FCN, and GNN-RNN. These four models are briefly introduced as follows:

LSTM (Dikshit et al. 2021): a variant of recurrent neural networks designed to capture long-term dependencies in sequential data.

FCN (Wang et al. 2017): a neural network architecture that relies on convolutional layers to extract features from the input time series data. Subsequently, global pooling and fully connected layers are utilized for classification.

MLSTM-FCN (Karim et al. 2019): a hybrid architecture that integrates both LSTM and FCN designs to effectively capture both short-term and long-term temporal dependencies within time series data.

GNN-RNN (Fan et al. 2022): the combination of GNN and RNN is a powerful technique that synergistically blends spatio-temporal information, enabling accurate predictions that incorporate both geospatial and temporal dependencies.

To ensure a fair comparison, we selected the best-performing hyperparameters for each model. Table 4 shows the chosen hyperparameters for MSTSN. The detailed results are presented in Table 5, which displays the performance of the five models (LSTM, FCN, MLSTM-FCN, GNN-RNN, and MSTSN) in predicting drought categories at different time scales.

Table 4

Parameters list

| Parameters | Value |
|---|---|
| Learning rate | 0.0001 |
| Batch size | 32 |
| Dropout rate | 0.2 |
| Optimizer | Adam |
| Regularization strategy | Early stopping |

Table 5

The results of the five models at different time scales

| Time scale | Model | Accuracy | F1 | Recall | Precision |
|---|---|---|---|---|---|
| 1 month | LSTM | 0.573 | 0.179 | 0.222 | 0.159 |
| | FCN | 0.639 | 0.243 | 0.226 | 0.271 |
| | MLSTM-FCN | 0.576 | 0.260 | 0.254 | 0.284 |
| | GNN-RNN | 0.604 | 0.265 | 0.260 | 0.296 |
| | MSTSN | **0.655** | **0.289** | **0.279** | **0.311** |
| 3 months | LSTM | 0.658 | 0.468 | 0.205 | 0.242 |
| | FCN | 0.649 | **0.473** | **0.469** | 0.478 |
| | MLSTM-FCN | 0.610 | 0.196 | 0.210 | 0.199 |
| | GNN-RNN | 0.654 | 0.460 | 0.386 | 0.387 |
| | MSTSN | **0.661** | 0.449 | 0.425 | **0.494** |
| 6 months | LSTM | 0.801 | 0.296 | 0.333 | 0.267 |
| | FCN | 0.743 | 0.417 | 0.424 | 0.471 |
| | MLSTM-FCN | 0.721 | 0.606 | 0.625 | 0.614 |
| | GNN-RNN | 0.734 | 0.446 | 0.517 | 0.468 |
| | MSTSN | **0.871** | **0.687** | **0.710** | **0.668** |
| 12 months | LSTM | 0.716 | 0.299 | 0.311 | 0.296 |
| | FCN | 0.761 | 0.534 | 0.522 | 0.553 |
| | MLSTM-FCN | 0.736 | 0.509 | 0.495 | 0.527 |
| | GNN-RNN | 0.816 | 0.652 | 0.612 | 0.647 |
| | MSTSN | **0.847** | **0.709** | **0.708** | **0.721** |
| 24 months | LSTM | 0.703 | 0.478 | 0.478 | 0.499 |
| | FCN | 0.768 | 0.718 | 0.735 | 0.723 |
| | MLSTM-FCN | 0.738 | 0.554 | 0.431 | 0.831 |
| | GNN-RNN | 0.842 | 0.766 | **0.775** | 0.834 |
| | MSTSN | **0.870** | **0.803** | 0.768 | **0.843** |

In performance comparisons of multiple algorithms, the best performing results are shown in bold.

In terms of accuracy, MSTSN performs best at all five time scales. The ranking of the baselines varies with the time scale; averaged over the five scales, GNN-RNN ranks second and MLSTM-FCN last. For example, at the 6-month time scale, MSTSN's accuracy is 15 percentage points higher than MLSTM-FCN's (0.871 versus 0.721). The models that use graph neural networks to extract spatial characteristics, MSTSN and GNN-RNN, are on average more accurate than the methods that consider only temporal characteristics. MSTSN is also more accurate than GNN-RNN, which can be attributed to the effectiveness of the TEM module. This module uses a GRU to extract the long-term dependencies of the data and enhances its effect by assigning a weight to each month through multi-head self-attention. In terms of F1 score, MSTSN scores highest at every time scale except 3 months, where FCN (0.473), LSTM (0.468), and GNN-RNN (0.460) score slightly higher than MSTSN (0.449). A high macro-F1 score means the model classifies the drought categories accurately and evenly, without bias toward the majority class. This suggests that MSTSN is superior to the other methods, particularly for small-scale, high-intensity droughts.
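The difference between accuracy and macro-F1 on imbalanced categories can be illustrated with a small sketch. The labels below are toy values invented for illustration, not data from this study, and the helper names `accuracy` and `macro_f1` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted category equals the actual one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-category F1 scores."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (assumed values): 8 samples of category 0, 2 of category 1.
y_true = [0] * 8 + [1] * 2
majority_only = [0] * 10               # always predicts the majority category
mixed = [0] * 7 + [1] + [1, 1]         # one category-0 sample misclassified

acc_majority = accuracy(y_true, majority_only)
f1_majority = macro_f1(y_true, majority_only, labels=[0, 1])
acc_mixed = accuracy(y_true, mixed)
f1_mixed = macro_f1(y_true, mixed, labels=[0, 1])
```

In this toy case the majority-only predictor still reaches an accuracy of 0.8, but its macro-F1 falls to about 0.44 because the minority category is never predicted. This is why the F1 column of Table 5 is more informative than accuracy for the rarer drought categories.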

Overall, the five models perform worst at the 1-month time scale, and their predictive accuracy generally improves as the time scale increases. This improvement may be attributed to the reduction in data and to the tendency of the data sequences to become more stable at longer time scales.

To illustrate the performance of each model on each category, we show the AUC–ROC curves for the 6-month time scale. For each category, we calculate the false positive rate (FPR) and the true positive rate (TPR) at each threshold from the predicted probabilities and labels of the samples. The multi-class AUC–ROC plot shows the classifier's performance on each category, and the area under each curve measures the overall performance of the classifier on that category. Figure 5 shows the curves for each model and each drought category. Drought categories 3 and 4 have no instances at the 6-month time scale, so their areas are empty. Although MSTSN does not have the largest area under the curve for every category, its area is consistently large, indicating that MSTSN is more stable than the other four models. LSTM performs the worst among the five models.
Figure 5

AUC–ROC curves of the five models at the 6-month time scale. (a) LSTM. (b) FCN. (c) MLSTM-FCN. (d) GNN-RNN. (e) MSTSN.
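The per-category FPR/TPR computation described above can be sketched as a one-vs-rest procedure. This is a minimal illustration under stated assumptions: the probabilities and labels are toy values, the function names are hypothetical, and the area is computed with the trapezoidal rule, which may differ from the tooling used to draw Figure 5.

```python
def roc_points(scores, is_positive):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Samples with the same score cross the threshold together.
        while i < len(order) and scores[order[i]] == threshold:
            if is_positive[order[i]]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def one_vs_rest_auc(probs, labels, category):
    """Area under the ROC curve of `category` against all other categories.

    Returns None when the category (or its complement) has no samples,
    because TPR (or FPR) is undefined in that case.
    """
    is_positive = [label == category for label in labels]
    if not any(is_positive) or all(is_positive):
        return None
    points = roc_points([p[category] for p in probs], is_positive)
    # Trapezoidal rule over consecutive (FPR, TPR) points.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy predicted probabilities for three categories (assumed values).
labels = [0, 0, 0, 1, 1, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.6, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.2, 0.6],
]
auc_by_category = {c: one_vs_rest_auc(probs, labels, c) for c in range(3)}
```

Returning `None` for a category without samples mirrors the empty areas reported for drought categories 3 and 4 at the 6-month time scale.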

Sensitivity analysis

To further verify the effectiveness of our model, we compared the accuracy and F1 values of the different models in predicting the drought category divided by SPEI-6 at different time steps. The results are shown in Figure 6. We varied the time step between 5 and 15 to examine the impact of the input length on model performance. First, MSTSN has the highest accuracy and F1 values at every time step, indicating that the model is robust to the input length on this dataset. In contrast, MLSTM-FCN has the lowest accuracy and LSTM has the lowest F1 value. Second, the accuracy and F1 values of our model peak at a time step of 10. Therefore, we set the time step to 10 when predicting SPEI-6. To keep the comparison fair, each of the other models also uses the time step at which its own accuracy is highest.
Figure 6

(a) Comparison of prediction accuracy for baseline models at different time scales. (b) Comparison of F1 score for baseline models at different time scales.
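The role of the time step as an input length can be sketched with a sliding window over a monthly series. This is an illustrative assumption about how samples might be built, not the authors' preprocessing: the function name `make_windows` is hypothetical, the series is a toy sequence, and pairing each window with the category of the following month is assumed rather than stated in the text.

```python
def make_windows(series, categories, time_step):
    """Pair each run of `time_step` consecutive months with the next month's category."""
    samples = []
    for start in range(len(series) - time_step):
        window = series[start:start + time_step]
        target = categories[start + time_step]   # assumed one-month-ahead target
        samples.append((window, target))
    return samples


# Toy monthly values and categories (assumed, 20 months).
series = [round(0.1 * m, 1) for m in range(20)]
categories = [m % 3 for m in range(20)]

# Sample counts for three of the time steps in the range examined above.
counts = {t: len(make_windows(series, categories, t)) for t in (5, 10, 15)}
first_window, first_target = make_windows(series, categories, 10)[0]
```

A longer time step gives each sample more history but leaves fewer samples from the same series, which is one reason an intermediate value can work best.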

Error analysis

To analyze the errors of the test results in detail, we visualize the confusion matrix of the test results in Figure 7. The horizontal axis represents the predicted labels, the vertical axis represents the actual labels, and the color brightness represents the prediction probability. The brighter a cell on the diagonal, the higher the accuracy for that category. Since there are no instances of drought categories 3 and 4 at the 6-month time scale, they are not shown in the figure. The prediction accuracy is highest for drought category 0, at more than 90%, and lowest for drought category 1. One reason is the data imbalance: drought category 0 has the highest proportion of test cases, while drought category 1 has the lowest. The more samples a category has, the better the model learns it, and vice versa. Another reason is the division of drought categories: when an SPEI value lies close to the boundary between two categories, it is difficult to tell which category it belongs to. Therefore, the model does not perform well on drought categories with few samples, although it improves on the other methods.
Figure 7

Visualization of confusion matrix of test results.
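A row-normalised confusion matrix of the kind described above can be sketched as follows. The labels are toy values chosen for illustration, not the test results of this study, and the function names are hypothetical. Rows are actual categories and columns are predicted categories, so each diagonal entry is the share of a category's samples that was predicted correctly.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Count samples by (actual category, predicted category)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def row_normalise(matrix):
    """Divide each row by its total so that rows sum to 1 (empty rows stay 0)."""
    normalised = []
    for row in matrix:
        total = sum(row)
        normalised.append([count / total if total else 0.0 for count in row])
    return normalised


# Toy labels (assumed): category 0 dominates, category 1 is rare.
y_true = [0] * 8 + [1] * 2 + [2] * 2
y_pred = [0, 0, 0, 0, 0, 0, 0, 1] + [0, 1] + [2, 2]

counts = confusion_matrix(y_true, y_pred, n_classes=3)
rates = row_normalise(counts)
per_category_accuracy = [rates[c][c] for c in range(3)]
```

With so few samples in the rare category, a single misclassification moves its diagonal entry by a large amount, which is consistent with the weaker results reported for the smallest category.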

Ablation experiment

To evaluate how the SAM module and the two components of the TEM module (the GRU layer and the multi-head self-attention layer) contribute to our model's drought prediction performance, we perform ablation experiments at the five time scales. Table 6 shows the evaluation metrics averaged over the time scales for each ablation setting. ‘MSTSN-SAM’ denotes the model with the SAM module removed, while ‘MSTSN-TEM-GRU’ and ‘MSTSN-TEM-ATT’ denote the model with the corresponding layer removed from the TEM module.

Table 6

Average results of ablation experiments at different time scales

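| Model | Accuracy | F1 | Recall | Precision |
| --- | --- | --- | --- | --- |
| MSTSN | **0.780** | **0.587** | **0.578** | **0.607** |
| MSTSN-SAM | 0.742 | 0.547 | 0.535 | 0.565 |
| MSTSN-TEM-GRU | 0.754 | 0.554 | 0.554 | 0.564 |
| MSTSN-TEM-ATT | 0.762 | 0.549 | 0.546 | 0.567 |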

In performance comparisons of multiple algorithms, the best performing results are shown in bold.

Table 6 shows that MSTSN outperforms the model without the SAM module, which confirms the correlation of drought conditions among adjacent grid points and the effectiveness of graph neural networks for spatial feature extraction. By handling long-term dependencies and capturing temporal relationships, the GRU layer helps the model predict the future drought trend from the historical drought situation and improves the accuracy by 2.6 percentage points (0.780 versus 0.754). Since the data of each month affect drought differently, the multi-head self-attention layer assigns a weight to each month, allowing the model to focus on key months, which improves the accuracy by 1.8 percentage points (0.780 versus 0.762). Overall, SAM has the greatest impact on the model: removing it reduces the accuracy by 3.8 percentage points (0.780 versus 0.742), which shows that introducing graph neural networks has an important influence on drought prediction.
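The accuracy differences quoted above follow directly from the accuracy column of Table 6. The short check below reproduces the arithmetic; the variable names are illustrative only.

```python
# Average accuracy of each ablation setting, copied from Table 6.
accuracy = {
    "MSTSN": 0.780,
    "MSTSN-SAM": 0.742,
    "MSTSN-TEM-GRU": 0.754,
    "MSTSN-TEM-ATT": 0.762,
}

full = accuracy["MSTSN"]
# Accuracy lost when a component is removed, in percentage points.
drops = {
    name: round((full - value) * 100, 1)
    for name, value in accuracy.items()
    if name != "MSTSN"
}
most_important = max(drops, key=drops.get)
```

The largest drop belongs to the setting without SAM, which matches the statement that the spatial module contributes most.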

Visualization of results

We use ArcMap to visualize the actual drought categories of 69 grid points from 2015 to 2018, based on SPEI-6, along with the drought categories predicted by the five models: LSTM, FCN, MLSTM-FCN, GNN-RNN, and MSTSN. Figure 8 shows the actual drought categories from 2015 to 2018 and the drought categories predicted by each model in Henan Province. According to the China Meteorological Statistical Yearbook, there was no large-scale meteorological drought in Henan Province in 2015. In 2016, the summer and autumn droughts in Henan Province were more severe. In 2017, most of Henan Province experienced drought, mainly concentrated in July–August. The climate was suitable in 2018, and there was no large-scale severe drought. The actual drought categories divided by SPEI-6 in Figure 8 are roughly consistent with this recorded drought situation. Therefore, SPEI-6 is suitable for drought prediction in Henan Province.
Figure 8

Spatial variation of the drought category range from 2015 to 2018. The columns depict the predicted values of LSTM, FCN, MLSTM-FCN, GNN-RNN, MSTSN, and the actual drought category from left to right, respectively. Legends represent drought categories.

Figure 8 shows that MSTSN's prediction is the closest to the actual spatial distribution of drought and captures the approximate time and range of drought, followed by GNN-RNN. LSTM fails to distinguish the drought categories and predicts every category as 0, which indicates that it cannot predict categories with small sample sizes well. FCN and MLSTM-FCN have lower prediction accuracy, especially for the first half of 2015 and 2018. When small-scale drought occurs, all five models show some deviations.

In this study, we have developed a novel MSTSN for predicting drought categories. The model consists of two modules: the SAM and the TEM. The SAM module aggregates information from neighboring grid points to incorporate spatial information and generate a new feature matrix. This matrix is then fed into the TEM module, which extracts temporal features. Finally, the extracted features are passed to a softmax layer to obtain the final drought category predictions.
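The data flow summarized above (neighbor aggregation, then a softmax over category scores) can be sketched in plain Python. This is a simplified illustration, not the MSTSN implementation: mean aggregation stands in for the graph neural network layer, the TEM step is omitted, and the distance radius, coordinates, features, and logits are all assumed toy values.

```python
import math


def neighbours_within(points, radius):
    """For each grid point, list the other points at most `radius` away."""
    adjacency = []
    for i, (xi, yi) in enumerate(points):
        adjacency.append([
            j for j, (xj, yj) in enumerate(points)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        ])
    return adjacency


def aggregate(features, adjacency):
    """Average each point's feature vector with those of its neighbours."""
    aggregated = []
    for i, neighbours in enumerate(adjacency):
        group = [features[i]] + [features[j] for j in neighbours]
        aggregated.append([sum(column) / len(group) for column in zip(*group)])
    return aggregated


def softmax(logits):
    """Turn raw category scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy grid: two nearby points and one isolated point (assumed coordinates).
points = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]

adjacency = neighbours_within(points, radius=1.0)
spatial_features = aggregate(features, adjacency)
probabilities = softmax([2.0, 1.0, 0.1])          # assumed category scores
predicted_category = probabilities.index(max(probabilities))
```

In the sketch, the two nearby points end up with shared features while the isolated point keeps its own, which is the sense in which spatial aggregation lets neighboring areas inform each other.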

We compare MSTSN with common deep learning models and find that MSTSN has the highest prediction accuracy and F1 score at most time scales, indicating its ability to predict small-scale and high-intensity droughts. Additionally, we find a positive correlation between prediction accuracy and the SPEI time scale. Ablation experiments are conducted to analyze the contribution of each module. The results show that the SAM module improves the model's performance by leveraging graph neural networks to extract spatial features. The TEM module captures long-term dependencies through a GRU and enhances its effectiveness with multi-head self-attention, which assigns a weight to each month.

Although this model improves the accuracy of drought prediction by considering the correlation between adjacent regions and the temporal dependence of the data, it has some limitations. The study relies solely on distance to determine adjacency between grid points, disregarding important geographical features. Additionally, the GRU struggles to capture long-term temporal dependencies, leading to prediction errors in long-term tasks and an inability to accurately reflect real change trends.

In summary, this study proposes a novel drought category prediction model, MSTSN, and showcases its effectiveness and superiority through extensive experiments. By introducing a GNN and considering geographical knowledge, the model surpasses the limitations of treating each area as independent, resulting in improved prediction accuracy. This has significant implications for industries, such as agriculture, water resource management, and climate monitoring, providing more accurate drought monitoring and forecasting information for decision-makers.

However, future research should incorporate additional factors that influence drought, such as human factors, terrain, and other natural factors. Moreover, geographic features can be utilized to determine adjacency between grid points more accurately. To predict long-term drought conditions, advanced deep learning methods like transformers can be explored to capture temporal dependency relationships effectively.

This work was supported in part by the National Key Research and Development Program of China (No. 2021YFE014400), the National Natural Science Foundation of China (No. 62102187), and the Science and Technology Development Fund of Egypt (No. 43088).

1 GB/T20481-2017, National Standard of the People's Republic of China

Name of the code/library: MSTSN

Contact: e-mail and phone number

Hardware requirements: CPU: Intel Xeon Gold 6330; GPU: NVIDIA GeForce RTX A5000; Memory: 30 G.

Programming language: Python 3.7

Software required: PyCharm, Anaconda3, Pandas, NumPy

Program size: 37KB

The source codes are available for download at the link: https://github.com/nuist-yjx/MSTSN.

All relevant data are available from an online repository or repositories: https://crudata.uea.ac.uk/cru/data/hrg/.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).