Climate change and water supply shortages are pressing global concerns. Drought, a complex and often underestimated phenomenon, profoundly affects many aspects of human life, so early drought forecasting is crucial for strategic planning and water resource management. This study introduces a hybrid model that combines the wavelet transform with the Autoregressive Integrated Moving Average (ARIMA) model, referred to as Wavelet ARIMA (W-ARIMA), to improve drought prediction accuracy. We analyze monthly precipitation data for Kabul, Afghanistan, from January 1970 to December 2019, using the standardized precipitation index (SPI) at multiple timescales (SPI 3, SPI 6, SPI 9, and SPI 12). A comparison against the conventional ARIMA approach shows that the W-ARIMA model performs better. The Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) all improve under W-ARIMA, most notably for SPI 12 forecasting. Additional metrics, namely R-square, NSE, PBIAS, and KGE, consistently favor the W-ARIMA model. These results highlight the value of the hybrid model for drought forecasting in Kabul, Afghanistan, and its relevance to addressing drought within the broader context of climate change and water resource management.

  • A hybrid wavelet autoregressive integrated moving average (W-ARIMA) model is proposed for drought forecasting based on the SPI.

  • According to the statistical metrics, the hybrid W-ARIMA model significantly outperforms the individual ARIMA model in drought forecasting.

As an environmental catastrophe that occurs in almost every climate, drought has in recent years attracted the attention of researchers in diverse fields, including agriculture, meteorology, environment, and ecology (Mishra & Singh 2010). Drought, meaning a shortage of soil moisture over a particular period together with declining water supplies in surface and groundwater reservoirs, has a negative impact on human life. Because drought is difficult to anticipate and has adverse effects on society, forecasting its occurrence is among the logical measures for diminishing its impact (Wilhite 2005).

Drought can lead to different impacts in different regions. Drought indices (DIs), which are functions of hydro-meteorological variables such as precipitation and streamflow, are frequently used to analyze these impacts. DIs can be used to evaluate all four types of drought: meteorological, hydrological, agricultural, and socioeconomic. However, the onset of a drought cannot be known with certainty, so drought events must be monitored with indices; because drought varies temporally and spatially, the availability of hydro-meteorological data and the choice of DI should be assessed (Khan et al. 2020). Meteorological droughts emerge gradually, offering time for preparation and alertness, unlike sudden disasters. These droughts, initially mild in impact, provide a chance for proactive water conservation and sustainable practices. As they progress, public awareness about water use increases, encouraging positive behavioral shifts. Moreover, meteorological droughts drive scientific advancements, inspiring innovations in prediction and resilience technologies. Importantly, their gradual onset reduces immediate risks to life and infrastructure, giving communities more adaptation time. While benefits can vary, embracing these opportunities enables societies to enhance resilience, adopt sustainable measures, and effectively manage a range of drought-related challenges. The significance of the meteorological viewpoint arises from its capacity to serve as an initial indicator of impending drought conditions (Wu et al. 2004; Eslamian et al. 2017). Several indices, including the standardized precipitation evapotranspiration index, the effective drought index, and the standardized precipitation index (SPI), have previously been used to evaluate meteorological drought.
SPI is one of the most commonly used DIs, offering an approach for assessing drought severity in accordance with the guidelines set forth by the World Meteorological Organization (WMO) (Hayes et al. 2011). In this study, SPI has been chosen because of its inherent benefits. First, SPI is calculated from precipitation alone, which is a great advantage in areas without access to soil moisture, temperature, and evaporation data. Second, SPI was introduced as a means of assessing precipitation deficits across various timescales; shorter or longer timescales may reflect lags in the response of different water resources to precipitation anomalies (Mishra & Desai 2005). Finally, SPI is comparatively simple to calculate and, being a standardized index, can be applied in many different regions (Guttman 1998; Zargar et al. 2011).

There are several approaches for predicting time series events, including the autoregressive integrated moving average (ARIMA) model, which is well known for its accuracy in forecasting time-oriented occurrences. ARIMA has been widely used in time series forecasting, such as streamflow and drought forecasting, because of its benefits over other methods such as exponential smoothing and neural networks (Mishra & Desai 2005). For example, ARIMA can effectively account for serial correlation, which is commonly observed in time series. It also provides a systematic search procedure, comprising identification, estimation, and diagnostic checking, for selecting a suitable model. ARIMA is a preferred approach for many types of time series data owing to its flexibility and prediction precision (Zhang 2003; Mishra & Desai 2005). However, according to Nourani et al. (2009), ARIMA is best suited to linear and stationary datasets and has a limited ability to capture nonlinear and nonstationary time-oriented data.

Several methods are available in the literature for time series forecasting, such as simple moving averages (MAs), linear regression, neural networks, autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA). These methods analyze historical records to predict future events; although time series are not deterministic, researchers often treat them as stationary series. One way of modeling a time series is to regard it as the combination of a deterministic function and white noise. Using a de-noising procedure such as the wavelet transform, the white noise in a time series can be minimized and a better model can be obtained (Al Wadia & Ismail 2011).

In many fields of mathematical forecasting, the wavelet transform has frequently been used alongside stochastic and artificial intelligence methods to improve prediction precision (Nourani et al. 2014). Wang & Ding (2003) reported the wavelet transform as a useful tool for drought forecasting. They combined the wavelet transform with an artificial neural network (ANN); the resulting model drew on the merits of both and increased the accuracy of drought prediction. Kriechbaumer et al. (2014) used the wavelet transform as a data preprocessing step to improve the accuracy of the ARIMA model in metal price forecasting, and their results confirmed its usefulness. Venkata Ramana et al. (2013) combined the wavelet transform with ANN to predict monthly rainfall and assessed the calibration and validation performance of the model with appropriate statistical criteria. Their results showed that the hybrid wavelet ANN significantly improves the accuracy of monthly rainfall prediction over a single ANN. The wavelet transform has gained wide attention among researchers for analyzing trends, periodicities, and variations in time series (Seo et al. 2017; Zhou et al. 2017; Quilty et al. 2019). Applying the wavelet transform represents a signal in both the time and frequency domains and provides a reliable picture of the structure of the underlying process to be modeled. Each decomposed subseries yields detailed information about the data structure and its periodicity, and previous research has shown that the wavelet-based approach is a promising technique for dealing with time-oriented datasets (Kim & Valdés 2003; Partal & Kişi 2007; Rathinasamy et al. 2014).
Recently, the integration of wavelets with stochastic models such as ARIMA and with artificial intelligence models such as ANN has increased remarkably. In these hybrids, the wavelet transform serves as a data preprocessing technique that de-noises the inputs of hydrologic time series and improves on the single ANN model, which is unable to deal with nonstationary series (Adamowski et al. 2012; Shabri 2015; Belayneh et al. 2016; Soh et al. 2018).

A study by Belayneh & Adamowski (2012) compared three data-driven techniques, SVM, ANN, and WANN, for drought prediction in the Awash River Basin of Ethiopia based on SPI values. The performance of the models was assessed using root-mean-square error (RMSE), mean absolute error (MAE), and R2, and the hybrid WANN model forecast drought more accurately than the other two methods. Two years later, the authors compared the traditional stochastic model (ARIMA), SVR, and ANN for drought prediction in the same river basin and applied the wavelet transform as a data preprocessing technique to those models. In that study, WSVR was implemented for the first time for predicting drought events. According to the statistical measures, WSVR performed best for long-term (6- and 12-month lead time) drought forecasting based on SPI values (Belayneh et al. 2014). A new combined model, wavelet linear genetic programming (WLGP), was applied to predict drought events based on Palmer's modified drought index. The results confirmed the advantage of WLGP over traditional genetic programming models for long-term drought prediction, while the simple genetic model was unable to model over a 3-month lead time (Danandeh Mehr et al. 2014). To analyze the accuracy of drought modeling, Djerbouai & Souag-Gamane (2016) compared stochastic models (ARIMA/seasonal ARIMA (SARIMA)) with ANN models using SPI values for different lead times in the Algerois basin, Algeria. They used the wavelet transform as a data preprocessing technique to improve the ability of the ANN model.
Their results show that the wavelet transform significantly increases the accuracy of the model over ANN alone, based on statistical performance measures such as the Nash–Sutcliffe efficiency (NSE) coefficient, RMSE, and MAE, for all SPI time series at lead times ranging from 1 to 6 months. A new hybrid model, the wavelet-based extreme learning machine (WELM), was proposed for predicting DIs at three distinct stations in Australia. It was compared with the extreme learning machine (ELM), ANN, LSSVR, and their wavelet equivalents (WANN and WLSSVR). The performance measures and statistical criteria showed that WELM predicts drought incidents more accurately than ELM, ANN, LSSVR, and their wavelet counterparts. Moreover, WANN produced satisfactory outputs with simple computation and lower frequency error rates. The study illustrates the efficiency of the wavelet transform as a screening step for input data in improving drought forecasting models (Deo et al. 2017).

The main objective of this research article is to investigate the predictive accuracy of the ARIMA and wavelet autoregressive integrated moving average (W-ARIMA) models using the SPI for drought forecasting. In addition, this study aims to compare the predictive performance of the ARIMA model with the hybrid W-ARIMA model. The SPI values of varying timescales, including SPI3, SPI6, SPI9, and SPI12, are utilized to assess both short- and long-term drought conditions. While the ARIMA model has gained popularity in recent hydrological time series prediction, it exhibits limitations in handling nonlinear and nonstationary data, characteristics inherent to SPI series. This article addresses the dearth of academic research on drought forecasting in Kabul, Afghanistan – a region susceptible to the socioeconomic repercussions of drought. Introducing both the ARIMA and W-ARIMA models for precise prediction, the study undertakes a comparative analysis of their effectiveness.

Study area and data

Afghanistan has experienced below-normal to severe drought, which has affected the availability of and access to safe drinking water. According to the Afghanistan Assessment Report (Mayar 2021), around 79% of households reported inadequate water access for daily activities such as drinking, cooking, bathing, and hygiene. Drought forecasting therefore gives policymakers and water supply managers a view of the coming years, so that they can take effective measures in advance to mitigate its devastating effects. In this study, the monthly precipitation data for Kabul, the capital and largest city of Afghanistan, from January 1970 to December 2019 are extracted from the World Bank Climate Change Knowledge Portal (https://climateknowledgeportal.worldbank.org). Kabul is located in the east-central part of Afghanistan at a latitude of 34.543896 and a longitude of 69.160652. The city lies in a narrow strip close to the Hindu Kush mountains. The total area of Kabul is estimated at around 400 square miles, and the Kabul River (Darya-e Kabul) crosses the city from its eastern to its western side. Kabul Province is home to the capital city and features a diverse landscape of mountains, valleys, and plains. The province has a dry to semi-dry climate with distinct seasons, and because rainfall is low and irregular, it frequently experiences drought. The Kabul River and its tributaries are used for irrigation to grow crops such as wheat and fruits, which are a major part of the local economy. Droughts therefore pose serious hazards because they affect crops, water supplies, and daily living. Owing to Kabul's high population density, droughts have a significant negative influence on both social and economic conditions. Modern technologies and data analysis techniques may help with effective prediction and response planning. Figure 1 shows the study area.

In this study, we utilize monthly precipitation data from Kabul, Afghanistan, for the purpose of model validation. The country has witnessed a substantial increase in drought severity between 1901 and 2010. Despite this, there exists a scarcity of scientific works pertaining to drought prediction in Afghanistan. This study stands out as one of the pioneering pieces of research focusing on drought forecasting in Afghanistan through the implementation of data-driven approaches. The dataset for this case study encompasses 50 years of historical monthly time series precipitation records, spanning from January 1970 to December 2019. Subsequently, these records are divided into distinct training and testing sets. The training set comprises 80% of the total data, while the testing set accounts for the remaining 20%. Table 1 presents a comprehensive overview of descriptive statistics for Kabul Province, the geographical area under investigation, encompassing key metrics such as mean, median, standard deviation, skewness, and kurtosis. See Supplementary Data for details.
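The 80/20 chronological split described above can be illustrated with a minimal Python sketch. The function name `chronological_split` and the `(year, month)` labels are hypothetical placeholders, not the authors' code (the study itself reports using MATLAB and R); the sketch only shows where the cut falls for 600 monthly records.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into a leading training part and a trailing test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


# Placeholder labels for 50 years of monthly records (January 1970 to December 2019).
months = [(year, month) for year in range(1970, 2020) for month in range(1, 13)]
train, test = chronological_split(months)

print(len(train), train[-1])  # size of the training set and its last month
print(len(test), test[0])     # size of the test set and its first month
```

No shuffling is applied, since the order of observations carries the serial dependence that the forecasting models rely on.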

Table 1

Descriptive statistics of monthly precipitation data for Kabul, Afghanistan

Location | Mean | Standard deviation | Min | Median | Max | Skewness | Kurtosis
Kabul (KBL) | 45.412 | 36.890 | 0.610 | 33.120 | 200.270 | 1.274 | 1.512

Standardized precipitation index

The SPI is a common drought index frequently used for identifying drought events. It was initially suggested by McKee et al. (1995) and is computed from historical precipitation data. It takes both positive and negative values, with positive values indicating a precipitation surplus and negative values indicating a shortage. Despite the existence of several other DIs, the WMO sought a standard index that can be used as a starting point in every region and country. Many other indices can be applied only in specific areas and require large datasets and complex procedures (Yihdego et al. 2019). As mentioned earlier, SPI is based solely on precipitation, making it relatively easy to evaluate compared with other indices such as the Palmer index and the crop moisture index (Cacciamani et al. 2007). SPI has been considered an appropriate index for representing drought variability in Eastern Africa (Ntale & Gan 2003), and it can describe drought at multiple timescales, which is a major benefit of using SPI to predict drought occurrences.

SPI is computed by fitting a probability distribution to the aggregated monthly precipitation series (3, 6, 9, 12, and 24 months); the fitted probability density function (PDF) is then transformed into a standardized normal index whose values classify the drought category at each location and timescale (Belayneh & Adamowski 2013).

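The gamma distribution is used to fit long periods of historical rainfall records because it closely matches the sequence of rainfall episodes. The PDF of the gamma distribution is shown in Equation (1).
$$g(x) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, \quad x > 0 \tag{1}$$
where the scale parameter, shape parameter, quantity of precipitation, and gamma function are represented by $\beta$, $\alpha$, $x$, and $\Gamma(\alpha)$, respectively. The best estimates of $\alpha$ and $\beta$ are provided by Equations (2) and (3) (Guttman 1999).
$$\hat{\alpha} = \frac{1}{4A}\left(1 + \sqrt{1 + \frac{4A}{3}}\right) \tag{2}$$
$$\hat{\beta} = \frac{\bar{x}}{\hat{\alpha}} \tag{3}$$
$$A = \ln(\bar{x}) - \frac{\sum \ln(x)}{n} \tag{4}$$
where the rainfall average and the total number of observations are represented by $\bar{x}$ and $n$, respectively.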
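The cumulative probability for nonzero rainfall is given by Equation (5).
$$G(x) = \int_0^x g(x)\,dx = \frac{1}{\hat{\beta}^{\hat{\alpha}}\,\Gamma(\hat{\alpha})} \int_0^x x^{\hat{\alpha}-1} e^{-x/\hat{\beta}}\,dx \tag{5}$$
where $\hat{\alpha}$ and $\hat{\beta}$ are the estimated shape and scale parameters.
The gamma distribution is undefined for x = 0, yet a rainfall time series may include zero precipitation. The cumulative probability is therefore calculated for zero and nonzero precipitation together, and Equation (6) is used to determine H(x).
$$H(x) = q + (1 - q)\,G(x) \tag{6}$$
where $q$ denotes the probability of no precipitation, $m$ is the number of zeros that occur in a rainfall time series of $n$ observations, and $q$ is calculated as $m/n$.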
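Once the cumulative probability H(x) is transformed to the standard normal distribution (Sönmez et al. 2005), as shown in Equations (7) and (8), the SPI has a mean of zero and a variance of one.
$$SPI = -\left(t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}\right), \quad t = \sqrt{\ln\frac{1}{H(x)^2}}, \quad 0 < H(x) \le 0.5 \tag{7}$$
$$SPI = +\left(t - \frac{c_0 + c_1 t + c_2 t^2}{1 + d_1 t + d_2 t^2 + d_3 t^3}\right), \quad t = \sqrt{\ln\frac{1}{\left(1 - H(x)\right)^2}}, \quad 0.5 < H(x) < 1 \tag{8}$$
where $c_0 = 2.515517$, $c_1 = 0.802853$, $c_2 = 0.010328$, $d_1 = 1.432788$, $d_2 = 0.189269$, and $d_3 = 0.001308$ are the constants of the standard rational approximation to the inverse normal distribution.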
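The SPI procedure described above (aggregate precipitation, fit a gamma distribution, mix in the probability of zero rainfall, map to a standard normal score) can be sketched in Python using only the standard library. This is an illustrative sketch under several assumptions, not the authors' implementation: the function names and the sample data are hypothetical, a single gamma distribution is fitted to all accumulated totals rather than one per calendar month, the exact inverse normal CDF from `statistics.NormalDist` replaces a rational approximation, and the gamma CDF is evaluated with a truncated power series.

```python
import math
from statistics import NormalDist


def rolling_sum(values, window):
    """k-month accumulated precipitation; the first window-1 months have no total."""
    return [sum(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]


def fit_gamma(positive):
    """Thom's approximate maximum-likelihood estimates (shape, scale).

    Assumes the values are positive and not all identical (otherwise A = 0).
    """
    mean = sum(positive) / len(positive)
    a_stat = math.log(mean) - sum(math.log(x) for x in positive) / len(positive)
    shape = (1.0 + math.sqrt(1.0 + 4.0 * a_stat / 3.0)) / (4.0 * a_stat)
    return shape, mean / shape


def gamma_cdf(x, shape, scale, terms=500):
    """Gamma CDF via the power series of the regularized lower incomplete gamma function."""
    t = x / scale
    if t <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    for k in range(1, terms):
        term *= t / (shape + k)
        total += term
    log_prefactor = shape * math.log(t) - t - math.lgamma(shape)
    return min(1.0, total * math.exp(log_prefactor))


def spi(precip, window=3):
    """SPI for each accumulated total, using H(x) = q + (1 - q) G(x)."""
    totals = rolling_sum(precip, window)
    positive = [x for x in totals if x > 0.0]
    q = 1.0 - len(positive) / len(totals)      # probability of a zero total
    shape, scale = fit_gamma(positive)
    normal = NormalDist()
    scores = []
    for x in totals:
        h = q + (1.0 - q) * gamma_cdf(x, shape, scale)
        h = min(max(h, 1e-6), 1.0 - 1e-6)      # keep the inverse CDF finite
        scores.append(normal.inv_cdf(h))
    return scores


# Hypothetical monthly precipitation values, for illustration only.
precip = [10.0, 0.0, 5.0, 30.0, 45.0, 60.0, 20.0, 0.0, 0.0, 15.0, 80.0, 40.0] * 5
spi3 = spi(precip, window=3)
print(len(spi3), min(spi3), max(spi3))
```

Changing `window` to 6, 9, or 12 gives the longer timescales; larger accumulated totals always map to larger SPI values because both the gamma CDF and the inverse normal CDF are increasing.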

The SPI classification is presented in Table 2.

Table 2

Classification of drought and wetness using SPI

SPI values | Category
+2 and higher | Extremely wet
1.5 to 1.99 | Very wet
1 to 1.49 | Moderately wet
−0.99 to 0.99 | Nearly normal
−1.49 to −1 | Moderate drought
−1.99 to −1.5 | Severe drought
−2 and lower | Extreme drought
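The classification in Table 2 can be expressed as a small lookup function. The sketch below is illustrative; the name `classify_spi` is hypothetical, and values that fall between the two-decimal class limits printed in the table (for example 1.495) are assigned by treating each lower limit as inclusive for wet classes and each upper limit as inclusive for drought classes, which is an assumption.

```python
def classify_spi(value):
    """Map an SPI value to the wetness or drought category of Table 2."""
    if value >= 2.0:
        return "Extremely wet"
    if value >= 1.5:
        return "Very wet"
    if value >= 1.0:
        return "Moderately wet"
    if value > -1.0:
        return "Nearly normal"
    if value > -1.5:
        return "Moderate drought"
    if value > -2.0:
        return "Severe drought"
    return "Extreme drought"


for sample in (2.3, 0.4, -1.0, -1.7, -2.0):
    print(sample, classify_spi(sample))
```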

ARIMA model

Linear stochastic models, also referred to as time series models, have been frequently applied in hydrological and meteorological drought forecasting over the past decades because they provide a systematic empirical approach for analyzing and predicting time series. One of the most commonly used methods in this category is the ARIMA model, introduced by Box & Jenkins (1976), which addresses nonstationarity in historical time series records through differencing. It combines two simpler models: autoregressive (AR) and MA. The general nonseasonal ARIMA model involves an AR order of p, an MA order of q, and d differences of the original time series $x_t$. This model is denoted as ARIMA(p,d,q) and can be represented as follows:
$$\phi(B)\,(1 - B)^d x_t = \theta(B)\,\varepsilon_t \tag{9}$$
where $\phi(B)$ and $\theta(B)$ are polynomials of order p and q, respectively, $B$ is the backshift operator, and $\varepsilon_t$ is the white-noise error term. The operators $\phi(B)$ and $\theta(B)$ are defined as follows:
$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{10}$$
$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \tag{11}$$
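As a worked illustration of the operator notation, consider the ARIMA(1,1,1) case. The expansion below assumes the Box–Jenkins sign convention, in which both polynomials are written with minus signs, and uses $x_t$ for the series, $B$ for the backshift operator, and $\varepsilon_t$ for the white-noise term; these symbols are assumptions where the text does not fix them.

```latex
\begin{align*}
(1 - \phi_1 B)(1 - B)\,x_t &= (1 - \theta_1 B)\,\varepsilon_t \\
\bigl(1 - (1 + \phi_1) B + \phi_1 B^2\bigr)\,x_t &= \varepsilon_t - \theta_1 \varepsilon_{t-1} \\
x_t &= (1 + \phi_1)\,x_{t-1} - \phi_1\,x_{t-2} + \varepsilon_t - \theta_1\,\varepsilon_{t-1}
\end{align*}
```

The last line shows that one order of differencing turns the AR(1) structure into a dependence on the two most recent observations, while the MA(1) term carries the previous forecast error.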

SARIMA model

Box et al. (1994) extended the ARIMA model to the SARIMA model to deal with nonstationary time series with seasonal characteristics, because many time series in hydrology have periodic properties. Generally, the SARIMA model is denoted as ARIMA$(p,d,q)(P,D,Q)_s$, where $(p,d,q)$ describes the nonseasonal part of the original series and $(P,D,Q)_s$ describes the seasonal part. It can be written as follows:
$$\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d (1 - B^s)^D x_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t \tag{12}$$
where p is the order of the nonseasonal AR model, q is the order of the nonseasonal MA model, d is the number of regular differences, P is the order of the seasonal AR model, D is the number of seasonal differences, Q is the order of the seasonal MA model, and s is the length of the season.

Both ARIMA and SARIMA are time series forecasting models that predict future values from past observations. The main difference between them is that SARIMA includes seasonal components to handle time series with patterns that recur at regular intervals. If a time series exhibits a seasonal pattern, SARIMA may provide more accurate forecasts than ARIMA, which is better suited to nonseasonal data. The choice between the two depends on the characteristics of the time series and the patterns to be captured.
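The regular and seasonal differencing that distinguish the two models can be illustrated with a short Python sketch. The helper `difference` and the toy series are hypothetical; a season length of 4 is used only to keep the example small (monthly data would typically use 12).

```python
def difference(series, lag=1):
    """Apply the operator (1 - B^lag): subtract the value observed `lag` steps earlier."""
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be at least 1 and smaller than the series length")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Toy series: a linear trend plus a pattern that repeats every 4 steps.
pattern = [3.0, -1.0, 0.5, -2.5]
series = [0.5 * t + pattern[t % 4] for t in range(16)]

seasonal = difference(series, lag=4)  # removes the repeating pattern, leaving the trend step
both = difference(seasonal, lag=1)    # a further regular difference removes the remaining level

print(seasonal)
print(both)
```

Seasonal differencing shortens the series by the season length and regular differencing by one observation, which is worth keeping in mind when only a few seasons of data are available.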

Wavelet decomposition

The wavelet transform is a mathematical tool that converts the original signal, primarily in the time domain, into other domains for analysis and processing (Soltani 2002; Moosavi et al. 2013). It is widely used for handling nonstationary datasets, including hydrological and climatological records, in which the mean and autocorrelation of the signal vary over time.

The wavelet transform is a powerful method in time series forecasting that researchers have applied in many fields, including drought forecasting, in combination with other statistical and machine learning methods. Time series may be linear or nonlinear; ARIMA is typically applied to the former and ANN to the latter. Khan et al. (2020) showed that ARIMA and ANN applied alone have low precision in drought forecasting, while the combined ARIMA-ANN approach improved the accuracy of the model. They developed a new model based on the discrete wavelet transform (DWT) and ARIMA-ANN, known as W-ARIMA-ANN (W-2A), which improved the forecasting of meteorological DIs compared with both the single and the hybrid ARIMA-ANN models. Wavelet transforms fall into two categories, the continuous wavelet transform and the DWT. The former is rarely used in time series forecasting because of its computational complexity and significant time requirement, while the DWT is frequently used in prediction because it simplifies the numerical solution (Shabri 2014). The following equation represents the DWT.
$$\psi_{m,n}(t) = \frac{1}{\sqrt{s_0^{m}}}\,\psi\!\left(\frac{t - n\,\tau_0\,s_0^{m}}{s_0^{m}}\right) \tag{13}$$
where $\psi$ denotes the mother wavelet, and n and m are integers that control the time and scale, respectively. The most popular choices for the parameters are $s_0 = 2$ and $\tau_0 = 1$. Based on Mallat's theory, the inverse DWT can be used to decompose the original discrete time series into a series of linearly independent approximation and detail signals (Mallat 1989). The inverse DWT is defined by Mallat as the following equation:
$$x(t) = T + \sum_{m=1}^{M} \sum_{n=0}^{2^{M-m}-1} W_{m,n}\, 2^{-m/2}\, \psi\!\left(2^{-m} t - n\right) \tag{14}$$

Here $W_{m,n}$ is the wavelet coefficient for the discrete wavelet at scale $s = 2^m$ and location $\tau = 2^m n$, and $T$ is the approximation component at the coarsest level $M$.

As mentioned earlier, the aim of this research article is to enhance the accuracy of the ARIMA model for drought forecasting using the wavelet transform as a data preprocessing step and to introduce the W-ARIMA model, a hybrid approach that combines the wavelet and ARIMA models. In this approach, the original SPI series is decomposed into sub-time series, which are then used as inputs to the ARIMA model to enhance forecasting precision. The number of decomposition levels can be obtained using the following formula (Dawson et al. 2007):
$$L = \operatorname{int}\left[\log(N)\right] \tag{15}$$
where L indicates the decomposition level and N represents the number of SPI data points. In this study, N = 600. According to this formula, the original SPI data can be decomposed into several component levels (A, D1, D2, ..., DL−1), each representing different frequency components of the original data. These components are unique and exert distinct effects on the original data. To enhance the forecast performance of the hybrid model, researchers incorporate an appropriate D component into the model instead of utilizing D components independently. A schematic diagram of the wavelet transform is displayed in Figure 2, which shows the decomposed time series, the approximation component of each layer, and the detail components of all layers. The sum of the detail components of all layers and the approximation component of the last layer constitutes the decomposed time series, which is represented mathematically in the following equation:
$$x(t) = A_n(t) + \sum_{i=1}^{n} D_i(t) \tag{16}$$
where n is the number of decomposition layers.
Figure 1

Location map of Kabul province and Kabul River Basin with the meteorological stations.

Figure 2

Schematic diagram of wavelet transform.

Because Figure 2 displays a three-level wavelet transform of the signal, Equation (16) reduces to Equation (17).
$$x(t) = A_3(t) + D_1(t) + D_2(t) + D_3(t) \tag{17}$$

We used the DWT in this study because the continuous wavelet transform requires more data and generates more information than is suitable for this study (Che & Zhai 2022).
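The additive structure of a three-level decomposition, in which the last approximation plus all detail components reproduces the original signal, can be checked numerically. The sketch below uses the Haar wavelet because its multiresolution components reduce to block averages and can be written in a few lines of plain Python; this is a swapped-in simplification for illustration, not the Daubechies wavelet or the MATLAB routine used in the study, and the function name and sample signal are hypothetical.

```python
def haar_multiresolution(series, levels):
    """Return (approximation, [D1, ..., D_levels]), each as long as `series`.

    The level-j Haar approximation is the mean of the signal over consecutive
    blocks of length 2**j; the detail D_j is the difference between the
    approximations at levels j-1 and j.
    """
    n = len(series)
    if n % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    details = []
    previous = list(series)  # the level-0 approximation is the signal itself
    for level in range(1, levels + 1):
        block = 2 ** level
        current = []
        for start in range(0, n, block):
            mean = sum(series[start:start + block]) / block
            current.extend([mean] * block)
        details.append([p - c for p, c in zip(previous, current)])
        previous = current
    return previous, details


signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # hypothetical values
approx, details = haar_multiresolution(signal, levels=3)
rebuilt = [approx[t] + sum(d[t] for d in details) for t in range(len(signal))]

print(approx)
print(rebuilt)
```

A series of 600 values satisfies the divisibility requirement for three levels, since 600 is a multiple of 8. Other wavelet families need boundary handling that this block-average shortcut does not cover.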

Hybrid W-ARIMA model

The hybrid W-ARIMA model combines the DWT and ARIMA techniques for time series analysis. It decomposes the original time series using the DWT to capture both high-frequency and low-frequency patterns. ARIMA models are then applied separately to each decomposed component to forecast future values. The forecasts from the ARIMA models are inverse transformed using wavelets to reconstruct the final hybrid forecast for the original time series, thereby leveraging the strengths of both methods to enhance accuracy in modeling and forecasting complex time series data. The flow chart of this method is shown in Figure 3. The suggested hybrid model consists of the following three stages:
  • 1.
    In the first stage, the original SPI values are decomposed using the DWT in MATLAB. This transformation separates the data into approximation and detail components. Several mother wavelets have been suggested, including Daubechies, Symlet, Meyer, and Morlet; the appropriate choice depends on the characteristics of the data (Benaouda et al. 2006; Nury et al. 2017). In this study, the Daubechies wavelet of order 2 with a decomposition level of 3 is used. So, we have
    $$SPI(t) = A_n(t) + \sum_{i=1}^{n} D_i(t) \tag{18}$$
    where n is the number of decomposition layers, while Dn and An indicate the detail and approximation components of each layer, respectively. To keep the temporal scale consistent with the original data, the approximation component of the last layer (An) and the detail components of each layer (D1, D2, ..., Dn) should be reconstructed.
  • 2.

    In the second stage, the best ARIMA model is fitted to each decomposed layer of every SPI series. To obtain the appropriate ARIMA model, all the iterative steps explained in Section 3.1 are followed to build a specific model for each decomposed component.

  • 3.
    Finally, the signal is reconstructed from these decomposed and extended signals on different scales with the help of the following equation. The forecasted value of W-ARIMA is obtained by summing the subseries predictions of each decomposed layer for each SPI.
    $$\widehat{SPI}(t) = \hat{A}_n(t) + \sum_{i=1}^{n} \hat{D}_i(t) \tag{19}$$
    where $\widehat{SPI}(t)$ is the forecasted value of each SPI for the next year ahead.
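The three stages above (decompose, model each component, sum the component forecasts) can be sketched end to end in plain Python. The sketch makes two loudly simplifying substitutions so that it stays self-contained: a Haar block-average decomposition stands in for the Daubechies DWT, and a least-squares AR(1) model stands in for a fully identified ARIMA model. All function names and the sample series are hypothetical, and this is not the authors' MATLAB and R workflow.

```python
import math


def haar_components(series, levels):
    """Additive Haar components [A_L, D_1, ..., D_L], each as long as the series."""
    n = len(series)
    if n % (2 ** levels) != 0:
        raise ValueError("series length must be divisible by 2**levels")
    details, previous = [], list(series)
    for level in range(1, levels + 1):
        block = 2 ** level
        current = []
        for start in range(0, n, block):
            mean = sum(series[start:start + block]) / block
            current.extend([mean] * block)
        details.append([p - c for p, c in zip(previous, current)])
        previous = current
    return [previous] + details


def fit_ar1(series):
    """Least-squares fit of x_t = c + phi * x_{t-1}; returns (c, phi)."""
    if max(series) - min(series) < 1e-12:
        return series[-1], 0.0  # constant component: keep forecasting its level
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - phi * mx, phi


def forecast_ar1(series, steps):
    """Iterate the fitted AR(1) recursion `steps` times from the last observation."""
    c, phi = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


def hybrid_forecast(series, levels=3, steps=12):
    """Stage 1: decompose. Stage 2: model each component. Stage 3: sum the forecasts."""
    components = haar_components(series, levels)
    per_component = [forecast_ar1(component, steps) for component in components]
    return [sum(f[h] for f in per_component) for h in range(steps)]


# Hypothetical SPI-like series: an annual cycle plus a slow drift (48 months).
example = [math.sin(2.0 * math.pi * t / 12.0) + 0.01 * t for t in range(48)]
print(hybrid_forecast(example, levels=3, steps=12))
```

The design point the sketch preserves is that each component receives its own model, so smooth low-frequency behaviour and rapid fluctuations are fitted separately before being recombined by simple addition.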
Figure 3

W-ARIMA framework.


Evaluation metrics

In this study, we employed several statistical criteria to gauge the forecasting performance and reliability of the proposed model and to verify its accuracy. The RMSE represents the standard deviation of the forecast errors; MAE provides the average absolute difference between actual data and predicted outcomes; mean absolute percentage error (MAPE) indicates how closely forecasted values align with actual quantities in relative terms; R2 indicates the proportion of variability in the observed data accounted for by the model; PBIAS (percent bias) is a hydrological performance metric quantifying the overall bias or systematic deviation of model forecasts from observed data; NSE is a normalized statistic that compares the magnitude of the residual variance (noise) with the variance of the observed data (information); and the Kling–Gupta efficiency (KGE) evaluates the level of agreement between the simulated values of a model and the observed data. The equations for these performance evaluation metrics are provided in Equations (20)–(26), with further information available in the literature (Moriasi et al. 2007; Tongal 2013; Buyukyildiz et al. 2014; Koycegiz & Buyukyildiz 2019, 2022, 2023; Wu et al. 2021).
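$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2} \tag{20}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - \hat{Y}_i\right| \tag{21}$$
$$MAPE = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{Y_i - \hat{Y}_i}{Y_i}\right| \tag{22}$$
$$R^2 = \left[\frac{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)\left(\hat{Y}_i - \bar{\hat{Y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(\hat{Y}_i - \bar{\hat{Y}}\right)^2}}\right]^2 \tag{23}$$
$$PBIAS = 100 \times \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)}{\sum_{i=1}^{n} Y_i} \tag{24}$$
$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2} \tag{25}$$
$$KGE = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2} \tag{26}$$
where n represents the number of data points, $Y_i$ indicates the observed data, $\bar{Y}$ is the mean of the observed data, $\hat{Y}_i$ denotes the predicted data, and $\bar{\hat{Y}}$ is the mean of the predicted data. In the last equation, r represents the Pearson correlation coefficient between the simulated and observed values, α stands for the ratio of the standard deviation of the simulated data to that of the observed data, and β signifies the ratio of the mean of the simulated data to that of the observed data. Based on these criteria, better model performance corresponds to smaller RMSE, MAE, and MAPE and a larger R2. KGE values vary within the range of −∞ to 1, where higher values signify improved model performance. A KGE value of 1 denotes complete agreement between simulated and observed data, while lower values suggest less precise model performance. Table 3 (Koycegiz & Buyukyildiz 2019) presents the performance ratings of certain statistical indices that have been utilized.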
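The seven evaluation metrics can be written compactly in Python. The sketch below is illustrative and rests on stated assumptions: R2 is taken as the squared Pearson correlation, PBIAS uses the observed-minus-predicted sign convention (positive values then indicate underestimation), and the sample values are hypothetical. Function names are placeholders rather than any library's API.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def mae(obs, pred):
    return _mean([abs(o - p) for o, p in zip(obs, pred)])


def mape(obs, pred):
    # Undefined when any observation equals zero.
    return 100.0 * _mean([abs((o - p) / o) for o, p in zip(obs, pred)])


def pearson_r(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def r_squared(obs, pred):
    return pearson_r(obs, pred) ** 2


def pbias(obs, pred):
    return 100.0 * sum(o - p for o, p in zip(obs, pred)) / sum(obs)


def nse(obs, pred):
    mo = _mean(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - residual / sum((o - mo) ** 2 for o in obs)


def kge(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    so = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    sp = math.sqrt(_mean([(p - mp) ** 2 for p in pred]))
    r = pearson_r(obs, pred)
    alpha, beta = sp / so, mp / mo
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


observed = [1.0, 2.0, 3.0, 4.0]    # hypothetical values
predicted = [2.0, 3.0, 4.0, 5.0]   # the same series shifted up by one unit
for metric in (rmse, mae, mape, r_squared, pbias, nse, kge):
    print(metric.__name__, metric(observed, predicted))
```

The shifted example shows why several metrics are reported together: the correlation-based R2 is perfect even though every forecast is off by one unit, whereas RMSE, PBIAS, NSE, and KGE all register the bias. Because MAPE and the β term of KGE divide by observed values or their mean, both can become unstable for series such as SPI that are centred near zero.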
Table 3

General performance ratings

R2 | PBIAS (%) | NSE | Performance rating
   Very good (VG) 
   Good (G) 
   Satisfactory (S) 
   Unsatisfactory (U) 
   Inappropriate (I) 
In this study, the W-ARIMA model is proposed to enhance the accuracy of the ARIMA model in predicting drought events in the Kabul province based on SPI. The dataset is divided into two periods: a training phase from 1970 to 2009 and validation data from 2010 to 2019. The model development comprises three major stages: model identification, estimation of unknown parameters, and diagnostic checking. In this study, the original SPI values are decomposed using MATLAB, while R software is employed for model development and predicting future drought occurrences. To classify drought events into short-term and long-term categories, different SPI values are computed considering total precipitation for running periods of 3, 6, 9, and 12 months. SPI 3 indicates short-term drought, SPI 6 and SPI 9 illustrate mid-term drought events, while long-term drought can be represented by SPI 12. Different SPI values are plotted in Figure 4 based on historical records.
Figure 4

SPI time series based on the average precipitation over the Kabul province, Afghanistan.

ARIMA model

As mentioned earlier, to arrive at a suitable forecasting ARIMA model, we need to follow three iterative steps, which are described in the following sections.

Model identification

The stationarity and seasonality of the time series are assessed at this phase, before parameter estimation. Autocorrelation function (ACF) and partial autocorrelation function (PACF) plots can be used to evaluate stationarity. If the time series is nonstationary, differencing can be applied to obtain stationary data. The ACF and PACF plots are also used to determine the order of the ARIMA model. The best-fitted model is chosen using two goodness-of-fit criteria, the Akaike information criterion (AIC) and the Schwarz–Bayesian criterion (SBC). The model with the lowest AIC and SBC values is selected as the best (Akaike 1974; Schwarz 1978; Yeh & Hsu 2019). The mathematical formulation of these criteria is as follows:
$$\mathrm{AIC} = -2\ln(L) + 2k \tag{27}$$
$$\mathrm{SBC} = -2\ln(L) + k\ln(n) \tag{28}$$
where k is the number of parameters in the model, L is the likelihood function of the ARIMA model, and n is the number of data points used to fit the model. The goal of model identification is to find a subclass of the ARIMA model that accurately fits the time series data. The first step is to assess the stationarity of the records, for which the augmented Dickey–Fuller test was employed. The outcome confirmed the stationarity of the data and indicated no need for differencing. The results are reported in Table 4.
Table 4

Stationarity assessment: augmented Dickey–Fuller test for SPI series

|  | SPI 3 | SPI 6 | SPI 9 | SPI 12 |
|---|---|---|---|---|
| ADF statistic | −7.1557 | −5.8933 | −7.7892 | −5.8192 |
| p-Value | 0.01 | 0.01 | 0.01 | 0.01 |
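As an illustration of the AIC and SBC selection rule described above, the following sketch ranks a few candidate models. The candidate labels, log-likelihoods, parameter counts, and sample size are hypothetical values chosen for the example; they are not taken from the study.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: -2 ln(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k

def sbc(log_likelihood, k, n):
    """Schwarz-Bayesian criterion: -2 ln(L) + k ln(n)."""
    return -2.0 * log_likelihood + k * math.log(n)

# Hypothetical candidates: (label, maximized log-likelihood, number of parameters)
candidates = [
    ("ARIMA(1,0,0)", -530.0, 2),
    ("ARIMA(1,0,1)", -520.0, 3),
    ("ARIMA(2,0,2)", -519.5, 5),
]
n_obs = 480  # hypothetical sample size (40 years of monthly values)

# The preferred model is the one with the smallest criterion value.
best_by_aic = min(candidates, key=lambda c: aic(c[1], c[2]))[0]
best_by_sbc = min(candidates, key=lambda c: sbc(c[1], c[2], n_obs))[0]
```

SBC penalizes each extra parameter by ln(n) rather than 2, so for series of this length it tends to favor more parsimonious models than AIC when the two criteria disagree.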
Figure 5 shows the ACF and PACF plots for the original time series of SPI 3 for illustration.
Figure 5

ACF and PACF plot for model selection of SPI 3.

Multiple models have been fitted to various SPI timescales. Table 5 presents a summary of the best ARIMA models based on the AIC and SBC criteria. The best model for each original SPI can be determined by selecting the model with the lowest AIC and SBC values.

Table 5

Summary of best selected ARIMA model for each SPI series based on AIC and SBC criteria

| SPI series | Model | AIC | SBC |
|---|---|---|---|
| SPI 3 | ARIMA(1,0,2)(1,0,0)[3] | 1041.37 | 1062.22 |
| SPI 6 | ARIMA(3,0,3)(0,0,1)[6] | 687.65 | 720.95 |
| SPI 9 | ARIMA(1,0,1)(1,0,2)[9] | 442.6 | 467.54 |
| SPI 12 | ARIMA(1,0,0)(1,0,2)[12] | 172.99 | 194.88 |

Parameter estimation

The subsequent step in constructing the best ARIMA model involves estimating the parameters of the model selected in the previous stage. Following the approach outlined by Box & Jenkins (1976), the maximum likelihood method has been employed for parameter estimation. Assuming the data are stationary, the estimates are obtained by maximizing the likelihood with respect to the parameters. Two conditions are generally imposed on the parameters during estimation: the stationarity condition, which requires the roots of the autoregressive polynomial to lie outside the unit circle, and the invertibility condition, which requires the same of the moving-average polynomial. Model parameters, standard errors, t-statistics, and p-values are presented in Table 6. The standard errors are relatively small in comparison to the values of the model parameters, and the majority of p-values are significant. This indicates that these parameters can be retained in the models.

Table 6

Statistical analysis of the model parameters for SPI timescales

| SPI series | Model parameters | Value of parameters | Standard error | t-Statistic | p-Value |
|---|---|---|---|---|---|
| SPI 3 |  | −0.1664 | 0.0686 | −2.4253 | 0.0153 |
|  |  | 0.9304 | 0.0502 | 18.5219 | 2 × 10^−16 |
|  |  | 0.7288 | 0.0458 | 15.8998 | 2 × 10^−16 |
|  |  | 0.1170 | 0.0619 | 1.8897 | 0.0590 |
| SPI 6 |  | −0.6893 | 0.1894 | −3.6387 | 0.0003 |
|  |  | 0.8144 | 0.0661 | 12.3223 | 2.2 × 10^−16 |
|  |  | 0.6398 | 0.1600 | 3.9986 | 6.4 × 10^−5 |
|  |  | 1.6030 | 0.1935 | 8.2849 | 2.2 × 10^−16 |
|  |  | 0.5597 | 0.2276 | 2.4589 | 0.0139 |
|  |  | −0.1123 | 0.0613 | −1.8310 | 0.0671 |
|  |  | −0.5218 | 0.0481 | −10.8487 | 2.2 × 10^−16 |
| SPI 9 |  | 0.9672 | 0.0127 | 76.1960 | 2.2 × 10^−16 |
|  |  | −0.0763 | 0.0485 | −1.5719 | 0.1160 |
|  |  | 0.7301 | 0.2428 | 3.0072 | 0.0026 |
|  |  | −1.3686 | 0.2535 | −5.3986 | 6.7 × 10^−8 |
|  |  | 0.4442 | 0.1820 | 2.4407 | 0.0146 |
| SPI 12 |  | 0.9826 | 0.0079 | 124.2860 | 2.2 × 10^−16 |
|  |  | −0.7153 | 0.2769 | −2.5829 | 0.0098 |
|  |  | −0.0713 | 0.2885 | −0.2471 | 0.8049 |
|  |  | −0.5015 | 0.2257 | −2.2223 | 0.0263 |
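The t-statistics and p-values in Table 6 follow from the estimates and their standard errors. A minimal sketch of that calculation is given below; the function name `wald_test` is hypothetical, and the use of a standard normal reference distribution for the two-sided p-value is an assumption (the paper does not state which distribution was used), although it appears to reproduce the tabulated values to within rounding.

```python
from statistics import NormalDist

def wald_test(estimate, std_error):
    """Return the test statistic and two-sided p-value for H0: parameter = 0."""
    z = estimate / std_error
    # Assumption: the statistic is compared against a standard normal distribution.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# First SPI 3 row of Table 6: value -0.1664, standard error 0.0686
z_spi3, p_spi3 = wald_test(-0.1664, 0.0686)

# Third SPI 12 row of Table 6: value -0.0713, standard error 0.2885
z_spi12, p_spi12 = wald_test(-0.0713, 0.2885)
```

A parameter whose p-value is well above the chosen significance level, such as the third SPI 12 coefficient, contributes little to the fit and is a candidate for removal in a more parsimonious specification.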

Diagnostic checking

The third step in fitting an appropriate ARIMA model involves diagnostic checking, which entails examining whether the selected models from the previous stages genuinely satisfy the fundamental assumptions of time series. If the residuals of the obtained model are uncorrelated random variables with a mean of zero and constant variance, then the estimated model is suitable and confidently meets the basic assumptions, such as independence, normality, and homoscedasticity of the residuals. Alongside the residual ACF and residual PACF, the Ljung–Box Q (LBQ) statistic is employed to assess the overall adequacy of the model. The test statistic Q is expressed as follows:
$$Q = n(n+2)\sum_{k=1}^{m}\frac{r_k^{2}}{n-k} \tag{29}$$
where n is the number of residuals, m represents the number of time lags, and r_k indicates the residual autocorrelation at lag k. The fitted models can be used to predict each SPI value if the tests and plots support the hypothesis that the residuals are uncorrelated.
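The Ljung–Box statistic can be computed directly from the residual autocorrelations. The sketch below is illustrative only: the residual values are hypothetical, and the p-value helper covers just the single-lag case (one degree of freedom), which is an assumption made to keep the example free of external libraries.

```python
import math

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return ck / c0

def ljung_box_q(residuals, m):
    """Ljung-Box Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorr(residuals, k) ** 2 / (n - k) for k in range(1, m + 1)
    )

def chi2_sf_one_df(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))

resid = [0.3, -0.1, 0.2, -0.4, 0.1, 0.0, -0.2, 0.1]  # hypothetical residuals
q1 = ljung_box_q(resid, 1)
p1 = chi2_sf_one_df(q1)  # large p-value: no evidence of lag-1 autocorrelation
```

For tests over several lags, the reference distribution has more degrees of freedom (commonly m minus the number of fitted ARMA parameters), and a general chi-square routine would be needed in place of `chi2_sf_one_df`.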
In this study, both the correlogram and the LBQ test were employed to examine the independence of residuals. Figure 6 demonstrates the independence of residuals for SPI 3, as the majority of the residual ACF (RACF) and residual PACF (RPACF) values fall within the confidence limits. This indicates that there is no substantial evidence of correlation among the residuals.
Figure 6

The ACF and PACF of residuals for SPI 3.

Furthermore, the LBQ test was utilized to ascertain the independence of residuals. The outcomes of this test, as shown in Table 7, indicate that the residuals for each SPI value across various timescales were uncorrelated and exhibited the properties of a white noise process.

Table 7

LBQ statistics for SPI series

| SPI series | SPI 3 | SPI 6 | SPI 9 | SPI 12 |
|---|---|---|---|---|
| χ2 | 0.0042 | 0.0034 | 0.0137 | 1.2907 |
| p-Value | 0.9486 | 0.9538 | 0.9069 | 0.2559 |

The histogram and normal probability plot are employed to assess the normality assumption of residuals. Figure 7 illustrates the histogram and normal probability test for SPI 3 as an example. It is evident that the residuals are distributed around zero and exhibit a relatively normal distribution (Mahmud et al. 2016; Rahman et al. 2017). In addition, since the residuals align closely with the diagonal line on the normal probability plot, the normality assumption is fulfilled (Mahmud et al. 2016; Widowati et al. 2016).
Figure 7

The histogram (left column) and normal probability plot (right column) of residuals for SPI 3.

A scatterplot of the residuals against the predicted values was generated to verify the homoscedasticity of the residuals and assess whether the fitted models maintain consistent predictive ability for variable values. Figure 8 portrays the scatterplots of residuals for SPI 3 against fitted values as an illustrative example. The findings indicate that these scatterplots for each SPI series lack any discernible pattern, implying that the residuals are randomly dispersed. Consequently, based on the diagnostic examination mentioned earlier, the residuals exhibit the characteristics of white noise: they are uncorrelated, normally distributed, randomly scattered, and possess constant variances. Thus, it can be concluded that the selected models are suitable for the respective SPI timescales.
Figure 8

The scatterplot of the residuals against predicted values for SPI 3.

To evaluate the validity of the fitted models, we employ monthly data covering the period from 2010 to 2019. Figure 9 provides a visual comparison between observed data and predicted values for each SPI series, using the optimal SARIMA model. Table 9 in Section 3.2 presents the statistical performance metrics for both the ARIMA/SARIMA model and the W-ARIMA model for comparative analysis.
Table 8

Summary of best selected ARIMA/SARIMA model for each decomposed SPI series based on AIC and SBC criteria

| SPI series | Component | Model | AIC | SBC |
|---|---|---|---|---|
| SPI 3 | A3 | ARIMA(2,0,0)(1,0,1)[3] | −390.7 | −370.22 |
|  | D3 | ARIMA(2,0,0)(2,0,2)[3] | −430.01 | −401.4 |
|  | D2 | ARIMA(1,0,0)(1,0,0)[3] | −286.61 | −270.14 |
|  | D1 | ARIMA(2,0,0)(2,0,0)[3] | −471.1 | −450.47 |
| SPI 6 | A3 | ARIMA(2,0,2)(2,0,2)[6] | −397.54 | −360.84 |
|  | D3 | ARIMA(5,0,0)(1,0,2)[6] | −512.96 | −476.26 |
|  | D2 | ARIMA(5,0,0) | −717.34 | −692.5 |
|  | D1 | ARIMA(0,0,3)(2,0,1)[6] | −799.14 | −765.92 |
| SPI 9 | A3 | ARIMA(2,0,2)(1,0,0)[9] | −374.35 | −345.75 |
|  | D3 | ARIMA(4,0,0)(1,0,0)[9] | −429.45 | −404.9 |
|  | D2 | ARIMA(5,0,0) | −717.16 | −692.5 |
|  | D1 | ARIMA(1,0,0)(1,0,0)[9] | −258.23 | −241.71 |
| SPI 12 | A3 | ARIMA(2,0,2)(2,0,0)[12] | −408.84 | −376.19 |
|  | D3 | ARIMA(4,0,0)(1,0,0)[12] | −485.18 | −460.63 |
|  | D2 | ARIMA(4,0,0)(1,0,0)[12] | −728.84 | −704.18 |
|  | D1 | ARIMA(1,0,0)(0,0,2)[12] | −262.85 | −242.22 |
Figure 9

Comparison of observed values with predicted values using the best SARIMA model for each SPI.

Proposed W-ARIMA model

To enhance forecasting accuracy, a combined model named the W-ARIMA model, integrating wavelet and ARIMA, is introduced to address the limitations of standalone ARIMA models when dealing with nonstationary data. In this approach, the DWT is applied to decompose the SPI series, generating suitable approximate and detailed components. Inverse wavelet transform is subsequently used to reconstruct the decomposed series. Optimal ARIMA/SARIMA models are fitted to each of the decomposed elements, and the W-ARIMA predictions are obtained by summing the forecasted values from all the decomposed components.

Each SPI series undergoes a three-level decomposition utilizing the DWT with the application of the Daubechies function of order 2 (db2). This decomposition results in the extraction of components including an approximate representation (A3) and various detail components (D3, D2, D1). Figure 10 illustrates the decomposition of SPI 3 into approximate and detailed components, provided as an example.
Figure 10

Wavelet decomposition of SPI 3.

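The three-level db2 decomposition into A3, D3, D2, and D1 can be sketched without external libraries as follows. This is an illustrative implementation under stated assumptions, not the MATLAB routine used in the study: it uses periodic boundary extension (MATLAB's default extension mode differs), requires the series length to be a multiple of 8, and all function names and the sample series are hypothetical.

```python
import math

_S3 = math.sqrt(3.0)
_NORM = 4.0 * math.sqrt(2.0)
# Daubechies order-2 (db2) low-pass filter and its matching high-pass filter
H = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]
G = [H[3], -H[2], H[1], -H[0]]

def analyze(x):
    """One DWT level with periodic extension: return (approximation, detail)."""
    n = len(x)
    a = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    d = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return a, d

def synthesize(a, d):
    """Inverse of analyze (the transform is orthogonal, so this is its transpose)."""
    n = 2 * len(a)
    x = [0.0] * n
    for i in range(len(a)):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * a[i] + G[k] * d[i]
    return x

def rebuild(coeffs, level, is_detail):
    """Reconstruct a full-length component from one set of coefficients."""
    zeros = [0.0] * len(coeffs)
    x = synthesize(zeros, coeffs) if is_detail else synthesize(coeffs, zeros)
    for _ in range(level - 1):
        x = synthesize(x, [0.0] * len(x))
    return x

def decompose(series, levels=3):
    """Return full-length components, e.g. A3, D3, D2, D1 for levels=3."""
    if len(series) % (2 ** levels) != 0:
        raise ValueError("series length must be a multiple of 2**levels")
    approx, details = list(series), []
    for _ in range(levels):
        approx, d = analyze(approx)
        details.append(d)
    parts = {"A%d" % levels: rebuild(approx, levels, False)}
    for j in range(levels, 0, -1):
        parts["D%d" % j] = rebuild(details[j - 1], j, True)
    return parts

# Hypothetical series standing in for an SPI record
spi = [math.sin(0.4 * t) + 0.3 * math.cos(1.7 * t) for t in range(16)]
parts = decompose(spi)
recombined = [sum(p[t] for p in parts.values()) for t in range(len(spi))]
```

Because the reconstructed components add up to the original series, a forecast of the series can be formed by forecasting each component separately and summing the results, which is the aggregation step of the W-ARIMA approach.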

The proposed W-ARIMA forecast is obtained by summing the predicted values of the decomposed layers, each produced by the ARIMA model fitted to the corresponding subseries through the three iterative stages described earlier. Table 8 summarizes the optimal ARIMA/SARIMA models, selected by AIC and SBC, for each decomposed component (A3, D3, D2, and D1) of every SPI timescale. For each component, the model with the lowest AIC and SBC values is selected.

The comparison of predicted values versus actual data in Figures 11–14 supports the efficacy of the proposed model for forecasting SPI series in the context of drought. However, a more rigorous evaluation of forecasting accuracy requires statistical measures. To evaluate the experimental performance, a variety of evaluation metrics, including RMSE, MAPE, MAE, R2, NSE, and KGE, are used in this study. The efficiency of the ARIMA model is compared with that of the W-ARIMA model, and the forecasting results for both ARIMA and W-ARIMA, looking 1 year (12 months) ahead for each SPI series, are presented in Table 9 and Figure 15, respectively.
Table 9

Comparative accuracy analysis of ARIMA and W-ARIMA models for each SPI

| Evaluation metric | ARIMA SPI 3 | ARIMA SPI 6 | ARIMA SPI 9 | ARIMA SPI 12 | W-ARIMA SPI 3 | W-ARIMA SPI 6 | W-ARIMA SPI 9 | W-ARIMA SPI 12 |
|---|---|---|---|---|---|---|---|---|
| RMSE | 0.6380 | 0.4275 | 0.3487 | 0.2474 | 0.5212 | 0.2148 | 0.2160 | 0.1466 |
| MAE | 0.4734 | 0.3160 | 0.2529 | 0.1831 | 0.4134 | 0.1688 | 0.1555 | 0.1040 |
| MAPE | 122.7013 | 184.5775 | 83.0959 | 55.6354 | 118.145 | 75.8269 | 71.7270 | 32.9000 |
| R2 | 0.5732 | 0.8066 | 0.8721 | 0.9414 | 0.7155 | 0.9507 | 0.9508 | 0.9782 |
| PBias | 0.2699 | −0.1087 | −0.4325 | −0.0833 | 0.8830 | −0.3487 | −0.8091 | 0.0759 |
| NSE | 0.5723 | 0.8043 | 0.8706 | 0.9379 | 0.7146 | 0.9506 | 0.9504 | 0.9782 |
| KGE | 0.6630 | 0.8190 | 0.8760 | 0.9060 | 0.7580 | 0.9710 | 0.9470 | 0.9810 |
Figure 11

Actual versus predicted values of A3, D3, D2, and D1 for SPI 3 through W-ARIMA.

Figure 12

Actual versus predicted values of A3, D3, D2, and D1 for SPI 6 through W-ARIMA.

Figure 13

Actual versus predicted values of A3, D3, D2, and D1 for SPI 9 through W-ARIMA.

Figure 14

Actual versus predicted values of A3, D3, D2, and D1 for SPI 12 through W-ARIMA.

Figure 15

Comparison of SPI forecasting of ARIMA versus W-ARIMA model for 1 year ahead.


The comparison table (Table 9) assessing the accuracy of drought forecasting between the ARIMA and W-ARIMA models reveals notable insights. Various evaluation metrics, such as R2, RMSE, MAPE, NSE, KGE, and PBIAS, were employed to comprehensively evaluate their performances. In terms of R2, RMSE, MAPE, NSE, and KGE, the W-ARIMA model consistently demonstrated superiority over the ARIMA model, showcasing its enhanced predictive capability and precision. These metrics collectively indicate that the W-ARIMA model outperforms ARIMA across different dimensions of forecasting accuracy. However, it is noteworthy that for the PBIAS criterion, no significant difference between the two models was observed, implying a comparable performance in addressing bias. This comprehensive assessment underscores the favorable attributes of the W-ARIMA approach in improving the precision and reliability of drought forecasting compared to the traditional ARIMA method.
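To make the size of the improvement concrete, the relative reduction in RMSE can be computed from the values reported in Table 9. The helper name `relative_reduction` is hypothetical; the RMSE figures are copied from the table.

```python
# RMSE values from Table 9
arima_rmse = {"SPI 3": 0.6380, "SPI 6": 0.4275, "SPI 9": 0.3487, "SPI 12": 0.2474}
warima_rmse = {"SPI 3": 0.5212, "SPI 6": 0.2148, "SPI 9": 0.2160, "SPI 12": 0.1466}

def relative_reduction(baseline, candidate):
    """Percentage by which candidate is lower than baseline."""
    return 100.0 * (baseline - candidate) / baseline

reductions = {
    name: relative_reduction(arima_rmse[name], warima_rmse[name])
    for name in arima_rmse
}

for name, pct in reductions.items():
    print(f"{name}: RMSE reduced by {pct:.1f}%")
```

On these figures the reduction ranges from roughly 18% for SPI 3 to roughly 50% for SPI 6, so the gain from the wavelet step is smallest at the shortest timescale.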

Figure 16 illustrates a comparative analysis between the ARIMA and W-ARIMA models for different SPI intervals (SPI 3, 6, 9, and 12) in terms of predictive results for a 1-year forecast. The evaluation encompasses several performance metrics. The RMSE, MAE, R2, and NSE metrics demonstrate the W-ARIMA model has the edge in drought forecasting. In all SPI intervals, W-ARIMA consistently produces more accurate forecasts, as evidenced by consistently lower RMSE and MAE values. The W-ARIMA model also exhibits higher R2 values, which indicate a stronger relationship between forecasted and observed SPI values, as well as higher NSE values, which indicate a better fit to the data in general. W-ARIMA is therefore a better model for capturing SPI patterns and making drought forecasts more accurate and reliable than traditional ARIMA.
Figure 16

Evaluation performance indices for prediction results in ARIMA and W-ARIMA models.

In the realm of drought forecasting, our study's hybrid W-ARIMA model tailored to Kabul, Afghanistan, represents a notable advance. Leveraging original data, we apply a statistical, data-driven approach, marking the first academic effort of its kind within Afghanistan. Thorough evaluation, encompassing diverse metrics including RMSE and R2, underscores the hybrid model's pronounced edge over traditional ARIMA. Our work is distinguished by its localized focus on Kabul and introduces sophisticated methodologies for drought prediction in this region. This contribution benefits both academic discourse and practice, promising enhanced resilience strategies for managing drought-related challenges in Kabul and beyond.

In this study, we introduced a novel hybrid W-ARIMA model and applied it to forecast drought occurrences in Kabul, Afghanistan, addressing the critical issue of drought forecasting. Our innovative utilization of data-driven methods for drought prediction in Afghanistan is further enhanced by employing an original dataset. By adapting the well-established W-ARIMA model to Kabul's distinct hydro-climatic conditions, we extend its applicability and contribute to evolving drought prediction techniques in the region.

A central contribution of our research is the comprehensive comparison between the proposed W-ARIMA model and the traditional ARIMA model based on SPI. Through a comprehensive analysis using various performance metrics, including RMSE, MAE, MAPE, R2, NSE, PBIAS, and KGE, the clear superiority of the W-ARIMA approach becomes evident. Across all metrics, except PBIAS, the W-ARIMA model consistently outperforms the traditional ARIMA model, underscoring its potential to enhance drought forecast accuracy.

Initially, we establish the optimal ARIMA/SARIMA model for each SPI series, serving as a benchmark. Subsequently, the DWT (db2) is applied to decompose each SPI series, capturing the essential multiscale information required for accurate forecasting. Suitable ARIMA models are then fitted to the decomposed elements (A3, D3, D2, and D1) using the three iterative stages explained in the methodology. The final W-ARIMA forecast for each SPI series is obtained by summing the predicted values of all decomposed layers.

Ultimately, our study pioneers the application of an advanced forecasting model for drought prediction in Kabul, Afghanistan. The proven effectiveness of the W-ARIMA model, coupled with a comprehensive comparison to the SPI-based ARIMA model, highlights its potential to enhance the precision of drought forecasting. Emphasizing localized approaches to improve forecasting accuracy, our work contributes to the broader field of climate resilience strategies. In comparison to prior research, our study represents a significant advancement in drought forecasting, extending the application of the W-ARIMA model to Kabul's unique context and providing a blueprint for similar regions facing comparable challenges. By incorporating the DWT and the SPI framework, our work showcases the potential of innovative methodologies that bridge traditional hydrological modeling with contemporary data-driven techniques, thus enriching the discourse on climate adaptation strategies.

The first author would like to express gratitude to Afghanistan's Ministry of Higher Education (MoHE) for the scholarship and Kabul Education University (KEU) for the study leave.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Akaike H. 1974 A new look at the statistical model identification. IEEE Transactions on Automatic Control 19 (6). https://doi.org/10.1109/TAC.1974.1100705.
Al Wadia M. T. I. S. & Ismail M. T. 2011 Selecting wavelet transforms model in forecasting financial time series data based on ARIMA model. Applied Mathematical Sciences 5 (7), 315–326.
Belayneh A. & Adamowski J. 2012 Standard precipitation index drought forecasting using neural networks, wavelet neural networks, and support vector regression. Applied Computational Intelligence and Soft Computing 2012, 1–13. https://doi.org/10.1155/2012/794061.
Belayneh A. & Adamowski J. 2013 Drought forecasting using new machine learning methods/Prognozowanie suszy z wykorzystaniem automatycznych samouczących się metod. Journal of Water and Land Development 18 (9), 3–12. https://doi.org/10.2478/jwld-2013-0001.
Belayneh A., Adamowski J., Khalil B. & Ozga-Zielinski B. 2014 Long-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet neural network and wavelet support vector regression models. Journal of Hydrology 508. https://doi.org/10.1016/j.jhydrol.2013.10.052.
Belayneh A., Adamowski J. & Khalil B. 2016 Short-term SPI drought forecasting in the Awash River Basin in Ethiopia using wavelet transforms and machine learning methods. Sustainable Water Resources Management 2 (1). https://doi.org/10.1007/s40899-015-0040-5.
Benaouda D., Murtagh F., Starck J.-L. & Renaud O. 2006 Wavelet-based nonlinear multiscale decomposition model for electricity load forecasting. Neurocomputing 70 (1–3), 139–154. https://doi.org/10.1016/j.neucom.2006.04.005.
Box G. E. & Jenkins G. 1976 Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco.
Box G. E. P., Jenkins G. M. & Reinsel G. C. 1994 Time Series Analysis: Forecasting and Control, 3rd edn. Prentice Hall, Englewood Cliffs, NJ.
Buyukyildiz M., Tezel G. & Yilmaz V. 2014 Estimation of the change in lake water level by artificial intelligence methods. Water Resources Management 28 (13), 4747–4763. https://doi.org/10.1007/s11269-014-0773-1.
Cacciamani C., Morgillo A., Marchesi S. & Pavan V. 2007 Monitoring and forecasting drought on a regional scale: Emilia-Romagna Region. In: Methods and Tools for Drought Analysis and Management. Springer, Netherlands, pp. 29–48. https://doi.org/10.1007/978-1-4020-5924-7_2.
Che J. & Zhai H. 2022 WT-ARIMA combination modelling for short-term load forecasting. IAENG International Journal of Computer Science 49 (2), 542–548.
Danandeh Mehr A., Kahya E. & Özger M. 2014 A gene-wavelet model for long lead time drought forecasting. Journal of Hydrology 517. https://doi.org/10.1016/j.jhydrol.2014.06.012.
Dawson C. W., Abrahart R. J. & See L. M. 2007 Hydrotest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environmental Modelling & Software 22 (7), 1034–1052. https://doi.org/10.1016/j.envsoft.2006.06.008.
Deo R. C., Tiwari M. K., Adamowski J. F. & Quilty J. M. 2017 Forecasting effective drought index using a wavelet extreme learning machine (W-ELM) model. Stochastic Environmental Research and Risk Assessment 31 (5). https://doi.org/10.1007/s00477-016-1265-z.
Djerbouai S. & Souag-Gamane D. 2016 Drought forecasting using neural networks, wavelet neural networks, and stochastic models: case of the Algerois Basin in North Algeria. Water Resources Management 30 (7), 2445–2464. https://doi.org/10.1007/s11269-016-1298-6.
Eslamian S., Ostad-Ali-Akbari K., Singh V. P. & Dalezios N. R. 2017 A review of drought indices. International Journal of Constructive Research in Civil Engineering 3 (4). https://doi.org/10.20431/2454-8693.0304005.
Guttman N. B. 1998 Comparing the palmer drought index and the standardized precipitation index. Journal of the American Water Resources Association 34 (1). https://doi.org/10.1111/j.1752-1688.1998.tb05964.x.
Guttman N. B. 1999 Accepting the standardized precipitation index: a calculation algorithm. JAWRA Journal of the American Water Resources Association 35 (2), 311–322. https://doi.org/10.1111/j.1752-1688.1999.tb03592.x.
Hayes M., Svoboda M., Wall N. & Widhalm M. 2011 The Lincoln declaration on drought indices: universal meteorological drought index recommended. Bulletin of the American Meteorological Society 92 (4). https://doi.org/10.1175/2010BAMS3103.1.
Khan M. M. H., Muhammad N. S. & El-Shafie A. 2020 Wavelet based hybrid ANN-ARIMA models for meteorological drought forecasting. Journal of Hydrology 590, 125380. https://doi.org/10.1016/j.jhydrol.2020.125380.
Kim T.-W. & Valdés J. B. 2003 Nonlinear model for drought forecasting based on a conjunction of wavelet transforms and neural networks. Journal of Hydrologic Engineering 8 (6). https://doi.org/10.1061/(ASCE)1084-0699(2003)8:6(319).
Koycegiz C. & Buyukyildiz M. 2022 Investigation of precipitation and extreme indices spatiotemporal variability in Seyhan Basin, Turkey. Water Supply 22 (12), 8603–8624. https://doi.org/10.2166/ws.2022.391.
Koycegiz C. & Buyukyildiz M. 2023 Investigation of spatiotemporal variability of some precipitation indices in Seyhan Basin, Turkey: monotonic and sub-trend analysis. Natural Hazards 116 (2), 2211–2244. https://doi.org/10.1007/s11069-022-05761-6.
Kriechbaumer T., Angus A., Parsons D. & Rivas Casado M. 2014 An improved wavelet–ARIMA approach for forecasting metal prices. Resources Policy 39, 32–41. https://doi.org/10.1016/j.resourpol.2013.10.005.
Mahmud I., Bari S. H. & Rahman M. T. U. 2016 Monthly rainfall forecast of Bangladesh using autoregressive integrated moving average method. Environmental Engineering Research 22 (2). https://doi.org/10.4491/eer.2016.075.
Mallat S. G. 1989 A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (7), 674–693. https://doi.org/10.1109/34.192463.
Mayar M. A. 2021 Droughts on the Horizon: Can Afghanistan Manage This Risk? Afghanistan-Analysts.Org, Kabul, Afghanistan.
McKee T. B., Doesken N. J. & Kleist J. 1995 Drought monitoring with multiple time scales. In: Ninth Conference on Applied Climatology, American Meteorological Society. January 1995, Dallas, TX. pp. 233–236.
Mishra A. K. & Desai V. R. 2005 Drought forecasting using stochastic models. Stochastic Environmental Research and Risk Assessment 19 (5), 326–339. https://doi.org/10.1007/s00477-005-0238-4.
Mishra A. K. & Singh V. P. 2010 A review of drought concepts. Journal of Hydrology 391 (1–2), 202–216. https://doi.org/10.1016/j.jhydrol.2010.07.012.
Moosavi V., Vafakhah M., Shirmohammadi B. & Behnia N. 2013 A wavelet-ANFIS hybrid model for groundwater level forecasting for different prediction periods. Water Resources Management 27 (5), 1301–1321. https://doi.org/10.1007/s11269-012-0239-2.
Moriasi D. N., Arnold J. G., Van Liew M. W., Bingner R. L., Harmel R. D. & Veith T. L. 2007 Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Transactions of the ASABE 50 (3), 885–900. https://doi.org/10.13031/2013.23153.
Nourani V., Alami M. T. & Aminfar M. H. 2009 A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Engineering Applications of Artificial Intelligence 22 (3), 466–472. https://doi.org/10.1016/j.engappai.2008.09.003.
Nourani V., Hosseini Baghanam A., Adamowski J. & Kisi O. 2014 Applications of hybrid wavelet–artificial intelligence models in hydrology: a review. Journal of Hydrology 514, 358–377. https://doi.org/10.1016/j.jhydrol.2014.03.057.
Ntale H. K. & Gan T. Y. 2003 Drought indices and their application to East Africa. International Journal of Climatology 23 (11), 1335–1357. https://doi.org/10.1002/joc.931.
Nury A. H., Hasan K. & Alam M. J. B. 2017 Comparative study of wavelet-ARIMA and wavelet-ANN models for temperature time series data in northeastern Bangladesh. Journal of King Saud University – Science 29 (1), 47–61. https://doi.org/10.1016/j.jksus.2015.12.002.
Partal T. & Kişi Ö. 2007 Wavelet and neuro-fuzzy conjunction model for precipitation forecasting. Journal of Hydrology 342 (1–2), 199–212. https://doi.org/10.1016/j.jhydrol.2007.05.026.
Rahman
M. A.
,
Yunsheng
L.
&
Sultana
N.
2017
Analysis and prediction of rainfall trends over Bangladesh using Mann–Kendall, spearman's rho tests and ARIMA model
.
Meteorology and Atmospheric Physics
129
(
4
).
https://doi.org/10.1007/s00703-016-0479-4
.
Rathinasamy
M.
,
Khosa
R.
,
Adamowski
J.
,
Partheepan
S. c. G.
,
Anand
J.
&
Narsimlu
B.
2014
Wavelet-based multiscale performance analysis: an approach to assess and improve hydrological models
.
Water Resources Research
50
(
12
),
9721
9737
.
https://doi.org/10.1002/2013WR014650
.
Schwarz
G.
1978
Estimating the dimension of a model
.
The Annals of Statistics
6
(
2
),
461
464
.
Shabri
A.
2014
A hybrid wavelet analysis and adaptive neuro-fuzzy inference system for drought forecasting
.
Applied Mathematical Sciences
8
,
6909
6918
.
https://doi.org/10.12988/ams.2014.48263
.
Shabri
A.
2015
A hybrid model for stream flow forecasting using wavelet and least squares support vector machines
.
Jurnal Teknologi
73
(
1
).
https://doi.org/10.11113/jt.v73.3380.
Soh
Y. W.
,
Koo
C. H.
,
Huang
Y. F.
&
Fung
K. F.
2018
Application of artificial intelligence models for the prediction of standardized precipitation evapotranspiration index (SPEI) at Langat River Basin, Malaysia
.
Computers and Electronics in Agriculture
144
.
https://doi.org/10.1016/j.compag.2017.12.002
.
Soltani
S.
2002
On the use of the wavelet decomposition for time series prediction
.
Neurocomputing
48
(
1–4
),
267
277
.
https://doi.org/10.1016/S0925-2312(01)00648-8
.
Sönmez
F. K.
,
Kömüscü
A. Ü.
,
Erkan
A.
&
Turgu
E.
2005
An analysis of spatial and temporal dimension of drought vulnerability in Turkey using the standardized precipitation index
.
Natural Hazards
35
(
2
),
243
264
.
https://doi.org/10.1007/s11069-004-5704-7
.
Tongal H. 2013 Nonlinear forecasting of stream flows using a chaotic approach and artificial neural networks. Earth Sciences Research Journal 17 (2), 119–126.
Venkata Ramana R., Krishna B., Kumar S. R. & Pandey N. G. 2013 Monthly rainfall prediction using wavelet neural network analysis. Water Resources Management 27 (10), 3697–3711. https://doi.org/10.1007/s11269-013-0374-4.
Wang W. & Ding J. 2003 Wavelet network model and its application to the prediction of hydrology. Nature and Science 1 (1), 67–71.
Widowati, Putro S. P., Koshio S. & Oktaferdian V. 2016 Implementation of ARIMA model to asses seasonal variability macrobenthic assemblages. Aquatic Procedia 7. https://doi.org/10.1016/j.aqpro.2016.07.039.
Wilhite D. A. 2005 Drought and Water Crises. CRC Press, Boca Raton, FL. https://doi.org/10.1201/9781420028386.
Wu H., Hubbard K. G. & Wilhite D. A. 2004 An agricultural drought risk-assessment model for corn and soybeans. International Journal of Climatology 24 (6), 723–741. https://doi.org/10.1002/joc.1028.
Wu X., Zhou J., Yu H., Liu D., Xie K., Chen Y., Hu J., Sun H. & Xing F. 2021 The development of a hybrid wavelet-ARIMA-LSTM model for precipitation amounts and drought analysis. Atmosphere 12 (1), 74. https://doi.org/10.3390/atmos12010074.
Yeh H. F. & Hsu H. L. 2019 Stochastic model for drought forecasting in the Southern Taiwan basin. Water (Switzerland) 11 (10). https://doi.org/10.3390/w11102041.
Yihdego Y., Vaheddoost B. & Al-Weshah R. A. 2019 Drought indices and indicators revisited. Arabian Journal of Geosciences 12 (3). https://doi.org/10.1007/s12517-019-4237-z.
Zargar A., Sadiq R., Naser B. & Khan F. I. 2011 A review of drought indices. Environmental Reviews 19. https://doi.org/10.1139/a11-013.
Zhang G. P. 2003 Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159–175. https://doi.org/10.1016/S0925-2312(01)00702-0.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data