## Abstract

Drought is a natural hazard characterized by a low amount of precipitation in a region. Drought indices have become increasingly important for evaluating the drought-related problems that threaten human well-being. In this study, monthly precipitation data from 1964 to 2013 (about 50 years) for the Jodhpur district in the drought-prone state of Rajasthan, India, were used to derive the effective drought index (EDI). Machine learning models hybridized with evolutionary optimizers, namely the genetic algorithm adaptive neuro-fuzzy inference system (GA-ANFIS) and particle swarm optimization ANFIS (PSO-ANFIS), were used alongside the generalized regression neural network (GRNN) to predict the EDI. Using the partial autocorrelation function (PACF), models for forecasting the monthly EDI were constructed with 2-, 3- and 5-input combinations, and their outcomes were compared using several performance indices. For the 2-input and 3-input combinations, both GA-ANFIS and PSO-ANFIS performed best, with *R*^{2} = 0.75, while for the 5-input combination GA-ANFIS performed better than the other models, with *R*^{2} = 0.78. The results are presented with the aid of scatter plots, Taylor diagrams and violin plots. Overall, the GA-ANFIS and PSO-ANFIS models outperformed the GRNN model.

## HIGHLIGHTS

Effective drought index (EDI) was predicted using soft computing techniques.

Hybrid machine learning algorithms were used.

GA-ANFIS, PSO-ANFIS and GRNN paradigms were used.

The EDI of an arid region in India was used for prediction.

Precipitation data was used for computing the EDI of drought-prone areas.

## INTRODUCTION

Drought is a recurrent extreme climate condition over land that occurs on varying timescales. The primary cause of several types of drought, namely perpetual, seasonal and consequential, is the accumulation of rainfall deficits over various timescales (Thornthwaite 1948). Drought indices, which are designed to provide a concise overall picture of droughts, are frequently derived from large amounts of hydro-climatic data and are used for making decisions on water resource management and water allocation to mitigate the impact of drought. The use of quantitative drought indices for drought management should, ideally, reduce the influence of decision-makers' subjective preferences. A variety of drought definitions have been used in the past (Gibbs 1975; Wilhite & Glantz 1985; Ntale & Gan 2003). Indices measure drought severity using climatic or hydro-meteorological inputs. Understanding the fundamental concepts of drought, drought categorization, drought indices and historical droughts is crucial for developing mitigation measures and reducing the impacts of droughts (Mishra & Singh 2010).

Various researchers have carried out drought studies, the majority of which have utilized machine learning algorithms and focused on hydrological and meteorological droughts. Forecasting droughts has become crucial in numerous locations, and different machine learning techniques have been used for drought forecasting. Water resource management relies largely on drought forecasting to minimize adverse effects (Fung *et al.* 2020). Hence, effective management of water resources is crucial in arid and semi-arid regions (Omar *et al.* 2021). The management of water resources, the scheduling of irrigation and eco-environmental management all depend on accurate drought forecasting (Boken *et al.* 2005; Wambua *et al.* 2014; Basak *et al.* 2022), and the recent climate change scenario can increase the threat to water availability in various sectors (Jee *et al.* 2019). Drought indicators such as rainfall, temperature, streamflow, ground and reservoir water levels, soil moisture and snowpack, calculated using dynamic or statistical models (Keyantash & Dracup 2002; Yoon *et al.* 2012; Belayneh *et al.* 2014), are typically the foundation for forecasting drought-related variables (Mishra *et al.* 2009; Madadgar & Moradkhani 2014).

Worldwide, there are over 150 drought indices and indicators in use (Zargar *et al.* 2011; Halwatura *et al.* 2017), of which approximately 40 are frequently utilized as research tools in drought-prone regions (Hakam *et al.* 2022). Drought indices were formulated to detect the onset of drought based on spatial and temporal variability (Morid *et al.* 2006). The effective drought index (EDI) was developed by Byun & Wilhite (1999) to overcome the shortcomings of the Standardized Precipitation Index (SPI) and is an intensive measure for forecasting drought. A meteorological and hydrological drought index can be used in long-term drought forecasting and early warning systems (Edossa *et al.* 2010). Using the EDI to predict drought makes it easier to assess the negative impacts of drought on essential water resources, agriculture, ecosystems and hydrology (Deo & Şahin 2015). The evaluation of drought impact using meteorological and vegetation indices demonstrates that the integrated analysis of ground-measured data and satellite data has a high potential for drought study (Jain *et al.* 2010). Meteorological drought indices, evaluated for dependability under changing climatic conditions, show a good correlation for all time steps (Djellouli *et al.* 2016). Comparative studies show that drought indices such as the EDI and the SPI can both be used to represent drought events (Huang *et al.* 2016). According to recent studies (Byun & Kim 2010; Li *et al.* 2019), the EDI is superior to the SPI at simulating the gradual onset of drought. The EDI assesses drought more precisely than many other indices, and it is more specific and consistent in the study of drought (Wambua *et al.* 2018). A study across different climate regimes, including temperate, semi-arid, dry and sub-humid regions, reported that the EDI is the best drought index and a preferable choice for representing drought events (Jain *et al.* 2015).

Machine learning algorithms are favourable for drought prediction as they are less complicated than dynamic or physical models and take less time (Mokhtar *et al.* 2021). Khosravi *et al.* (2017) employed machine learning algorithms to model the time-series behaviour of meteorological drought indices. Under climate change projections, machine learning algorithms have become more reliable, with outcomes that reflect the variability of drought indices (Ahmadalipour *et al.* 2017). In drought prediction, soft computing methods, particularly hybrid models, perform better than traditional models (Başakın *et al.* 2021). The use of hybrid machine learning models for drought prediction could assist in taking the necessary measures to lessen the effects of drought in various locations (Malik *et al.* 2021).

Input variables such as rainfall, potential evapotranspiration, storage reservoir volume, streamflow and soil moisture content are used to assess the effectiveness of drought forecasting models (Barua *et al.* 2012). Neural networks and stochastic models have been developed for drought forecasting and compared with one another (Morid *et al.* 2007; Barua *et al.* 2012). Data-driven models that forecast drought using machine learning techniques have been found to be effective across different locations and environments (Belayneh & Adamowski 2012, 2013). Utilizing lagged values of precipitation, temperature and drought-related input datasets may also help to improve the efficiency of drought forecasting models (Azimi *et al.* 2022). Machine learning models can effectively forecast drought using multiple drought-related attributes derived from precipitation, surface discharge and satellite-derived land cover indices (Tan & Perkowski 2015). Precise forecasting of drought incidence and its key parameters remains a significant challenge, and machine learning algorithms can improve the drought studies that inform mitigation strategies (Mishra & Nagarajan 2012). Various machine learning models could serve as suitable drought forecasting tools using rainfall time-series data in arid to semi-arid regions, supporting sustainable water resource planning and drought preparedness (Mishra & Desai 2005; Shatanawi *et al.* 2013).

The EDI has been used in only a limited number of studies, and Rajasthan, one of the driest states in India, has not been explored using soft computing techniques for drought or climate extremes. Hence, the present study aims to forecast regional drought at six stations within the Jodhpur district of Rajasthan state, India. Models based on an adaptive neuro-fuzzy inference system (ANFIS) optimized by nature-based optimizers were applied for time-series forecasting of the monthly EDI. The genetic algorithm (GA) and particle swarm optimizer (PSO) were used to optimize the premise and consequent parameters of ANFIS. Both algorithms have demonstrated their ability to search for globally optimal solutions. A generalized regression neural network (GRNN) was also adopted for comparative performance evaluation against the ANFIS-based models.

## MATERIALS AND METHODS

### Study area and data description

Jodhpur is situated in the western part of Rajasthan state. Jodhpur district comprises about 11.60% of Rajasthan's total area and includes parts of the Thar Desert, India. The district is located between 26° and 27°37′ North latitude and 72°55′ and 73°52′ East longitude, with a geographical area of 22,850 km^{2}. It lies at a height of 250–300 m above mean sea level. The climate of Jodhpur is hot and arid. From March to October it is extremely hot, and it remains almost dry throughout the year. There is a brief rainy period from late June to September (Köppen climate classification system).

Station | Location | Max (mm) | Min (mm) | Mean (mm) | Standard deviation (mm) | Coefficient of variation |
---|---|---|---|---|---|---|
Jodhpur | 26°18′N 73°19′47.99″E | 376.86 | 0.00 | 30.36 | 59.02 | 1.94 |
Osian | 26°43′1.19″N 72°55′1.2″E | 296.12 | 0.00 | 27.88 | 51.70 | 1.85 |
Phalodi | 27°75′8.79″N 72°22′1.2″E | 282.43 | 0.00 | 19.70 | 39.02 | 1.98 |
Bilara | 26°10′58.79″N 73°42′E | 495.04 | 0.00 | 34.52 | 67.63 | 1.96 |
Shergarh | 26°19′58.79″N 72°17′59.99″E | 313.72 | 0.00 | 21.55 | 44.45 | 2.06 |
Pipar Road | 26°19′58.79″N 73°30′E | 705.32 | 0.00 | 34.48 | 70.14 | 2.03 |


### Methods

#### Computation of EDI

The EDI is a robust index that tracks accumulated water resources by applying a time-dependent weighting to precipitation data in order to analyse drought. It quantifies droughts in terms of drought classes composed of positive and negative values, where positive values indicate wetness and negative values indicate dryness (Byun & Wilhite 1999). The EDI for monthly precipitation is calculated following Byun & Wilhite (1999):

- i.
- ii.
- iii.
- iv.
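The four steps can be sketched as follows, assuming the commonly cited monthly form of the Byun & Wilhite (1999) procedure: (i) effective precipitation, $EP_i = \sum_{n=1}^{D} \left( \frac{1}{n} \sum_{m=1}^{n} P_{i-m+1} \right)$; (ii) its calendar-month mean, MEP; (iii) the deviation, $DEP = EP - MEP$; and (iv) the standardized value, $EDI = DEP / SD(DEP)$. The summation duration `duration=12`, the per-calendar-month standardization and all function names are assumptions made for illustration; they are not taken from the study's own implementation.

```python
import numpy as np

def effective_precipitation(precip, duration=12):
    """EP_i = sum over n=1..D of the mean of the n most recent monthly totals.

    Months without a full look-back window are left as NaN.
    """
    precip = np.asarray(precip, dtype=float)
    ep = np.full(precip.shape, np.nan)
    n = np.arange(1, duration + 1)
    for i in range(duration - 1, len(precip)):
        recent_first = precip[i - duration + 1 : i + 1][::-1]
        ep[i] = np.sum(np.cumsum(recent_first) / n)
    return ep

def edi(precip, duration=12):
    """Standardize EP by calendar month: EDI = (EP - MEP) / SD(EP - MEP)."""
    ep = effective_precipitation(precip, duration)
    calendar_month = np.arange(len(ep)) % 12  # assumes the series starts in January
    out = np.full(ep.shape, np.nan)
    for m in range(12):
        idx = (calendar_month == m) & ~np.isnan(ep)
        dep = ep[idx] - ep[idx].mean()   # DEP = EP - MEP
        sd = dep.std()
        out[idx] = dep / sd if sd > 0 else 0.0
    return out
```

Because recent months appear in more of the running means than older months, this weighting gives recent rainfall a larger influence on the index, which is the behaviour the text attributes to the EDI.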

The limits of the EDI range from less than −2 to more than +2, where negative values represent dryness and positive values represent wetness. The EDI for dryness has been categorized into four classes:

- i. Extreme drought: EDI ≤ −2
- ii. Severe drought: −2 < EDI ≤ −1.5
- iii. Moderate drought: −1.5 < EDI ≤ −1
- iv. Normal: −1 < EDI ≤ +1
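The drought classes above translate into a simple threshold lookup. The sketch below is illustrative only; the function name `classify_edi` and the label used for values above +1 (which the list does not categorize) are assumptions.

```python
def classify_edi(value):
    """Map an EDI value to the drought class used in the text."""
    if value <= -2.0:
        return "extreme drought"
    if value <= -1.5:
        return "severe drought"
    if value <= -1.0:
        return "moderate drought"
    if value <= 1.0:
        return "normal"
    # Values above +1 are not categorized in the list; labelled here by assumption.
    return "wet"
```

For example, a month with EDI = −1.7 falls in the severe class, while EDI = −1.0 sits on the boundary and is counted as moderate.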

#### Model development and performance analysis

##### Adaptive neuro-fuzzy inference system

ANFIS is a data-driven learning approach that models human knowledge and reasoning processes. It maps inputs to outputs through fuzzy logic and a highly interconnected neural network, and the two systems are combined to collectively achieve a good result (Mehrabi *et al.* 2012). ANFIS is a robust artificial intelligence algorithm developed by Jang (1993) that creates a fuzzy inference system with adaptive capabilities. It combines the interpretability of fuzzy logic with the learning and approximation capabilities of neural networks. The ANFIS structure models uncertainty in a way that corresponds to human thinking, reasoning and perception, with its nodes related to each other by a set of fuzzy rules. In this study, the parameters of ANFIS were optimized using GA and particle swarm optimization (PSO).
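To make the structure concrete, the forward pass of a first-order Sugeno-type ANFIS in the sense of Jang (1993) can be sketched in a few lines. Gaussian membership functions, one membership function per input per rule, and the names `centers`, `sigmas` and `consequents` are assumptions chosen for illustration; the study itself used MATLAB with FCM clustering, which this sketch does not reproduce.

```python
import numpy as np

def anfis_forward(x, centers, sigmas, consequents):
    """Evaluate a first-order Sugeno fuzzy system for one input vector.

    x:           (n_inputs,)
    centers:     (n_rules, n_inputs)   premise parameters
    sigmas:      (n_rules, n_inputs)   premise parameters
    consequents: (n_rules, n_inputs+1) linear coefficients plus bias per rule
    """
    x = np.asarray(x, dtype=float)
    # Layer 1: Gaussian membership grade of each input in each rule
    mu = np.exp(-0.5 * ((x - centers) / sigmas) ** 2)
    # Layer 2: rule firing strength (product of membership grades)
    w = mu.prod(axis=1)
    # Layer 3: normalized firing strengths
    w_norm = w / w.sum()
    # Layer 4: linear consequent of each rule
    f = consequents[:, :-1] @ x + consequents[:, -1]
    # Layer 5: weighted sum of rule outputs
    return float(w_norm @ f)
```

In a hybrid scheme, an optimizer such as GA or PSO would search over the flattened `centers`, `sigmas` and `consequents` to minimize the training error of this mapping.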

##### Genetic algorithm

GA is a stochastic search process based on the evolutionary theory of natural selection and genetics. It gives favourable or near-optimal solutions for combinatorial optimization problems such as travelling salesman problems, scheduling problems, heuristic search and process planning problems (Goldberg 1989). The basic concept of GA is to apply genetic operators such as selection, crossover and mutation to improve a population of candidate solutions and search for the best one by artificially imitating the natural evolution process.

The advantages of GA include (1) fast convergence to a near-global optimum, (2) superior global searching capability in spaces with a complex search surface and (3) applicability to search spaces where gradient information cannot be used. Hence, the present study searches for the meta-parameters of ANFIS using a GA. GA also supports multi-objective optimization, is useful for noisy data, improves its solutions over successive generations, is inherently parallel and easily distributed, and is easy and flexible to use in hybrid applications.
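The selection, crossover and mutation loop can be illustrated with a minimal real-coded GA. This is a sketch under stated assumptions, not the study's MATLAB implementation: tournament selection is swapped in for simplicity, the blend crossover, Gaussian mutation step and bounds are illustrative choices, and every name (`genetic_minimize`, `crossover_frac`, and so on) is hypothetical.

```python
import numpy as np

def genetic_minimize(cost, n_vars, pop_size=25, n_iter=200, crossover_frac=0.4,
                     mutation_frac=0.7, mutation_rate=0.15,
                     bounds=(-10.0, 10.0), seed=0):
    """Minimize `cost` over a box with an elitist real-coded GA."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, n_vars))
    costs = np.array([cost(p) for p in pop])
    history = [costs.min()]
    n_pairs = round(crossover_frac * pop_size / 2)
    n_mut = round(mutation_frac * pop_size)

    def tournament():
        # Binary tournament: the cheaper of two random individuals wins.
        a, b = rng.choice(pop_size, size=2, replace=False)
        return a if costs[a] < costs[b] else b

    for _ in range(n_iter):
        children = []
        for _ in range(n_pairs):
            i, j = tournament(), tournament()
            alpha = rng.uniform(0.0, 1.0, n_vars)  # blend crossover weights
            children.append(alpha * pop[i] + (1 - alpha) * pop[j])
            children.append(alpha * pop[j] + (1 - alpha) * pop[i])
        for _ in range(n_mut):
            child = pop[rng.integers(pop_size)].copy()
            mask = rng.random(n_vars) < mutation_rate
            mask[rng.integers(n_vars)] = True  # mutate at least one gene
            child[mask] += 0.1 * (hi - lo) * rng.standard_normal(mask.sum())
            children.append(np.clip(child, lo, hi))
        children = np.array(children)
        child_costs = np.array([cost(c) for c in children])
        # Elitist replacement: keep the best pop_size of parents + children.
        merged = np.vstack([pop, children])
        merged_costs = np.concatenate([costs, child_costs])
        keep = np.argsort(merged_costs)[:pop_size]
        pop, costs = merged[keep], merged_costs[keep]
        history.append(costs[0])
    return pop[0], costs[0], history
```

Because parents compete with their children for survival, the best cost in `history` never increases from one generation to the next.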

##### Particle swarm optimizer

The PSO technique was developed by Kennedy & Eberhart (1995). PSO is a population-based stochastic optimization technique that shares some similarities with GA. It mimics the collective behaviour of social animals that move as a group without a designated leader. The method iteratively improves candidate solutions with respect to a given measure of quality. PSO consists of a swarm of particles, where each particle represents a potential solution (Omran *et al.* 2005) and moves through the search space according to its velocity. Compared with other computational techniques, PSO has only a few parameters to adjust and is therefore easy to implement. PSO has been widely used in function optimization, artificial neural network training, fuzzy system control and other areas where GA is applied.
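The velocity-driven search can be sketched as below, using the widely used inertia-weight form of the update, $v \leftarrow w\,v + c_1 r_1 (p_{best} - x) + c_2 r_2 (g_{best} - x)$. The acceleration coefficients `c1` and `c2`, the bounds and all names are assumptions for illustration; only the inertia weight and its damping ratio correspond to parameters named in the text.

```python
import numpy as np

def pso_minimize(cost, n_vars, n_particles=25, n_iter=200, w=1.0, w_damp=0.99,
                 c1=1.5, c2=2.0, bounds=(-10.0, 10.0), seed=0):
    """Minimize `cost` over a box with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_particles, n_vars))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_cost = np.array([cost(p) for p in x])
    g = pbest_cost.argmin()
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Pull each particle towards its own best and the swarm's best position.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        costs = np.array([cost(p) for p in x])
        improved = costs < pbest_cost
        pbest[improved] = x[improved]
        pbest_cost[improved] = costs[improved]
        g = pbest_cost.argmin()
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
        history.append(gbest_cost)
        w *= w_damp  # damp the inertia weight to shift from exploration to exploitation
    return gbest, gbest_cost, history
```

The short parameter list (swarm size, inertia weight, damping and two acceleration coefficients) is what makes PSO comparatively easy to set up.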

##### Generalized regression neural network

GRNN comprises the input, hidden, summation and division layers and the output, as shown in Figure 2.

GRNN is applicable to the modelling of dynamic plants (Seng *et al.* 2002), aerodynamic force prediction (Yang & Ko 1996; Yao *et al.* 2012), solar photovoltaic power forecasting (Alhakeem *et al.* 2015) and predictive modelling of non-linear systems (Song & Ren 2005).
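The layer structure maps directly onto a kernel-weighted average. The sketch below assumes the standard GRNN formulation with a Gaussian kernel and a single spread parameter, here called `sigma`; the function name and the default value are illustrative and are not taken from the study.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """Predict with a GRNN. X_train is (n, d), y_train is (n,), X_query is (q, d)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Hidden (pattern) layer: Gaussian activation from the squared distance
    # between each query point and each stored training sample.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: weighted sum of targets and plain sum of weights.
    numerator = k @ y_train
    denominator = k.sum(axis=1)
    # Division layer: the ratio is the network output.
    return numerator / denominator
```

A small spread makes the prediction follow the nearest training samples closely, whereas a large spread smooths it towards the mean of the training targets, so the spread is the main quantity to tune.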

### Model input combinations

where *P*_{t} denotes the EDI at a given month ‘t’.

The GA-ANFIS, PSO-ANFIS and GRNN models were developed using MATLAB (2015). The models were trained using 70% of the data (1964–1998), and the remaining 30% (1999–2013) was used for testing. The optimal input combination was identified for the GA-ANFIS, PSO-ANFIS and GRNN models. The GA-ANFIS and PSO-ANFIS models were trained using the optimum values from Fuzzy C-Means (FCM) clustering. Each model was then simulated using the testing data and evaluated by comparing its predicted values with the observed values. The parameters involved in GA-ANFIS are maximum iteration, population size, crossover percentage, mutation percentage, mutation rate, selection pressure and number of clusters. In PSO-ANFIS, the parameters are maximum iteration, population size, damping ratio, inertia weight and number of clusters. In GRNN, the important parameter is the *i*-value, reported as the best value found during the analysis.
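The data preparation described here can be sketched as building a matrix of lagged EDI values and splitting it chronologically. The specific lags passed in below are hypothetical placeholders, since the lag sets selected by the PACF analysis are not reproduced here, and the helper names are likewise assumptions.

```python
import numpy as np

def make_lagged_dataset(series, lags):
    """Return (X, y) where column j of X holds the series delayed by lags[j]."""
    series = np.asarray(series, dtype=float)
    max_lag = max(lags)
    X = np.column_stack(
        [series[max_lag - lag : len(series) - lag] for lag in lags]
    )
    y = series[max_lag:]
    return X, y

def chronological_split(X, y, train_fraction=0.7):
    """Split without shuffling so that the test period follows the training period."""
    n_train = int(round(train_fraction * len(y)))
    return (X[:n_train], y[:n_train]), (X[n_train:], y[n_train:])

# Example with a placeholder series of 600 months (1964-2013) and two lags.
edi_series = np.sin(np.arange(600) / 6.0)
X, y = make_lagged_dataset(edi_series, lags=(1, 2))
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Keeping the split chronological matters for time-series forecasting: shuffling would let information from the test period leak into training through the lagged inputs.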

### Various statistical indices assessment of the models

Different statistical indices were used to assess the performance of all the models, namely NRMSE, NNSE, MAE and NMB.

- i.
- ii.
- iii.
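The indices can be sketched as follows, assuming common textbook definitions: NRMSE as the root-mean-square error divided by the standard deviation of the observations, NNSE as $1/(2 - NSE)$ with NSE the Nash–Sutcliffe efficiency, MAE as the mean absolute error and NMB as the summed bias divided by the summed observations. The choice of normalization for NRMSE and NMB is an assumption rather than a definition confirmed by the text, and the function names are illustrative.

```python
import numpy as np

def _as_arrays(obs, pred):
    return np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)

def nrmse(obs, pred):
    """RMSE normalized by the (population) standard deviation of observations."""
    obs, pred = _as_arrays(obs, pred)
    return np.sqrt(np.mean((pred - obs) ** 2)) / obs.std()

def nse(obs, pred):
    """Nash-Sutcliffe efficiency."""
    obs, pred = _as_arrays(obs, pred)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def nnse(obs, pred):
    """Normalized NSE, mapped to the interval (0, 1]."""
    return 1.0 / (2.0 - nse(obs, pred))

def mae(obs, pred):
    """Mean absolute error."""
    obs, pred = _as_arrays(obs, pred)
    return np.mean(np.abs(pred - obs))

def nmb(obs, pred):
    """Normalized mean bias; unstable when the observations sum to nearly zero."""
    obs, pred = _as_arrays(obs, pred)
    return np.sum(pred - obs) / np.sum(obs)

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    obs, pred = _as_arrays(obs, pred)
    return np.corrcoef(obs, pred)[0, 1] ** 2
```

Under these definitions, $NSE = 1 - NRMSE^2$, so NRMSE and NNSE carry the same information on different scales; lower NRMSE and MAE and higher NNSE and *R*^{2} indicate better forecasts.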

## RESULTS AND DISCUSSION

The study is performed for the six rainfall stations in Jodhpur, Rajasthan to forecast the EDI. The EDI has been computed using the monthly rainfall data. The outputs from the three machine learning models, i.e., GA-ANFIS, PSO-ANFIS and GRNN are assessed using various performance indices such as NRMSE, NNSE, MAE and NMB as represented in tables and various graphs. A comparison of the models in all six rainfall stations has been done using the performance metrics, where the results show that the hybrid models GA-ANFIS and PSO-ANFIS have better prediction accuracy as compared with the GRNN model. Tables 2 and 3 describe the parameter settings of GA-ANFIS, PSO-ANFIS and GRNN adopted for the analysis. The parameters of the models are set considering the requirement of dataset inputs.

Parameters | GA-ANFIS | Parameters | PSO-ANFIS |
---|---|---|---|
Maximum iteration | 1,000 | Maximum iteration | 1,000 |
Population size | 25 | Population size | 25 |
Crossover percentage | 0.4 | Damping ratio | 0.99 |
Mutation percentage | 0.7 | Inertia weight | 1 |
Mutation rate | 0.15 | Number of clusters | 2, 3, 5 |
Selection pressure, β | 8 | | |
Number of clusters | 2, 3, 5 | | |


Station | GRNN *i*-value (2 inputs) | GRNN *i*-value (3 inputs) | GRNN *i*-value (5 inputs) |
---|---|---|---|
Jodhpur | 0.28 | 0.52 | 0.66 |
Osian | 0.24 | 0.37 | 0.54 |
Phalodi | 0.2 | 0.25 | 0.53 |
Bilara | 0.45 | 0.42 | 0.8 |
Shergarh | 0.27 | 0.29 | 0.51 |
Pipar Road | 0.2 | 0.19 | 0.46 |


Table 4 presents the performance assessment of the GA-ANFIS, PSO-ANFIS and GRNN models. Considering evaluation metrics such as the coefficient of determination (*R*^{2}), NRMSE, normalized Nash–Sutcliffe efficiency (NNSE) and mean absolute error (MAE), the 5-input combination yields better outcomes for each of the three models (GA-ANFIS, PSO-ANFIS and GRNN). Comparing the results for each station, for the 2-input combination the Osian rainfall station shows the best prediction accuracy, with *R*^{2} = 0.75, NRMSE = 0.503, NNSE = 0.7971 and MAE = 0.3106 for the GA-ANFIS model. The Shergarh rainfall station yields better results for the 3- and 5-input combinations, with *R*^{2} = 0.75 and 0.78, NRMSE = 0.5105 and 0.4693, NNSE = 0.7923 and 0.8186 and MAE = 0.2825 and 0.2678, respectively. The comparative analysis of the three models reveals that the prediction accuracy increased with the number of inputs. An *R*^{2} value between 0.7 and 0.9 indicates a high degree of correlation, a value between 0.5 and 0.7 indicates a moderate degree of correlation and a value between 0.3 and 0.5 indicates a low degree of correlation (Belayneh & Adamowski 2013).

Station | Statistic | GA-ANFIS (2 inputs) | GA-ANFIS (3 inputs) | GA-ANFIS (5 inputs) | PSO-ANFIS (2 inputs) | PSO-ANFIS (3 inputs) | PSO-ANFIS (5 inputs) | GRNN (2 inputs) | GRNN (3 inputs) | GRNN (5 inputs) |
---|---|---|---|---|---|---|---|---|---|---|
Jodhpur | NRMSE | 0.5101 | 0.5139 | 0.5321 | 0.529 | 0.5144 | 0.5484 | 0.6208 | 0.6247 | 0.5937 |
Jodhpur | NNSE | 0.7926 | 0.7901 | 0.7783 | 0.7804 | 0.7898 | 0.7677 | 0.7207 | 0.7181 | 0.7382 |
Jodhpur | MAE | 0.2842 | 0.2825 | 0.2776 | 0.2986 | 0.2841 | 0.2916 | 0.3248 | 0.3435 | 0.3057 |
Osian | NRMSE | 0.503 | 0.5158 | 0.5186 | 0.5081 | 0.5141 | 0.5156 | 0.5738 | 0.5703 | 0.6017 |
Osian | NNSE | 0.7971 | 0.7889 | 0.787 | 0.7939 | 0.79 | 0.789 | 0.7512 | 0.7535 | 0.733 |
Osian | MAE | 0.3106 | 0.3235 | 0.3008 | 0.321 | 0.3135 | 0.304 | 0.3346 | 0.3429 | 0.3222 |
Phalodi | NRMSE | 0.5204 | 0.5315 | 0.5045 | 0.532 | 0.5167 | 0.5551 | 0.5601 | 0.5405 | 0.5805 |
Phalodi | NNSE | 0.7859 | 0.7787 | 0.7961 | 0.7784 | 0.7883 | 0.7634 | 0.7601 | 0.7732 | 0.7466 |
Phalodi | MAE | 0.3566 | 0.3545 | 0.3164 | 0.3617 | 0.3548 | 0.3401 | 0.3578 | 0.3579 | 0.3748 |
Bilara | NRMSE | 0.6294 | 0.5889 | 0.5845 | 0.6308 | 0.5643 | 0.5897 | 0.68 | 0.6788 | 0.652 |
Bilara | NNSE | 0.7151 | 0.7424 | 0.7442 | 0.7317 | 0.7574 | 0.7408 | 0.6825 | 0.6833 | 0.7004 |
Bilara | MAE | 0.3432 | 0.3178 | 0.2965 | 0.3304 | 0.3166 | 0.3168 | 0.3643 | 0.3688 | 0.3565 |
Shergarh | NRMSE | 0.5241 | 0.5105 | 0.4693 | 0.5243 | 0.5222 | 0.4932 | 0.5636 | 0.5882 | 0.6328 |
Shergarh | NNSE | 0.7835 | 0.7923 | 0.8186 | 0.7834 | 0.7847 | 0.8034 | 0.7579 | 0.7418 | 0.7128 |
Shergarh | MAE | 0.3595 | 0.366 | 0.3264 | 0.3534 | 0.3597 | 0.3378 | 0.3702 | 0.3917 | 0.4122 |
Pipar Road | NRMSE | 0.5795 | 0.5712 | 0.5604 | 0.5786 | 0.5756 | 0.5467 | 0.6352 | 0.6104 | 0.6378 |
Pipar Road | NNSE | 0.7474 | 0.7529 | 0.7599 | 0.7481 | 0.75 | 0.7688 | 0.7113 | 0.7274 | 0.7096 |
Pipar Road | MAE | 0.2978 | 0.2989 | 0.2964 | 0.2944 | 0.2925 | 0.2678 | 0.3159 | 0.2976 | 0.3213 |


A scatter plot of the observed and forecasted drought index using the 2-input combinations is shown in Figure 5. Considering *R*^{2} and the measures presented in Table 4 (NRMSE, NNSE and MAE), which were obtained from the observed and forecasted EDI values, the Osian rainfall station gives the best outcome for the 2-input combination with the GA-ANFIS model, with an NRMSE of 0.503. An NRMSE value close to zero indicates that a model has a high degree of accuracy. The GA-ANFIS and PSO-ANFIS outcomes are nearly identical to each other. With *R*^{2} = 0.75, both the GA-ANFIS and PSO-ANFIS models give notably better EDI forecasts at the Osian rainfall station. For the Jodhpur, Phalodi and Shergarh stations, both GA-ANFIS and PSO-ANFIS also achieved *R*^{2} > 0.7, indicating that the forecasts had a high degree of precision in comparison with the GRNN model. For the Bilara and Pipar Road stations, GA-ANFIS and PSO-ANFIS have *R*^{2} < 0.70, indicating a moderate degree of correlation.

Figure 6 shows a scatter plot of the observed and forecasted drought index using the 3-input combinations. The performance of the three models, GA-ANFIS, PSO-ANFIS and GRNN, is shown for the six rainfall stations. The forecasts for the 3-input combinations exhibit the same trend as those for the 2-input combinations. In terms of *R*^{2}, NRMSE and NNSE, the GA-ANFIS model produced the best forecasts at the Shergarh station for the 3-input combinations, as shown in Table 4. For the Jodhpur, Osian, Phalodi and Shergarh stations, the EDI forecasts using the 3-input combination performed better with GA-ANFIS and PSO-ANFIS, with *R*^{2} > 0.70 signifying a high degree of correlation. For GA-ANFIS and PSO-ANFIS, the NRMSE results range from 0.5 to 0.6, while for the GRNN they vary from 0.54 to 0.73, showing a poorer level of precision for the GRNN models of the 3-input combinations. Table 4 displays the outcomes for the six rainfall stations as well as the model configurations.

Figure 7 shows a scatter plot of the observed and forecasted drought index using the 5-input combinations. In comparison with the 2- and 3-input combinations, the prediction accuracy has improved for the 5-input combination. For the forecasts of EDI at the Shergarh station, the GA-ANFIS model had the best results of 0.78, 0.4693 and 0.8186 in terms of *R*^{2}, NRMSE and NNSE, respectively. At some stations, the results of the GA-ANFIS model are very similar to those of the PSO-ANFIS model. For example, GA-ANFIS and PSO-ANFIS both had an *R*^{2} of 0.73 for forecasts using the 5-input combination at the Osian station, whereas the NRMSE differed slightly, at 0.5158 and 0.5186, respectively.

The findings of this study suggest that GA-ANFIS, PSO-ANFIS and GRNN can be used to forecast the EDI with different input combinations (Azimi *et al.* 2022) in the arid region. Overall, the GA-ANFIS and PSO-ANFIS models perform better than the GRNN model at most of the rainfall stations in the study area. The findings from all the models show that, at all the stations, the correlation between observed and forecasted values in terms of *R*^{2} grows considerably as the number of forecast inputs increases. At the Osian rainfall station, the GA-ANFIS and PSO-ANFIS models give better results for the 2-input combinations, revealing a high degree of correlation. The Shergarh rainfall station with the GA-ANFIS model yields the best results among all six rainfall stations for the 3- and 5-input combinations, indicating a high degree of correlation. In terms of *R*^{2}, NRMSE, NNSE and MAE, the prediction results also show that the hybrid models, i.e., GA-ANFIS and PSO-ANFIS, are more effective at forecasting the EDI at all rainfall stations.

Overall, in this study, the hybrid models GA-ANFIS and PSO-ANFIS outperformed the standalone GRNN model in forecasting drought. Hybrid models are developed by combining different machine learning models and have proved to be better than standalone methods (Adnan *et al.* 2021). GA and PSO are high-level problem-solving algorithms (Sörensen & Glover 2013), which are combined with ANFIS to generate high-quality solutions. Hence, several studies have recommended hybrid models for increasing forecast accuracy, and such models are becoming more popular in drought forecasting and other hydrological studies (Hajirahimi & Khashei 2022).

The Taylor diagram (Taylor 2001) provides a concise statistical summary of how closely patterns resemble one another in terms of their correlation, their root-mean-square difference and the ratio of their variances. Here it shows the comparative assessment of the different models for the different input combinations, graphically representing the correlation coefficient, root-mean-squared deviation (RMSD) and standard deviation of each model against the observations. The Taylor diagram can be used to track changes in model performance and to illustrate the relative merits of various models. It quantifies the degree of correspondence between the modelled and observed behaviour.
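The three statistics can be shown on a single diagram because they are tied together by the law-of-cosines relation given by Taylor (2001), $E'^2 = \sigma_f^2 + \sigma_r^2 - 2\sigma_f \sigma_r R$, where $E'$ is the centred RMSD, $\sigma_f$ and $\sigma_r$ are the standard deviations of the forecast and the reference, and $R$ is their correlation. A minimal sketch follows; the function name and the synthetic data are illustrative assumptions.

```python
import numpy as np

def taylor_statistics(obs, pred):
    """Return (sd_obs, sd_pred, correlation, centred RMSD) for a Taylor diagram."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    sd_obs, sd_pred = obs.std(), pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    # Centred RMSD: the means are removed before differencing.
    crmsd = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
    return sd_obs, sd_pred, r, crmsd

# Synthetic check of the identity on placeholder data.
rng = np.random.default_rng(1)
obs = rng.standard_normal(180)
pred = 0.8 * obs + 0.3 * rng.standard_normal(180)
sd_o, sd_p, r, e = taylor_statistics(obs, pred)
identity_gap = e ** 2 - (sd_o ** 2 + sd_p ** 2 - 2 * sd_o * sd_p * r)
```

Up to floating-point error, `identity_gap` is zero, which is why a model's position on the diagram fixes all three statistics at once.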

Figures S1–S3 present the Taylor diagrams for all six rainfall stations with 2-, 3- and 5-input combinations of the GA-ANFIS, PSO-ANFIS and GRNN models. From the Taylor diagrams, it may be inferred that, for the 2-input combination, the GA-ANFIS and PSO-ANFIS models for the Osian rainfall station perform better in terms of correlation coefficient, RMSD and standard deviation. For the 3-input combination, the GA-ANFIS model performs better at the Jodhpur and Shergarh rainfall stations. For the 5-input combination, the GA-ANFIS model for the Shergarh rainfall station performs better across the different parameters of the Taylor diagram. The higher the correlation and the lower the NRMSE values, the better the result.

Violin plots are similar to box plots but more informative. The violin plot is a single representation that combines the box plot and the density trace (or smoothed histogram) in a synergistic way to reveal the structure of a dataset (Hintze & Nelson 1998). Violin plots give a better indication of the shape of the distribution by showing the existence of clusters in the data. The top tip of the violin represents the maximum data value, the bottom tip represents the minimum data value, and the width of the violin indicates how much data has accumulated at each value.

## SUMMARY AND CONCLUSIONS

The present study utilized India Meteorological Department (IMD) precipitation data from 1964 to 2013 to evaluate the prediction of the monthly EDI. As Jodhpur is an arid region, forecasting drought there using the EDI is particularly relevant. In this study, soft computing techniques, namely GA-ANFIS, PSO-ANFIS and GRNN, were developed to forecast the EDI at six rainfall stations in Jodhpur, Rajasthan, India. The input combinations were selected based on the PACF analysis of the EDI dataset, and the outcomes of the monthly EDI forecasts were compared among the 2-, 3- and 5-input combinations. Scatter plots, Taylor diagrams and violin plots were used to present the performance of the different input combinations. The comparison of the models shows that GA-ANFIS and PSO-ANFIS, which are hybrid machine learning algorithms, perform better than the GRNN model and can produce drought forecasts with high prediction accuracy. The research can be extended by employing several drought indices as quantifiable parameters for drought prediction with a variety of machine learning approaches. In addition, raw precipitation, air temperature and large-scale atmospheric circulation indices such as the El Niño/La Niña Southern Oscillation (ENSO) and the Pacific Decadal Oscillation (PDO) could be used as inputs to assess whether they improve the performance of machine learning models.

## DATA AVAILABILITY STATEMENT

Data cannot be shared openly but are available on request from authors.

## CONFLICT OF INTEREST

The authors declare there is no conflict.