## Abstract

Simulation and prediction of precipitation time series changes are important for revealing global climate change patterns and understanding surface hydrological processes. However, precipitation is influenced by a variety of factors together, showing the characteristics of nonlinear variation patterns. Given that backpropagation (BP) neural network has a strong mapping ability for nonlinear fitting, we consider using BP neural network for precipitation prediction, then use Sparrow Search Algorithm (SSA) to optimize BP network initial threshold and weight information to improve the efficiency of precipitation prediction. To further enhance model predictive performance, the Markov model is employed to predict the residual series of the SSA-BP model, so as to finally construct a combined SSA-BP-Markov model of precipitation. In this paper, the model is used to simulate the rainfall prediction in Zhengzhou City, Henan Province, China, and to compare and analyze with the other traditional models. The empirical prediction results show that the SSA-BP-Markov model is more accurate and the convergence of the algorithm is better. The model provides a new way of thinking for precipitation prediction and is also useful for predicting precipitation in other regions.

## HIGHLIGHTS

A combined SSA-BP-Markov model is applied to the precipitation prediction, which improves the prediction accuracy.

The decision factors are more comprehensive, including both meteorological data information and rainfall data of previous years.

The SSA method optimizes the parameters of BP network, and Markov model is applied to analyze the residuals, resulting in outputs that are more comparable with real values.

### Graphical Abstract

## INTRODUCTION

Precipitation is a natural process that occurs when water vapor in the atmosphere condenses and falls to the ground in solid or liquid form. It is the result of the combined impacts of geographical region, circulation patterns, climatic system conditions, and other factors. Precipitation is the original source of surface flow and the primary source of groundwater resources; and it is the most fundamental component of the water cycle and the principal source of water for agriculture. Precipitation is a critical indicator for deciding the quantity of available water resources in a region and is also a key factor in the water balance equation. According to the United Nations Food and Agriculture Organization (FAO), more than 80% of the globe's agricultural land depends on natural precipitation, and a decrease in crop yields due to precipitation deficit would directly threaten the basis of human survival (FAO 2020). In addition, the uneven spatial distribution and temporal variability of precipitation are the direct causes of floods, floods, and droughts. With the intensification of human activities, a large amount of greenhouse gas emissions and rising global temperature, droughts and extreme precipitation occur frequently, causing serious impacts on national economic development and people's lives, and even threatening the safety of human lives and properties. In July 2021, Henan Province of China suffered from a historically rare extraordinarily heavy rainfall. In particular, Zhengzhou City, the capital of Henan Province, reached a maximum hourly difference of 201.9 mm on July 20, breaking the historical extremes of hourly rainfall in mainland China and far exceeding the flood control and drainage capacity of urban and rural areas. Large areas of the city's municipalities and rural areas were flooded, with serious flooding in urban streets and depressions, short-lived flooding in rivers and reservoirs, and massive congestion in streams and ditches in hilly areas, forming a particularly significant natural disaster. Therefore, accurate prediction of regional precipitation is of great importance for flood and drought emergency management and agricultural production planning. For China, with economic development and a growing population base, the demand for household and commercial water is increasing and the water demand crisis is becoming more pronounced. Water resources management involves the planning, development and allocation of water resources and is directly related to precipitation prediction. However, due to the diversity of geographical conditions and the variability and complexity of climatic conditions, precipitation is a stochastic system full of uncertainties and fluctuations, so accurate precipitation prediction is a challenging task. This study attempts to predict regional precipitation using historical hydrological data and meteorological information. Accurate prediction of regional precipitation can provide a basis for decision making for rational planning and effective allocation of water resources. Especially in densely populated large cities, accurate precipitation prediction and timely provision of rainstorm warnings allow residents to be prepared in advance, which can largely reduce casualties and property damage. Therefore, this study has great practical significance and application value.

## RESEARCH PROGRESS

For a long time, precipitation prediction has been a hot topic of research by scholars in various countries. Scholars have proposed various prediction methods and models from different research perspectives. In summary, precipitation prediction models can be divided into three categories: causal prediction methods (Auerbach *et al.* 2016; Deng *et al.* 2020; Wang *et al.* 2021a; Wu & Wang 2022a), statistical prediction methods (Velthoen *et al.* 2019), and artificial intelligence methods (Wang *et al.* 2021b; Deng *et al.* 2022a, 2022b; Li *et al.* 2022; Wu *et al.* 2022).

Causal prediction methods are based on the study of the precipitation formation process, identification of the main influencing factors (e.g., temperature, humidity, pressure, wind, etc.), and continuous tracking of the influence of these factors on precipitation formation, followed by the construction of a precipitation prediction model. Surussavadee (2011) used the new generation NCAR/Penn State spatial model to predict 79 storms in a year over Thailand and nearby areas with a resolution of 5 km. The results show that MM5 has good prediction accuracy for ground precipitation rates, maximum vertical winds, and water paths of rain, snow, aragonite, cloud liquid water, and cloud ice, and that it can offer reaching its goal estimations for tropical storm warnings especially within 8 h. Nageswararao *et al.* (2016) developed a model to predict seasonal precipitation in the NWI using empirical data. They established a link between NWI winter rainfall and the overall downwards marine heat transfer at the worldwide ocean surface, thus identifying 15 regions with significant correlation at 90% confidence level during winter. Kisi & Sanikhani (2016) utilized longitude, latitude, period component, and altitude data from 50 locations in Iran as input to the application model and used four different computational methods, namely Adaptive Neuro-Fuzzy Inference System (ANFIS), Subtractive Clustering ANFIS (SC), Artificial Neural Network (ANN), and Support Vector Regression (SVR) to achieve the ability to accurately predict long-term monthly rainfall in the absence of climate data. The comparison results show that the ANFIS-GP model is more successful in predicting long-term monthly precipitation in the region. Zhao *et al.* (2020) demonstrated the inadequacy of using the zenith total delay (ZTD) or precipitable water volume (PWV) of GNSS inversion as a single factor to predict precipitation, and instead proposed an improved rainfall prediction model (IRFM) to do so. The IRFM takes five predictors into account: seasonal PWV/ZTD variations, monthly PWV values, and their first derivatives. The study used five GNSS stations in the Continuous Operating Reference System Platform in Zhejiang Province of China, to conduct a test experiment. The analysis shows that IRFM with five predictors outperforms IRFM with only one predictor or any combination of predictors. In recent years, with the increasing influence of human activities on climate in various regions, the difficulty to identify the factor affecting precipitation has gradually increased. The accuracy of predicting medium- and long-term precipitation by the genesis model is difficult to achieve satisfactory results.

The statistical forecasting method uses a series of statistical tools to analyze nonlinear time series based on historical precipitation data to construct corresponding statistical models for regression forecasting. Hu *et al.* (2018) established a statistical forecasting model for flood precipitation in the Yangtze River's middle and lower reaches using the time-scale decomposition method, and selected predictors for different time-scale components using correlation analysis. Finally, the statistical forecasting model was built using the multivariate linear regression technique. Simulation results show that the statistical prediction model developed using the time-scale decomposition technique has a high utility in precipitation prediction. Javari (2017) used a combined approach of the autoregressive conditional heteroskedasticity (ARCH) family model and the autoregressive integrated moving average (ARIMA) model to predict monthly and annual rainfall series with 140 locations of Iran from 1975 to 2014. The presence of stochastic and nonstochastic phenomena in the rainfall time series was also verified. Ruan *et al.* (2015) developed a regression statistical downscaling model for sea level pressure (SNAO) in Western Europe and sea level temperature (WPT) in the Western Pacific, which showed good performance in fitting the late winter precipitation variability over the whole Southwest Pacific and at each observatory. The model also shows strong robustness in independent validation. Imon *et al.* (2012) used 14 years of daily rainfall data to build a logistic regression model. Then, the model was cross-validated using 2 years of observed daily rainfall data as future data. The results of the study clearly show that the logistic regression model can predict rainfall very effectively if appropriate rainfall predictors are selected. Sintayehu (2018) used statistical method simulations to predict the annual rainfall (1954–2015) time series for the town of Debre Markos, Ethiopia. The empirical results indicate that the ARMA (2,1) model is considered to be a better representation of the annual rainfall values in the region and can accurately predict the values for the next 5 years.

Statistical regression prediction methods give specific functional relationships which means that the process of precipitation change is developed toward regularization and formula. However, the evolution trend of precipitation is often complex and dynamic, showing systematic nonlinearity and nonstationarity, so relying only on the raw data series to predict the future dynamics of precipitation makes the prediction accuracy generally not high. Due to the complexity of the drivers of precipitation variability, it is still difficult to find suitable climate factors or their combinations that characterize precipitation variability. In comparison, the machine learning prediction approach relies on a nonlinear activation function of the decision variables that is specific to the training instances and the target. This approach not only overcomes the shortcomings of function block expressions but also allows for the selection of a relatively optimal network for predicting target values via the learning and training process (Li *et al.* 2021). Currently, nonlinear intelligent computing technology approach, such as neural networks, genetic algorithms, and deep learning, is used in atmospheric disciplines to achieve good applications and research results in modeling studies of monthly and seasonal precipitation forecasts (Zhu & Wu 2013; Salih *et al.* 2020; Sun *et al.* 2021; Ksmhi *et al.* 2022).

In an effort to validate the short-term climate prediction of monthly average rainfall in Guangxi, China, Jin *et al.* (2015) created a model for nonlinear statistical ensemble forecasting using neural networks and particle swarm optimization (PSO) methods. The results demonstrate that the integrated prediction model of PSO neural network can be a better tool for making short-term climate predictions than the traditional linear statistical method, which is a strong indicator of the feasibility of such a model.

Ganti's study used a feed-forward multilayer perceptron-type neural network structure that was trained through BP (Ganti 2014). In this study, the data for north-central India include monthly average rainfall and monthly average daily maximum temperature. The prediction performance of the neural network model was evaluated using a variety of conventional statistical parameters and scatter plots, and the algorithm properly forecasted monsoon rainfall.

Based upon previous monthly rainfall and climate indices, He *et al.* (2015) combined the combined mutual information (MI), multi-resolution analysis (MRA), and PSO into a hybrid wavelet neural network model (HWNN) to predict monthly rainfall. Results from empirical studies show that the HWNN model is more accurate at predicting precipitation for Australia than the reference model, and the improvement is significantly greater for stations located in the southeastern part of Australia's inland regions and in Western Australia.

Annas & Arisandi (2017) applied two methods, multivariate transfer function and artificial elastic backpropagation neural network (RProp), to rainfall forecasting in Makassar district, South Sulawesi province, Indonesia. First, the multivariate transfer function is used to predict the rainfall caused by several variables such as air humidity, air temperature, air pressure, and wind speed, and then these variables are used as the input values of the RRrop method for training and prediction. The error between the actual data and the predicted data obtained by this method is substantially reduced.

Yang *et al.* (2018) proposed a method for forecasting typhoon rainfall using a support vector machine (SVM) and a fast elite nondominated Sorting genetic algorithm II (NSGA-II). A rainfall forecasting model for 1–6 h advance forecasting is created using a SVM and four alternative combinations of input parameters (i.e., prior rainfall, typhoon characteristics, and local meteorological conditions). The suggested forecaster selection method's superiority is demonstrated by utilizing three rainfall stations in northeastern Taiwan's Yilan River basin.

Wu & Xie (2019) proposed a parallel collaborative support vector regression evolutionary algorithm (SVRGAPSO) based on a Genetic Algorithm (GA) and PSO hybrid. An evolutionary algorithm that utilizes a parallel process iterates two GA or PSO populations concurrently and utilizes information exchanges between the two populations to overcome the local optimum's early termination. The empirical results also reveal that the SVRGAPSO algorithm is capable of better generalizing precipitation forecasts with minimal forecast errors. Improved rainfall forecast accuracy can thus be achieved.

Zarei *et al.* (2021) derived ensemble precipitation forecasts from six numerical models included in the THORPEX Interactive Grand Global Ensembles (TIGGE) database and corrected for bias using quantile mapping and random forest (RF) methods. The increased performance of the result-validating models demonstrates the method's utility in actual applications.

Aksoy & Dahamsheh (2018) presented feed-forward-back propagation algorithms using radial basis functions and generalized regression artificial neural networks (ANNs) coupled with Markov chains (MCs) that could precisely predict the percentage of drought months, removing any nonphysical factors that produce negative precipitation and improving the ability to predict monthly precipitation in drought-affected areas.

In recent years, wavelet neural networks, which combine wavelet analysis and neural networks, have become more popular for predicting precipitation. Wu *et al.* (2021b) introduced a new hybrid approach of monthly precipitation time series wavelet ARIMA LSTM (W-AL). The results indicate that W-AL forecast accuracy is superior to ARIMA and LSTM in monthly precipitation prediction.

ANNs are used to forecast rainfall mainly considering the influence of two types of factors, one is to use temperature, pressure, wind speed, humidity, and other related meteorological data for prediction, and the other is to use the rainfall in previous years to predict the rainfall in future years. However, the randomness and complexity of medium- and long-term precipitation can greatly reduce the convergence speed and prediction accuracy of neural networks, and even cause nonconvergence, which leads to unstable prediction results and poor prediction accuracy. Therefore, the single machine learning prediction methods have certain limitations, and they are difficult to make accurate predictions for complex nonlinear precipitation sequences. Additionally, as artificial intelligence and deep learning continue to advance, BP artificial neural networks are increasingly being applied in the field of precipitation prediction due to their superior nonlinear mapping and self-adaptive capability (Kumar *et al.* 2019; Liu *et al.* 2019; Wang *et al.* 2019, 2022; Wu & Wang 2022b). However, because the initial weights and thresholds of BP neural networks are typically determined by human subjective awareness and experience, this has a direct effect on the convergence effect of BP neural networks, hence reducing precipitation prediction accuracy. Therefore, a variety of methods for optimizing the parameters of BP neural networks have been derived, and the common ones are GA, PSO, and so on. However, the GA has the defects of more complicated coding and slower solution speed; the PSO is easily taken down to the local minimum, and the search accuracy is not high. The Sparrow Search Algorithm (SSA) proposed in 2020 is a population intelligence algorithm using the foraging behavior of sparrows, which mainly has the advantages of jumping out of local minima, avoiding local optimum, and setting fewer parameters.

In this study, an integrated research approach was used to develop a BP neural network model for predicting precipitation in the current period using current period meteorological data and precipitation in the same period of the previous 2 years as input variables. Due to the more comprehensive and highly correlated factors considered, there will be a better improvement on the accuracy of the prediction. Then, by determining the ideal weights and thresholds for the neural network, SSA is utilized to improve the algorithm's convergence efficiency. Furthermore, the Markov prediction method can effectively predict the data with high volatility, so the model residuals are corrected by Marks, and the final combined model will have higher prediction accuracy. The advantages of the combined model are including:

- (1)
The decision factors are more comprehensive, including both meteorological data information and rainfall data of previous years. The comprehensiveness of the data can guarantee the accuracy of the calculation results.

- (2)
The SSA method optimizes the parameters of the BP neural network, resulting in outputs that are more comparable with the real values.

- (3)
Markov model is applied to analyze the residuals, which makes the fitting effect more outstanding.

## MODELS AND METHODS

### BP neural network

BP neural network, which is one of the most widely adopted neural network architectures, is a multilayer feed-forward neural network trained using a nonlinear dynamic error back propagation algorithm. A BP neural network is typically composed of three layers: an input layer (multiple input values), an implicit layer, and an output layer (multiple output values), where the number of layers and the number of neurons in the implicit layer can be chosen according to the problem. The structure is shown in Figure 1.

Where denotes the input of the *i*-th neuron in the input layer, ; denotes the weight from the *i*-th neuron in the input layer to the *j*-th neuron in the hidden layer. denotes the bias of the *j*-th neuron in the hidden layer; denotes the activation function of the hidden layer; denotes the weight between the *j*-th neuron in the hidden layer and the *k*-th neuron in the output layer; denotes the bias of the *k*-th neuron in the output layer, . denotes the output layer activation function. The output of the *k*-th neuron of the output layer is denoted by *O _{k}*.

With a BP neural network, the input layer accepts inputs from the outside world, and the output layer produces results after the network has been processed. The network contains three kinds of operational processes: the forward propagation process from the input layer to the output layer via the implicit layer; the reverse correction process from the output layer to the input layer for the error between the predicted output; and the actual value of the network and the training process of alternating the forward and reverse processes. By continuously calculating the changes in the network weights and deviations along the gradient of the relative error function, this algorithm approaches the target gradually.

The BP neural network algorithm steps are as follows.

- (1)
Input the pre-processed sample data into the neural network as

*x*_{1},*x*_{2}, …,*x*._{n} - (2)
- (3)
- (4)
- (5)
- (6)Back propagation of error. In order to achieve optimal performance, the output error of neurons in each layer is calculated layer by layer starting from the output layer, and the weights and biases of each layer are adjusted so that the resulting output of the modified neural network is in close proximity to the desired value. For the overall
*N*samples, the error function*E*(*w*) is defined as:where*Y*denotes the expected value of the_{p}*p*-th sample, denotes the actual output value of the*p*-th sample,*w*denotes the vector of biases and weights of the network, and*e*(_{p}*w*) denotes the error. - (7)
Determine whether

*E*(*w*) is less than the error*ε*, and end the training if the condition is satisfied. If not, evaluate whether the needed number of training sessions has been met and, if so, terminate the training; if not, return to step (2) and repeat the experiment until the condition is met.

BP neural networks are commonly used in the areas of signal recognition and system optimization due to their strong features of self-learning, self-adaptation, and fault-tolerant design. However, it is also a fact that the traditional BP network displays the defects of slow convergence, achieving a local optimum, and the inability to conduct global search, which leads to the prediction accuracy of BP neural networks not meeting the demands in prediction applications.

### Sparrow optimization algorithm

SSA is a novel population intelligence optimization algorithm proposed by Xue & Shen (2020), which is mainly based on the population foraging behavior and anti-predator behavior of sparrows. The sparrow population can be classified into two categories based on its foraging behavior: discoverers (explorers) and joiners (followers). The proportion of both in the group is fixed and the roles of discoverers and joiners can be switched between each other. Typically, discoverers are the healthier individuals in the population and take on the task of exploring food sources for the population members. The joiner will closely monitor the finder and actively grab food to increase its health value. When a natural predator is found, an alarm signal is used to alert the colony members to avoid danger. When the alarm value is greater than the safety threshold, the population will migrate to other locations. Furthermore, the discoverer is responsible for providing foraging directions for the whole flock, as well as finding food for the entire population. Thus, discoverers are able to access a greater range of foraging options than joiners.

- (1)During the iteration process, the discoverers’ positions are updated in accordance with Equation (6)where
*t*denotes the population's current number of iterations and iter_{max}denotes the population's maximum number of iterations. denotes the information of the*j*-th dimension of the*i*-th sparrow during the*t*-th generation iteration. is a random number, and .*Q*is a random number that follows a normal distribution.*L*is a dimensional matrix with all elements of the matrix being 1. ST and*R*_{2}are the safety and warning values, respectively, and , . indicates that the environment is safe at this time and the finders can explore their surroundings. indicates that there are predators around the environment, some of the finders send alarm messages, and all population members move to a safe area. - (2)In the process of searching for food, joiners can always find a finder with better foraging environment and then get food from it or forage around it. Some other joiners will keep monitoring the discoverer and then go to grab food in order to obtain food more easily and increase their predation rate. If it wins, its position is iterated as in Equation (2); if it loses, the worse the foraging position it is in. So in order to get a better foraging position, some followers may leave and fly to other places. The position update of joiners (followers) is described in Equation (7) as follows.where represents the optimal position occupied by the current searcher; represents the worst position in the foraging area.
*A*is a dimensional matrix where every element value is a random 1 or −1 and . If , it indicates that the*i*-th follower having lower fitness value did not get a better foraging position and was hungry and needed to move to other areas to get more energy. - (3)When an individual in the sparrow population detects danger, it alerts. If the alarm value exceeds the safety value, the explorer leads its following to a safe area for foraging. In this case, the sparrow population will engage in anti-predatory behavior. Individuals at the edge of the sparrow population will move to a safe location, and individuals in the center of the sparrow population will follow and move closer to other individuals. The equation for updating the hazard warning mechanism is shown as Equation (8).where denotes the optimal location of the foraging area. The step control parameter is represented by , which is a normally distributed random number with a mean of 0 and variance of 1. is a random number that represents the direction of sparrow movement and is also a step control parameter;
*f*_{i}represents the current individual sparrow fitness value; variables*f*_{g}and*f*_{w}denote the global best and worst fitness values, respectively. When*f*_{i}>*f*_{g}, sparrows are on the periphery of the population and therefore more exposed to predators. When*f*_{i}=*f*_{g}, the sparrow in the center of the population detects danger and needs to migrate closer to other sparrows to avoid being predated.

### Markov residual correction model

The Markov model is used the theory of stochastic processes and provides a new method of predicting sequences with large fluctuations. The model is a mathematical technique for forecasting the future state of things by examining the probability of their beginning states as well as their transfer probabilities. The transfer probability is a measure of the amount of effect random factors have on the prediction sequence, accounting for the mistake of increasing the prediction of random variables when the prediction sequence satisfies the smooth Markov chain's equilibrium characteristics.

*x*denotes the state probability vector at the initial moment,

_{k}*x*represents the state probability vector after

_{k+n}*n*moments, and

*P*represents the probability matrix associated with a one-step state transfer. The Markov model is composed of the following methodological steps.

- (1)
Calculate the deviation of the original model prediction data from the real data. According to the deviation sample mean-squared deviation grouping method, the grading criteria are established so as to determine the state space of Markov chain. Assuming that

*m*possible states, i.e.,*E*_{1},*E*_{2}, …,*E*, are obtained, the_{m}*i*-th state interval*E*= [_{i}*L*,_{i}*R*] (_{i}*i*= 1, 2, …,*m*), where*L*and_{i}*R*denote the upper and lower limitations of the state, respectively._{i} - (2)Test for Markov. The test statistic is calculated by Equation (10).

From probability theory, it is known that the statistic obeys the distribution with degree of freedom . If the statistic is considered to have Markov for the series, only then can the Markov model be used for residual series correction prediction.

- (3)Calculate the autocorrelation coefficients of each order of the serial index values as Equation (11).

- (4)
Construction of the state transfer matrix. The probability of transferring from

*E*to_{i}*E*after one step is the state transfer probability_{i}*P*. In general, the state transfer probability is derived by the principle that frequency approximation equals probability, i.e., , where_{ij}*M*is the total number of times state_{i}*E*occurs, and_{i}*M*means the number of times which_{ij}*E*is transferred to state_{i}*E*. The transfer probability matrix_{j}*P*=_{k}*P*_{1}of^{k}*k*steps is constructed based on the state values of the data. - (5)
Using the state transfer probability matrix to combine the random variables from previous periods, it can calculate the prediction probability of the current state by weighting the weighted sums of each prediction probability in the same state. It is then decided which state has the greatest probability to become its final state interval.

- (6)
Calculate the Markov prediction value. Initially, the residual prediction value of the original prediction is , and the interval of variation is in [

*L*,_{k}*R*]. A middle value from this interval could be used to determine the final result of the model, i.e., the final prediction value is ._{k}

### Algorithm flowchart

The BP neural network model's initial weights and thresholds are chosen at random. The entire model will alter the weights continuously during the error back propagation process to discover the ideal weights and thresholds, but because it is simple to fall into the local optimum, the initial weights and thresholds have a considerable effect on the overall accuracy. In order to avoid the BP neural network model from approaching a local optimal, the SSA algorithm is used to optimize the model. The advantage of the SSA algorithm is that it can maintain high performance in different search spaces, thus effectively avoiding the local optimum problem. The SSA algorithm utilizes the weights and thresholds of the BP neural network and assigns the best result to the BP neural network model in order to optimize the BP neural network model to produce the SSA-BP model. Thereafter, the residual series of SSA-BP prediction values are predicted using Markov model, and finally, the two prediction values are summed to obtain the final precipitation prediction value. The algorithm's flowchart is depicted in Figure 2, and the key steps are as follows.

- (1)
Data pre-processing, including interpolation of missing values and normalization of data.

- (2)
Determine the BP neural network topology. This contains the number of neurons and layers in the input layer, output layer, and hidden layer, and determines the essential parameters.

- (3)
Initialize the main parameters of the SSA algorithm, including the number of populations, the maximum number of iterations, the alarm threshold, the safety threshold, etc.

- (4)
Use the initial weights and thresholds of the BP as the search targets of the SSA algorithm, execute the SSA algorithm, and calculate the fitness values to determine the optimal and global worst positions of the explorer.

- (5)
The position of the sparrow is updated using Equations (6)–(8).

- (6)
Calculate the fitness value again using the revised sparrow position, and then update the explorer's best and worst global positions.

- (7)
Repeat steps (5) and (6) until the halting condition is met, and then use the result as the initial weights and thresholds for the BP neural network.

- (8)
Train the BP neural network with the determined topology, continue to adjust the network parameters using error back propagation, iterate the training until a preset error accuracy or number of iterations is reached, and output the predicted values.

- (9)
Establish the residual series using the errors of the real and predicted values of the training set, divide the state spaces of different MCs according to the criteria, and check whether they satisfy the Markov.

- (10)
Calculate the autocorrelation coefficients of each order of the residual series index values, and normalize them to obtain the corresponding Markov chain weights.

- (11)
Construct the residual state transfer matrix and calculate the corresponding state transfer probabilities to calculate the predicted values of the residual series.

- (12)
The predicted values obtained from SSA-BP and the Markov chain prediction residuals are summed to obtain the final predicted values of precipitation.

In the above equations, *N* denotes the total data volume, denotes the true value, and denotes the predicted value.

## MODEL VALIDATION CALCULATIONS

### Study area overview and data sources

The city of Zhengzhou is the capital of Henan Province, China. It is situated in central China, in the lower Yellow River valley, within the Central Plains’ hinterland, and northeast of the province's center. In the west, there is a high terrain, while in the east, a low terrain prevails. This region has a north temperate continental monsoon climate with four distinct seasons. A long and dry winter occurs with little precipitation and snow; a dry spring follows with little precipitation and many spring droughts, with cold and warm winds in abundance; a hot summer occurs with a high concentration of precipitation; a cool autumn occurs. Annually, the temperature averages 15.6 °C, sunshine hours average 1,869.7 h, and rainfall is an average of 542.15 mm. Zhengzhou is a significant core city and megacity in central China, as well as a vital national integrated transport network. By 2021, the metropolis will consist of six districts, one county, and five county-level cities, covering an area of 7,567 square kilometers and housing 12 million people. The per capita water resources are 179 cubic meters, which is a serious water shortage area. The geographical area map of Zhengzhou City is shown in Figure 3.

In this paper, 764 sets of monthly precipitation data from October 1957 to May 2021 in Zhengzhou City are selected as the research object, and the training and test set are assigned according to the ratio of 9:1.

### Data processing

Due to sensor failures and other reasons, some of the original data have missing values and abnormal values. If the original data is directly fed into the model, it will have an effect on the model's forecast accuracy. Considering the small impact of some data on the overall data, these missing values and outliers are interpolated.

*et al.*2021a). After normalization, each variable is in the same range, which is beneficial for the model to discover the relationship between each variable, accelerate the model's convergence, and shorten the training period of the model. The formula for data normalization is shown in Equation (17).

After normalization of the data, the ranges of variation are all located at [0,1].

### SSA-BP-Markov model

The input variables of the BP neural network model include meteorological data including air pressure, wind speed, humidity, temperature, water pressure, sunshine duration, and precipitation for the same period of the previous 2 years, and the output variable is precipitation for the current month. Therefore, the neurons’ number in the input layer of the neural network is 8, the neurons’ number in the output layer is 1, and the neurons’ number *L* in the hidden layer is usually obtained using the empirical formula: , where *m* denotes the neurons’ number in the input layer, *n* denotes the neurons’ number in the output layer, and *l* denotes an arbitrary integer between 1 and 10. So the neurons’ number in the hidden layer can be calculated as an arbitrary integer between 4 and 13. To ensure the best possible training effects of the network, different solutions of the neurons’ number in the hidden layer are compared in order to optimize the number of neurons in the hidden layer. Because the experimental data indicate that the model performs best when *L* = 10, the network topology is set to 8-10-1 (Figure 4).

Sparrow search is used to identify the ideal weights and thresholds for the BP neural network. Based on the number of neurons in the input, implicit, and output layers, a BP neural network can be constructed whose initial weights and thresholds are based on each individual in the population. The initial coding length would then be determined by calculating the neurons’ number in the input, implicit, and output layers. Since the structure of the BP neural network set up in this study is 8-10-1, i.e., there are 8 nodes in the input layer, 10 nodes in the implicit layer, and 1 node in the output layer. Therefore, there are 8*10 + 10*1 = 90 weights (edges) and 10 + 1 = 11 queues, so the initial coding length is 90 + 11 = 101. The sparrow search algorithm's parameters are as follows: The number of populations is 30, the maximum number of iterations is 50, the number of discoverers and scouts is 20% of the total population size, the initial position in the population is produced randomly, and the warning value is *R*_{2} = 0.8. During the iterative process, the individuals with the greatest fitness values would be used as discoverers. These discoverers would have priority when obtaining food during the search process and provide directions to the entire population. The discoverer position is updated using Equation (6) according to the warning value *R*_{2} and the safety value ST. Joiners perform surveillance and compete with discoverers, using Equation (7) to update the position. *f*_{i} is the fitness value of the current individual sparrow and *f*_{g} is the optimal fitness value of the current sparrow population. Based on the comparison result of *f*_{i} and *f*_{g}, the position of the early warning person is updated using Equation (8). The initial weight threshold is used to train the neural network, and the absolute value of the error between the predicted and actual output values is summed up as the fitness function, and the smaller the fitness value is, the smaller the error is. The activation function of the BP neural network is *Sigmoid* function, and the training method is *Trainlm*. The number of training sessions is set to 1,000, the learning rate is 0.01, and the minimum error of the training target is 0.0001.

The change of fitness during the optimization of the sparrow optimization algorithm is shown in Figure 5. After 20 iterations of updates, the overall decrease in the fitness value is obvious, indicating that the SSA plays a significant optimization role. In particular, when the number of iterations reaches 23, a very small fitness value can be reached, indicating that the SSA can find the initial optimal weights and thresholds of the neural network with only a small cost, laying a good foundation for the subsequent training and testing of the BP neural network. In order to compare the optimization search efficiency of different algorithms, we also select the change curve of the adaptation degree after optimization of GA and PSO algorithms. From Figure 5, we can find that SSA has higher convergence efficiency and merit-seeking effect compared with other optimization algorithms. The weights and thresholds optimized by SSA are input to the BP neural network, and the training set data are used for continuous training and reverse iterative correction to obtain the initial prediction value of precipitation in Zhengzhou City.

*F*

_{1}= (

*f*) is obtained, and the element

_{ij}*f*indicates the frequency (i.e., number of times) of shifting from state

_{ij}*i*to state

*j*in the state sequence.

From Equation (10), the statistic is 257.4587. Given a significance level , . Since , the residual series satisfies martingale.

The autocorrelation coefficients and weights for each order of the residual series computed from Equations (11) and (12) are shown in Table 1.

. | . | . | . | . |
---|---|---|---|---|

−0.0308 | −0.0245 | 0.0254 | 0.0084 | −0.0078 |

0.3176 | 0.2525 | 0.2623 | 0.0868 | 0.0808 |

. | . | . | . | . |
---|---|---|---|---|

−0.0308 | −0.0245 | 0.0254 | 0.0084 | −0.0078 |

0.3176 | 0.2525 | 0.2623 | 0.0868 | 0.0808 |

Based on the historical data of the residual series, the prediction of the residual series of the test set is performed by combining the transfer probability matrix and the state division table. The predicted values of the residuals are added with the SSA-BP predicted values to finally obtain the predicted values of the combined model. The comparison of this predicted value and the predicted and true values of other algorithms is shown in Figure 6.

As illustrated in Figure 6, the overall trend of the BP prediction model's predicted values can follow the fluctuation trend of the genuine values to a degree. The overall trend of the PSO-BP prediction model does not differ much from the real value, but there is a certain lag time with the real data, and the overall fluctuation is large. The GA-BP model has individual prediction values with large deviations from the real ones, especially there will be negative predictions that do not match with the real situation. The SSA-BP prediction model has some differences between the middle part of the trend and the actual trend, but the overall fluctuation is small, and the variation between the predicted and real values for each group is small, and the first and last part of the predicted values are extremely near to the real values. The predicted data after Markov correction are closer to the true values in most cases, but the results of the correction are not particularly satisfactory at the extreme points of individual jumps.

The accuracy metrics comparing different algorithms are shown in Table 2. According to the results of several indexes used to evaluate the models’ predictive accuracy, the SSA-BP-Markov model suggested in this study has a greater predictive accuracy. Figure 7 further illustrates the improvement and advantages of the SSA-BP-Markov model presented in this paper over existing models for precipitation prediction. Among them, SSA-BP also performs well in terms of MSE evaluation index, indicating that SSA has a better ability to search to the optimal hyperparameters of BP; however, SSA-SVM indicates that it is not satisfactory, also indicating that there is a great difference between the same optimization-seeking algorithm for different neural network hyperparameters. From the analysis of MAE evaluation index, the performance variation of different algorithms is not obvious, among which SSA-BP-Markov and PSO-BP perform relatively well, indicating that the BP model is more suitable for precipitation prediction. From the analysis of MAPE index, GA-SVM and PSO-SVM models have insufficient prediction ability, while the performance of other models has little difference, which indicates the limitation of SVM model in applying to precipitation prediction in Zhengzhou city. Finally, in the RMSE index, the SSA-SVM model performs as the most inadequate, and the SSA-BP-Markov and SSA-BP perform slightly better. In conclusion, from a comprehensive view of each evaluation index, the accuracy of different models for precipitation prediction in Zhengzhou city is shown as follows: SSA-BP-Markov > PSO-BP > SSA-BP > LISVM > GA-SVM > GA-BP > PSO-SVM > SSA-SVM.

. | MSE . | MAE . | MAPE . | RMSE . |
---|---|---|---|---|

SSA-BP-Markov | 1,161.5279 | 24.9026 | 0.5039 | 34.0812 |

SSA-BP | 1,201.5868 | 26.9177 | 0.5631 | 34.6639 |

GA-BP | 1,987.4317 | 31.8635 | 0.6748 | 44.5806 |

PSO-BP | 1,360.9190 | 25.3282 | 0.5167 | 36.8906 |

LIBSVM | 1,611.8332 | 27.3494 | 0.5477 | 40.1476 |

SSA-SVM | 2,588.9527 | 33.9834 | 0.5630 | 50.8778 |

GA-SVM | 1,415.8061 | 26.9186 | 0.7328 | 37.6272 |

PSO-SVM | 1,891.4545 | 32.5582 | 0.6887 | 43.4909 |

ARIMA | 5,336.478 | 73.0512 | 0.8652 | 50.1663 |

. | MSE . | MAE . | MAPE . | RMSE . |
---|---|---|---|---|

SSA-BP-Markov | 1,161.5279 | 24.9026 | 0.5039 | 34.0812 |

SSA-BP | 1,201.5868 | 26.9177 | 0.5631 | 34.6639 |

GA-BP | 1,987.4317 | 31.8635 | 0.6748 | 44.5806 |

PSO-BP | 1,360.9190 | 25.3282 | 0.5167 | 36.8906 |

LIBSVM | 1,611.8332 | 27.3494 | 0.5477 | 40.1476 |

SSA-SVM | 2,588.9527 | 33.9834 | 0.5630 | 50.8778 |

GA-SVM | 1,415.8061 | 26.9186 | 0.7328 | 37.6272 |

PSO-SVM | 1,891.4545 | 32.5582 | 0.6887 | 43.4909 |

ARIMA | 5,336.478 | 73.0512 | 0.8652 | 50.1663 |

*Note:* The MAPE indicator is calculated to avoid taking infinite values and discarding data where the true precipitation is zero.

## CONCLUSION

The SSA is introduced in this study for optimizing the weights and thresholds of the BP neural network for precipitation prediction in Zhengzhou City based on the initial weights and thresholds that affect the BP neural network's accuracy. The model residuals were then corrected using Markov model to obtain the final prediction of precipitation. The Markov model brings the predicted values of the fluctuating residual series within a reasonable range. Additionally, the empirical results demonstrate that the SSA-BP-Markov model's final forecast accuracy is significantly greater than that of classic GA-BP, PSO-BP, SVM, and other prediction models, indicating that it is accurate and feasible for regional precipitation prediction. Since the input variables considered in the model include both meteorological factors and the same historical data, the comprehensiveness of the influencing factors considered also contributes to the improvement of the prediction accuracy of the model.

With global climate change and changing forms of the water cycle, extreme weather events such as droughts and floods caused by severe uneven precipitation are increasing in frequency and intensity, which puts great pressure on water resources management especially in cities with high population and socio-economic density. On the one hand, precipitation is one of the most important sources of water supply in cities, and cities need to collect and effectively use it. On the other hand, cities also need to always deal with the damage caused by extreme precipitation to urban order. Therefore, accurate precipitation forecasting can provide effective data support for city managers to make targeted and forward-looking water resources planning plans to mitigate urban water crisis and cope with the harmful effects of climate change on urban management.

However, precipitation processes are complex dynamic systems, and there are many unknown scenarios for the analysis of their derived changes (Fernández-Alvarez *et al.* 2020). For this reason, the selection of variables in the model needs further discussion. In addition, with the development of deep learning, some newer network models applied to precipitation prediction is one of the next research directions. We also believe that with the deeper understanding of precipitation processes and the emergence of new models, the prediction of precipitation will also become more accurate in the future, thus better guiding water resources planning and preventing floods for the benefit of human society.

## ACKNOWLEDGEMENTS

This work was supported by the Open Research Fund of State Key Laboratory of Simulation and Regulation of Water Cycle in River Basin, China Institute of Water Resources and Hydropower Research (grant No. IWHR-SKL-201905).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICTS OF INTEREST STATEMENT

The authors declare there is no conflict.

## REFERENCES

**25**(4), 510–522.