## Abstract

An algorithm named the long short-term memory (LSTM)-logistic chaos mapping chicken swarm algorithm (LCCSA) is proposed, in which the weights and thresholds of an LSTM neural network are initialized using a chicken swarm algorithm (CSA) seeded by logistic chaotic mapping. The algorithm aims to improve mid- to long-term runoff sequence prediction in river basins. In this model, the logistic chaotic mapping method initializes the chicken swarm, and LCCSA pre-trains the weights and thresholds of each LSTM layer 50 times; the trained values are then used as the initial weights of the LSTM to enhance convergence accuracy and speed. Taking the Manas and Kuitun rivers, two typical basins in northern Xinjiang, China, as research objects, LSTM-LCCSA was used to forecast mean monthly runoff in the mid- to long-term under different lag time series, using runoff evolution data within a certain period. The case study of these basins in northern Xinjiang demonstrates the effectiveness, stability, and generality of the LSTM-LCCSA method in mid- to long-term prediction of average monthly runoff, and its prediction accuracy and universality exceed those of other data-driven models.

## HIGHLIGHTS

Proposed an LSTM-LCCSA algorithm to address mid- to long-term runoff sequence prediction in river basins.

Initialization of the chicken swarm using a logistic chaotic mapping method improved population diversity and avoided CSA falling into a local extreme value.

Data-driven models (CM-SVM, GA-SAA, MRA, and GA-FFNN) were intercompared with respect to prediction accuracy and stability.

## INTRODUCTION

A runoff sequence is a type of high-dimensional nonlinear system that is influenced by many factors. Since the 20th century, river runoff has changed owing to the influence of urbanization development and climate change, resulting in increased uncertainty in predicting runoff (Chau & Jiang 2002). High-precision river runoff prediction can provide valuable hydrological information for downstream reservoir discharge and flood prediction (He *et al.* 2011; Valipour 2015; Jiang *et al.* 2018). In recent years, numerous scholars have used data-driven models to predict mid- to long-term runoff time series, and many notable research results have been obtained (Liu *et al.* 2019; Ren *et al.* 2022).

For runoff series prediction, the prediction accuracy declines as the prediction period is extended. Extending the prediction period while improving prediction accuracy is the primary focus of contemporary mid- to long-term runoff prediction research. In a study on extending the prediction period, Abrahart & See (2000) discussed the use of artificial neural networks (ANN) to extend the prediction period very early. As machine learning and deep learning theory have improved, the prediction capability of neural network models in the mid- to long-term prediction of hydrology has improved as well. For prediction using machine learning and deep learning technology, the support vector machine (SVM) has been used by many scholars in flood prediction (Bafitlhile & Li 2019), monthly average runoff, and inflow prediction of dam reservoirs (Kisi 2008; Awchi 2014; Ghorbani *et al.* 2016; Karami *et al.* 2017; Babaei *et al.* 2019). These studies have demonstrated that SVM exhibits strong capabilities regarding runoff prediction. Furthermore, a variety of neural network models have been used to conduct comparative studies of daily runoff prediction (Kisi 2009a, 2009b, 2009c; Nacar *et al.* 2018; Modaresi *et al.* 2018), and these studies have demonstrated that various neural network models achieve different prediction results owing to different fitting capabilities. In general, neural network models fit nonlinear problems well. In addition, Bayesian regression (BR) and the adaptive network-based fuzzy inference system (ANFIS) have been used for runoff prediction (Hamaamin *et al.* 2016), and the prediction results indicate that the performances of BR and ANFIS are satisfactory at the global level; their reported Nash–Sutcliffe efficiencies are 0.99 and 0.97, respectively.

To improve prediction accuracy after extending the prediction period, Delafrouz *et al.* (2018) proposed a new hybrid neural network based on a phase space reconstruction technique (GEPNN) to predict daily runoff in three river basins in the USA: the Mississippi, Pearl, and Muskingum. By combining phase space reconstruction with a hybrid neural network model, the daily runoff at measurement stations on the three rivers was predicted. Hybrid neural networks are considered to have higher predictive power than pure artificial neural networks; for example, a phase space hybrid model based on chaotic dynamics can yield better prediction results than an ANN. Heuristic algorithms have been evaluated to analyze the runoff evolution law of uncertain time series in river basins, providing new ideas for treating non-stationary processes in hydrological series (Venzon *et al.* 2018; Wang *et al.* 2018; Yang *et al.* 2018). Using rainfall and climate information as input, two stations in the Dongjiang River Basin in southern China were selected as research objects, and a wavelet neural network and support vector hybrid model (BWS) was used to predict river runoff (Liu *et al.* 2015). The results indicate that BWS can provide detailed information regarding prediction uncertainty, offering further inspiration for improving the accuracy of extreme runoff evolution prediction. To analyze non-stationary hydrological series, Latifoglu *et al.* (2015) combined spectral analysis and an ANN, achieving better prediction results than a single algorithm and demonstrating the importance of hybrid models in predicting non-stationary series. Although these models improved prediction accuracy to a certain extent, the increasing complexity of human activities has made runoff series ever more non-stationary and random. As the forecast cycle of a real system lengthens, models must adapt skilfully to changes in runoff; failure to do so results in decreased predictive accuracy. Human activity, shifting climate patterns, and changing terrain all directly affect runoff, rendering the forecast still more complex. Hence, enhancing forecast precision while broadening the prediction period is significant for runoff prediction.

Therefore, fully analyzing the relationships among runoff timing, climatic factors, topography, and other factors in a basin, and revealing the evolution rules of the series, is important for extending the prediction period and increasing prediction accuracy. In this paper, we used the logistic chaotic mapping chicken swarm algorithm (CSA) to pre-train the initial weights and thresholds of LSTM and propose the LSTM-logistic chaos mapping chicken swarm algorithm (LCCSA). To the best of our knowledge, this is the first study to apply the LSTM-LCCSA algorithm to mid- to long-term prediction of hydrological runoff series. The prediction results for the hydrological sequences of river basins in northern Xinjiang verify the effectiveness, stability, and generality of the LSTM-LCCSA algorithm in predicting mid- to long-term runoff series.

## PRELIMINARIES

### Study data

The Manas and Kuitun river basins are typical basins in northern Xinjiang, China. The Kuitun river basin is located between N43°30′–45°04′ and E83°22′–85°47′. The main stream has a total length of 360 km and is formed by the confluence of three rivers. The basin area spans 2.83 × 10^{4} km^{2}, and the annual average temperature is 7 °C. The average temperature is −16 °C in January and 26 °C in July. The spatial and temporal distribution of annual precipitation is uneven.

### Problem discussion

Through observation of the basin's runoff process, *n* groups of ordered sequences with time *t* as the parameter are obtained. The sequences consist of the monthly runoff of Kenswat station from 1955 to 2011, and the monthly high and low extreme runoffs were also evaluated. A total of 672 months of data were used for model training. In the mid-term prediction, the runoff of 12 months was predicted: 80% of the data were used as the training set, 18.2% as the verification set, and 1.8% as the prediction set. In the long-term forecast, we predicted the runoff of 24 months, using 80% of the data as the training set, 16.4% as the verification set, and 3.6% as the prediction set. In recent years, owing to the strong influence of climate change and human activities, the trend and transient components of the runoff time series have exhibited strong non-stationarity and randomness. The non-stationary runoff series can be simplified as X_{t} = u(t) + v(t)ε_{t}, where u(t) and v(t) are deterministic functions of time *t* representing the mathematical expectation and the variance of the change in X_{t} over time, and ε_{t} is a stationary stochastic process. Therefore, the analysis of a runoff time series is a process of extracting the deterministic functions u(t) and v(t) and processing the random component ε_{t}. According to the rules of monthly runoff and extreme runoff evolution, the physical causes of runoff series formation are inferred and possible future runoff values are predicted, especially in the medium and long terms. Because the runoff process is a complex non-stationary random process, fitting its nonlinear behavior well is critical for the model. The general form of a hydrological time series model is X_{t} = f(t) + ε_{t}, where ε_{t} is the random disturbance term in the model; if the random disturbance term is white noise, the model reduces to a deterministic trend plus white noise.
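The decomposition just described can be illustrated with synthetic data: a non-stationary series X_{t} = u(t) + v(t)ε_{t}, with u a deterministic expectation term, v a deterministic scale term, and ε stationary noise. The specific functions below are illustrative stand-ins, not the basin's true components.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(672)                          # 672 months, as in the study period
u = 30 + 0.01 * t                           # deterministic expectation term u(t) (illustrative)
v = 5 + 2 * np.sin(2 * np.pi * t / 12)      # deterministic seasonal scale term v(t) (illustrative)
eps = rng.normal(size=t.size)               # stationary stochastic process eps_t
runoff = u + v * eps                        # non-stationary series X_t = u(t) + v(t) * eps_t
```

Analysis then amounts to estimating u(t) and v(t) and modeling the remaining stationary component.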

The main goal is to use a model with strong fitting capability and long short-term memory (LSTM) capability to explore the evolution law of a hydrological sequence and then predict the future runoff.

## STUDY METHOD

### Strategies for LCCSA

The chicken swarm algorithm (CSA) is an intelligent algorithm for solving NP-complete (NP-C) problems, proposed by Meng *et al.* in 2014 and inspired by the hierarchy and intelligent foraging behavior of chickens (Meng *et al.* 2014). The CSA abstracts the hierarchical order and foraging behavior of chicken groups and transforms the solution of a global optimization problem into a process in which *N* individuals, organized into groups each consisting of one rooster and a set of hens and chicks, forage in a bounded search space. In each iteration, the individuals with corresponding roles in a chicken group move according to their corresponding speed and position update modes so that the entire chicken group moves closer to the food. In the iterative process, roosters independently move to positions where the fitness value continuously improves, with positions updated according to Equation (1). Hens can follow the roosters in any subgroup to find food during the foraging period, and they can also steal food found by other chickens; moreover, hens with higher fitness values are more adaptable than hens with low fitness values. Therefore, the position update of hens is relatively complex, and the update strategy is defined by Equation (2). Under this position update strategy, roosters with higher fitness values are more likely to find food. In each subpopulation, all chicks follow their mothers to search for food; in the process of evolution, their positions are constantly updated according to Equation (3). The global optimal solution of the problem corresponds to the chicken with the strongest foraging ability, which is most likely to find food – that is, the target solution. The search method under random initialization is shown in Figure 3(a). As the problem size increases, the CSA is likely to fall into a local extremum because the initial solution set is randomly generated; a good initial solution set selection strategy can effectively prevent this.
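The three role-specific update rules can be sketched as a single iteration. This is a minimal illustrative implementation of the standard chicken-swarm update formulations referenced as Equations (1)–(3); the paper's exact parameter settings are not reproduced, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cso_step(pop, fitness, roles, mothers, fl=0.5, eps=1e-12):
    """One sketched chicken-swarm iteration (roles: 0=rooster, 1=hen, 2=chick).

    pop:     (N, D) positions;  fitness: (N,) values (lower is better);
    mothers: for each chick, the index of the hen it follows.
    """
    new = pop.copy()
    roosters = np.where(roles == 0)[0]
    for i, f in enumerate(fitness):
        if roles[i] == 0:
            # Rooster (Eq. (1)-style): Gaussian perturbation whose variance
            # depends on a randomly chosen peer rooster's fitness.
            k = rng.choice(roosters)
            sigma2 = 1.0 if f <= fitness[k] else np.exp((fitness[k] - f) / (abs(f) + eps))
            new[i] = pop[i] * (1.0 + rng.normal(0.0, np.sqrt(sigma2), pop.shape[1]))
        elif roles[i] == 1:
            # Hen (Eq. (2)-style): follow a rooster, with fitness-weighted step.
            r = rng.choice(roosters)
            s1 = np.exp((f - fitness[r]) / (abs(f) + eps))
            new[i] = pop[i] + s1 * rng.random(pop.shape[1]) * (pop[r] - pop[i])
        else:
            # Chick (Eq. (3)-style): deterministically follow its mother hen.
            new[i] = pop[i] + fl * (pop[mothers[i]] - pop[i])
    return new
```

One such step moves each role toward better regions; iterating it, and reassigning roles periodically by fitness, yields the full search.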

The logistic chaotic map, x_{k+1} = μx_{k}(1 − x_{k}), exhibits different chaotic characteristics under different values of the control parameter μ. To eliminate the influence of the initial value as much as possible, we first let the system iterate *t* times, and then use the resulting chaotic sequence to generate the initial population.
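The chaotic initialization can be sketched as below, assuming μ = 4 (the fully chaotic regime of the logistic map) and a warm-up of discarded iterations; the function name and parameter values are illustrative, not the paper's exact settings.

```python
import numpy as np

def logistic_chaotic_population(pop_size, dim, mu=4.0, warmup=300, seed=0.345):
    """Generate an initial population in [0, 1]^dim via the logistic map
    x_{k+1} = mu * x_k * (1 - x_k).  The first `warmup` iterations are
    discarded to eliminate the influence of the initial value."""
    x = seed
    for _ in range(warmup):              # let the system iterate first
        x = mu * x * (1.0 - x)
    pop = np.empty((pop_size, dim))
    for i in range(pop_size):
        for j in range(dim):
            x = mu * x * (1.0 - x)       # successive chaotic values
            pop[i, j] = x
    return pop
```

The resulting values cover [0, 1] far more evenly than independent uniform draws of a small population, which is what discourages the swarm from starting near a single local extremum.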

In Equation (1), Randn(0, σ²) denotes a Gaussian distribution with mean 0 and variance σ², where σ² is determined by the fitness value of the rooster.

### LSTM dynamic neural network

When a traditional recurrent neural network (RNN) learns gradients, the gradient vanishes or explodes because of the weight association characteristics in the recursive layer, resulting in slow convergence or non-convergence of the entire network, which negatively affects network performance. LSTM adopts constant error carousels (CEC) to combat gradient disappearance (Hochreiter & Schmidhuber 1997), which introduces two problems: input weight conflicts and output weight conflicts. Control gates are therefore added to the hidden-layer neurons in LSTM to handle the different gradient signals. That is, LSTM uses special and complex hidden neurons to improve on traditional RNNs. The introduction of an input gate, forget gate, and output gate not only effectively remembers short-term signals but also provides a strong memory function for the long-term signals in the learning data. Meanwhile, the gradient information of longer-term useless signals can be effectively suppressed by the forget gate. This control method effectively solves the decaying gradient signal problem of the RNN, because the current decision is not allowed to rely on early, stale gradient signals.
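The gate mechanism described above can be sketched as a single LSTM step in NumPy. This is a minimal illustration, not the paper's network; the stacked weight layout and names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x, h_prev, c_prev, W, b):
    """One LSTM step with the three control gates.

    W: (4H, D+H) stacked weights for the input, forget, candidate, and
    output transforms; b: (4H,) stacked biases (layout is illustrative).
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[:H])            # input gate: admit new information
    f = sigmoid(z[H:2 * H])       # forget gate: discard stale signals
    g = np.tanh(z[2 * H:3 * H])   # candidate cell state
    o = sigmoid(z[3 * H:])        # output gate
    c = f * c_prev + i * g        # CEC-style additive cell update
    h = o * np.tanh(c)            # exposed hidden state
    return h, c
```

The additive cell update `c = f * c_prev + i * g` is what carries gradients over long lags; the forget gate `f` lets the network zero out signals that are no longer useful.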

### Strategies for LSTM-LCCSA

#### Parameter calibration

To study the evolution of monthly runoff, the data were divided according to different lag time series. Two sets of time series, *T* and *S*, were selected using a 5-month sliding window as the basic unit. *T* included 136 group samples, *S* contained 171 group samples, and each group contained 12 months of monthly average data. When the prediction period was mid-term (12 months), there were 135 groups of training samples; when it was long-term (24 months), there were 169 groups of training samples. Within the mid- to long-term forecast period, the main problem affecting overall prediction accuracy is poor prediction when large evolutions of high and low runoff occur in particular months, which increases the overall prediction error; the special runoff series must therefore be studied to improve prediction accuracy. Using the sliding window method, a 5-month window was selected as the basic unit to correspond with *T* and *S* on a strict time scale, and the special runoff series was then divided into two groups: a sample set of special series including 136 groups within the mid-term prediction period and a sample set including 171 groups within the long-term prediction period. Under the strict time scales of *T* and *S*, 135 groups of the mid-term training set and 169 groups of the long-term training set were divided.
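The lag-series division can be sketched as follows, assuming a 5-month window and a 12- or 24-month-ahead target. The function name and the exact target alignment are illustrative; the paper's grouping conventions (which yield 136 and 171 groups) may align windows differently.

```python
import numpy as np

def sliding_windows(series, window=5, horizon=12):
    """Divide a monthly runoff series into lagged samples with a
    `window`-month sliding window, each sample paired with the value
    `horizon` months ahead (12 for mid-term, 24 for long-term)."""
    X, y = [], []
    for start in range(len(series) - window - horizon + 1):
        X.append(series[start:start + window])          # 5-month input unit
        y.append(series[start + window + horizon - 1])  # lagged target
    return np.asarray(X), np.asarray(y)
```

With 672 months of data this produces one sample per admissible window position, from which training, verification, and prediction sets are then split chronologically.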

To compare the effectiveness of LSTM-LCCSA in hydrological series processing, several currently popular prediction algorithms were selected for comparison. The chaotic mapping support vector machine (CM-SVM) (Tharwat & Hassanien 2018), the genetic algorithm-simulated annealing algorithm (GA-SAA) (Zhong 2017), the mixture regression algorithm (MRA) (Blanchard 1999), and the genetic algorithm-optimized feedforward neural network (GA-FFNN) (Tsai *et al.* 2006) were used as four data-driven models for sequence prediction. The comparison indexes are as follows: mean value, mean square deviation (*S*), coefficient of variation (*C*_{v}), coefficient of skewness (*C*_{s}), square of the correlation coefficient (*R*^{2}), mean absolute error (MAE), range (*R*_{g}), rank comprehensive score (*RS*), number of parameter calibrations (*N*_{c}), and model training time (*T*_{c}). The calculation method for each index is shown in Table 1.
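Several of the listed indexes can be computed as below. These are the standard statistical definitions, offered as a sketch rather than a reproduction of the paper's Table 1.

```python
import numpy as np

def evaluation_indexes(obs, pred):
    """Standard definitions of several comparison indexes for a
    predicted sequence against an observed one."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mean = pred.mean()
    s = pred.std(ddof=1)                        # mean square deviation S
    cv = s / mean                               # coefficient of variation C_v
    cs = ((pred - mean) ** 3).mean() / s ** 3   # coefficient of skewness C_s
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2      # squared correlation R^2
    mae = np.abs(obs - pred).mean()             # mean absolute error MAE
    rg = pred.max() - pred.min()                # range R_g
    return {"mean": mean, "S": s, "Cv": cv, "Cs": cs,
            "R2": r2, "MAE": mae, "Rg": rg}
```

A perfect prediction gives MAE = 0 and *R*^{2} = 1, while *C*_{v} and *C*_{s} describe the spread and symmetry of the predicted sequence itself.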

#### Data pre-processing

After combining the basic data of impact factors from different sources, considerable autocorrelation and information redundancy were observed among the factors. This could decrease the accuracy of sequence prediction and reduce the robustness of the model, and some noise sources can make the model extremely sensitive to changes in certain variables during prediction. Therefore, correlation analysis and factor analysis (Bradford 1976) were applied to the sequence samples to remove autocorrelations among factors, reduce information redundancy between factor sets, and remove abnormal noise caused by measurement. Data pre-processing is primarily employed to develop a new set of factor patterns.

Through factor analysis pre-processing, the resulting 20-dimensional impact factors are combined into four types of common factor patterns (F1, F2, F3, and F4). The load matrix of the 20 physical factors to the common factor pattern set is rotated by variance maximization, and the results are shown in Table 2. After factor analysis, the original high-dimensional information data are greatly reduced, which in turn reduces the noise caused by collinear redundancy and strong correlations among the original factors. The four common factor pattern sets have certain mathematical statistical characteristics and satisfy the weighted linear combination of physical factors under certain laws, which have strong practical interpretation significance.
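The factor-analysis step can be sketched with scikit-learn. The data below are a synthetic stand-in for the standardized physical factors (the real basin data are not reproduced here), and scikit-learn's `rotation="varimax"` option corresponds to the variance-maximization rotation of the load matrix described above.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical stand-in for 21 standardized physical factors driven by
# four latent patterns (the real observations are not public).
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 4))              # four underlying drivers
loading_true = rng.normal(size=(4, 21))
X = latent @ loading_true + 0.1 * rng.normal(size=(200, 21))

# Extract four common factor patterns (F1-F4) with a varimax rotation,
# mirroring the setup that produces a load matrix like Table 2.
fa = FactorAnalysis(n_components=4, rotation="varimax")
scores = fa.fit_transform(X)                     # factor scores per sample
loadings = fa.components_                        # 4 x 21 rotated load matrix
```

The rotated `loadings` matrix plays the role of Table 2: each row is one common factor pattern, and large entries identify which physical factors dominate it.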

| Factor | F1 | F2 | F3 | F4 |
|---|---|---|---|---|
| X1 | 0.049 | 0.037 | 0.0033 | −0.0001 |
| X2 | 0.012 | 0.021 | −0.0017 | 0.00011 |
| X3 | 0.001 | −0.019 | −0.0021 | −0.0002 |
| X4 | 0.001 | 0.012 | 0.0005 | −0.0001 |
| X5 | 0.0007 | 0.0004 | 0.0002 | 0.00001 |
| X6 | 0.015 | 0.016 | −0.0007 | 0.0001 |
| X7 | 0.0003 | 0.012 | 0.0021 | 0.00001 |
| X8 | 0.0042 | **0.072** | 0.0014 | 0.00001 |
| X9 | −0.005 | −0.001 | 0.0012 | **0.09001** |
| X10 | −0.004 | −0.0001 | 0.019 | 0.00001 |
| X11 | 0.015 | −0.002 | **0.078** | 0.00001 |
| X12 | 0.031 | −0.0001 | 0.00002 | 0.00011 |
| X13 | 0.059 | 0.0002 | 0.023 | 0.00024 |
| X14 | 0.025 | 0.0002 | 0.00001 | 0.00013 |
| X15 | 0.017 | 0.0002 | 0.00002 | 0.0001 |
| X16 | 0.043 | −0.00016 | −0.016 | 0.0071 |
| X17 | 0.001 | −0.0001 | 0.0001 | 0.0011 |
| X18 | −0.0002 | 0.00017 | 0.00002 | 0.022 |
| X19 | 0.00005 | −0.0001 | 0.00003 | 0.001 |
| X20 | 0.0003 | −0.0003 | 0.00002 | 0.0013 |
| X21 | **0.091** | 0.056 | 0.0001 | 0.00001 |
| Variance contribution rate | 0.426 | 0.389 | 0.117 | 0.065 |
| Cumulative contribution rate | 0.426 | 0.815 | 0.932 | 0.997 |


Note: The indexes in Table 2 are as follows: temperature (X1), humidity (X2), air pressure (X3), flow intensity (X4), wind level (X5), sunshine (X6), radiation (X7), rainfall (X8), evapotranspiration (X9), green coverage evolution (X10), air quality change (X11), urban and rural consumption change (X12), per capita disposable income change (X13), change in per capita net income (X14), total agricultural machinery power (X15), tertiary industry output value change (X16), Arctic Oscillation Index (X17), North Atlantic Oscillation Index (X18), mean grid height in southern China under 500 hPa (X19), mean sea temperature in the Japan archipelago and South Pacific (X20), and runoff (X21). The bold values in Table 2 represent the values with the maximum weight in the dataset.

The four types of common factors explained 99.7% of the original information. Among them, F1 and F2 have the highest variance contribution rates and the strongest explanatory significance, indicating that human activities greatly influence runoff evolution. The contribution of meteorology and atmospheric circulation ranks second, further indicating that the analysis of runoff time series evolution must consider the strong influence of human activities: human activities cause climate change, affect local circulation in the basin control area, and lead to changes in precipitation, snowmelt, and evapotranspiration, thereby changing the underlying sequence. Therefore, within the new influence mode F = (F1, F2, F3, F4), the influence of F1 on the evolution of hydrological sequences deserves particular attention.

## RESULTS

Five models (LSTM-LCCSA, CM-SVM, GA-SAA, MRA, and GA-FFNN) were calibrated using the parameter calibration method presented in Section 2.3.4 and the new impact factor model discussed in Section 2.3.5. Each model was tested 20 times, and the average of the 20 tests was taken as the prediction result for each model.

### Analysis of effectiveness

The validity of LSTM-LCCSA was analyzed from two aspects: accuracy and time efficiency. Figures 7–9 show that the differences between the five algorithms and the measured values in the overall prediction trend are small, indicating that the overall forecast is good. The prediction accuracy of the five algorithms was measured using the correlation and mathematical statistics between the predicted and real values.

The mid-term forecast result indexes are shown in Table 3. In terms of *R*^{2}, LSTM-LCCSA improves by 3, 3, 5, and 4% compared to CM-SVM, GA-SAA, MRA, and GA-FFNN, respectively; *S* is reduced by −0.04, 2.19, 1.62, and 0.22, and MAE is reduced by 6.8, 7.5, 10.1, and 7.6, respectively. Compared with the overall mean of 30.89, the LSTM-LCCSA mean value is closer to the real mean. This indicates that LSTM-LCCSA achieves higher prediction accuracy than the other four models in the mid-term period. Using a different model, Zhang *et al.* (2011) applied the autoregressive integrated moving average (ARIMA) model to predict annual average runoff, and their SSA-ARIMA model achieved *R* = 0.814; the value achieved by LSTM-LCCSA is 17% higher than that of SSA-ARIMA. Therefore, the proposed LSTM-LCCSA handles non-stationary sequences better than ARIMA. From the perspective of time efficiency, LSTM-LCCSA requires 45 network parameters to be calibrated, while CM-SVM, GA-SAA, MRA, and GA-FFNN require 33, 28, 21, and 38, respectively; compared with the other four algorithms, the complexity of LSTM-LCCSA is not much greater. Under the same hardware and software conditions, LSTM-LCCSA, CM-SVM, GA-SAA, MRA, and GA-FFNN take 1,260, 1,180, 1,066, 1,332, and 1,120 s to converge, respectively. The training time of LSTM-LCCSA is acceptable; therefore, LSTM-LCCSA has acceptable efficiency.

| Mid-term | *R*^{2} | *S* | Mean | MAE | *N*_{c} | *T*_{c} (s) |
|---|---|---|---|---|---|---|
| LSTM-LCCSA | 0.99 | 0.46 | 29.62 | 10.36 | 45 | 1,260 |
| CM-SVM | 0.96 | 0.42 | 28.10 | 17.16 | 33 | 1,180 |
| GA-SAA | 0.96 | 2.65 | 27.78 | 17.84 | 28 | 1,066 |
| MRA | 0.94 | 2.08 | 27.46 | 20.48 | 21 | 1,332 |
| GA-FFNN | 0.95 | 0.68 | 28.03 | 17.92 | 38 | 1,120 |


The forecast result indicators for long-term prediction are shown in Table 4. Compared with CM-SVM, GA-SAA, MRA, and GA-FFNN, LSTM-LCCSA increased *R*^{2} by 5, 7, 6, and 2%, reduced *S* by 2.99, 2.09, −0.55, and 2.71, and reduced MAE by 5.55, 5.40, 11.25, and 6.07, respectively, and its mean value was closer to the real mean value of 32.93.

| Long-term | *R*^{2} | *S* | Mean | MAE | *N*_{c} | *T*_{c} (s) |
|---|---|---|---|---|---|---|
| LSTM-LCCSA | 0.98 | 1.97 | 28.98 | 13.07 | 45 | 1,220 |
| CM-SVM | 0.93 | 4.96 | 24.50 | 18.62 | 33 | 1,010 |
| GA-SAA | 0.91 | 4.06 | 24.67 | 18.47 | 28 | 1,165 |
| MRA | 0.92 | 1.42 | 25.02 | 24.32 | 21 | 1,310 |
| GA-FFNN | 0.96 | 4.68 | 25.26 | 19.14 | 38 | 1,072 |


The reason LSTM-LCCSA exhibits greater accuracy than, and comparable complexity to, the other four algorithms is that chaotic mapping produces more uniformly distributed effective initial values, improving the CSA's capability to escape local extremum solutions. Concurrently, compared with an ordinary RNN, LSTM solves the input and output weight conflict problems by using control gates: the network can not only remember short-term learning signals but also learn longer-range signals, and early useless signals can be suppressed by the forget gate so that they cannot disturb the current sequence judgment. Therefore, LSTM-LCCSA has high accuracy and strong prediction validity in time series prediction problems.

### Analysis of stability

In actual predictions, runoff evolution will be affected by many uncertain factors and measurement errors, which will increase uncertainty in hydrological series prediction. Therefore, the prediction stability of the model must be analyzed. The forecast stability indexes in the mid-term are shown in Table 5. The relative dispersion degree of the prediction sequence and the measured sequence can reflect the variation of runoff, and the dispersion between the prediction sequence and the measured sequence can be observed.

| Mid-term | *C*_{v} | *C*_{s} | *R*_{g} | *RS* |
|---|---|---|---|---|
| LSTM-LCCSA | 0.34 | 2.12 | 86.40 | 88 |
| CM-SVM | 0.61 | 4.31 | 80.14 | 83 |
| GA-SAA | 0.64 | 3.90 | 79.03 | 79 |
| MRA | 0.74 | 2.31 | 77.13 | 77 |
| GA-FFNN | 0.63 | 3.70 | 77.65 | 82 |


Regarding *C*_{v}, LSTM-LCCSA decreased by 0.27, 0.30, 0.40, and 0.29 compared with CM-SVM, GA-SAA, MRA, and GA-FFNN, respectively. This indicates that the sequence predicted by LSTM-LCCSA is closer to, and more concentrated around, the actual measured sequence. The value of *C*_{s} reflects the degree of symmetry of the time series on both sides of the mean and is an important parameter for measuring sequence symmetry. If *C*_{s} = 0, the sequence is very likely symmetrical on both sides of the mean value, and the stability of the sequence is stronger; *C*_{s} > 0 indicates positive sequence bias, while *C*_{s} < 0 indicates negative sequence bias. Compared with the other four models, the *C*_{s} value of LSTM-LCCSA is closer to 0, which indicates that the prediction sequence is more symmetrically concentrated about the mean and the sequence stability is better. A larger range of the predicted values indicates weaker sensitivity of the prediction model. To further explore the stability of the models, we used the rank scoring method, in which a higher rank score indicates a better prediction effect. LSTM-LCCSA scored higher than CM-SVM, GA-SAA, MRA, and GA-FFNN in *RS* by 5, 9, 11, and 6 points, respectively.
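The rank comprehensive scoring can be sketched as follows. The paper does not spell out its exact weighting in this section, so this shows only the general idea, with illustrative names: for each index the models are ranked (the best model receives the highest rank, equal to the number of models), and a model's *RS* is the sum of its ranks over all indexes.

```python
import numpy as np

def rank_comprehensive_score(table, higher_is_better):
    """Sketch of rank comprehensive scoring (RS).

    table:            (n_models, n_indexes) of index values;
    higher_is_better: per-index flag, True if larger values are better.
    """
    table = np.asarray(table, float)
    n_models, n_idx = table.shape
    rs = np.zeros(n_models, dtype=int)
    for j in range(n_idx):
        order = np.argsort(table[:, j])        # ascending by value
        ranks = np.empty(n_models, int)
        if higher_is_better[j]:
            ranks[order] = np.arange(1, n_models + 1)   # largest value -> highest rank
        else:
            ranks[order] = np.arange(n_models, 0, -1)   # smallest value -> highest rank
        rs += ranks                            # accumulate over indexes
    return rs
```

For two models scored on one higher-is-better index and one lower-is-better index, the model that wins both indexes accumulates the maximum rank on each.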

According to the analysis of the various indexes, the prediction stability of the LSTM-LCCSA model is acceptable when the prediction period is mid-term. The stability indexes under long-term prediction are shown in Table 6. Compared with CM-SVM, GA-SAA, MRA, and GA-FFNN, the *C*_{v} of LSTM-LCCSA is lower by 0.34, 0.36, 0.42, and 0.38, its *C*_{s} is closer to 0, and its comprehensive rank score is higher by 2, 8, 6, and 3, respectively. Therefore, prediction using the LSTM-LCCSA model is considered stable and reliable in the long term.

| Long-term | *C*_{v} | *C*_{s} | *R*_{g} | *RS* |
|---|---|---|---|---|
| LSTM-LCCSA | 0.40 | 4.21 | 84.68 | 82 |
| CM-SVM | 0.74 | 6.51 | 81.2 | 80 |
| GA-SAA | 0.76 | 5.82 | 82.71 | 74 |
| MRA | 0.82 | 4.68 | 81.4 | 76 |
| GA-FFNN | 0.78 | 5.78 | 81.79 | 79 |


### Analysis of generality

To further analyze the generality of LSTM-LCCSA under mid- to long-term forecasting, the monthly average runoff data from 1974 to 2017 and the pre-processed physical factor data of the Kuitun River Basin, another typical basin in northern Xinjiang, were used for algorithm training. The prediction was carried out using the same lag timing division method as discussed in Section 3.1, and forecast results for the Kuitun River Basin were obtained under different prediction periods.

| Period | *R*^{2} | *S* | Mean | MAE | *T*_{c} (s) | *C*_{v} | *C*_{s} | *R*_{g} | *RS* |
|---|---|---|---|---|---|---|---|---|---|
| Mid-term | 0.98 | 1.32 | 27.35 | 13.95 | 1,033 | 0.34 | 2.11 | 65.7 | 84 |
| Long-term | 0.93 | 3.72 | 26.27 | 17.17 | 1,202 | 0.59 | 5.31 | 68.3 | 79 |


In the long-term, the monthly runoff of the Kuitun River Basin is also predicted. Figure 10(b) shows the forecast results of monthly runoff. Under the same time scale, Figure 11(b) shows the forecast results of the special runoff series. After integrating the two figures, the overall forecast trend of the Kuitun River Basin in the long-term is shown in Figure 12(b). It can be seen that the forecast results of almost every month fall within the 85% confidence coverage. From the overall trend results in the long-term, the forecast results are acceptable. Table 7 shows the validity and stability indexes of the prediction series in the long-term prediction period. From the index results, the prediction effect of LSTM-LCCSA is suitable in the long-term, further indicating that the prediction universality of the model is strong.

Therefore, after this comprehensive analysis, it was concluded that LSTM-LCCSA has strong universality in the mid- to long-term runoff time series forecast of the river basins in northern Xinjiang and therefore can be applied to the runoff forecast of other typical basins in northern Xinjiang.

## CONCLUSIONS

In summary, a hybrid LSTM-LCCSA algorithm is proposed to solve the mid- to long-term runoff time series prediction problem in the northern basins of Xinjiang, China. Compared with CM-SVM, GA-SAA, MRA, and GA-FFNN, LSTM-LCCSA exhibits higher accuracy and stability in predicting mid- to long-term runoff time series. The predictive stability of LSTM-LCCSA was tested with Manas river data, and its generality was verified with Kuitun river data. However, the number of parameter calibrations of the LSTM-LCCSA model is higher than that of the other four models. In future work, we will investigate methods for improving parameter calibration and continue to apply LSTM-LCCSA to the prediction of watershed runoff series, further exploring the evolution pattern and main influencing factors of monthly runoff to improve the practicability of LSTM-LCCSA.

## ACKNOWLEDGEMENTS

The financial support from the Special application science and technology project of the 7th division in Xinjiang Bingtuan (2021A03008) and the Special application science and technology project of the 1st division in Xinjiang Bingtuan (2022A007) is gratefully acknowledged.

## AUTHOR CONTRIBUTIONS

W.Y.Y. and C.X.M. conceptualized the article, developed the methodology, and provided software; X.G.G. and X.T.F. rendered support in data curation and prepared the original draft; X.T.F. and C.X.M. handled visualization and investigated the project; J.F.L. supervised the article; J.F.L. and W.Y.Q. provided software and validated the article; J.F.L. and W.Y.Q. reviewed and edited the article.

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories.

## CONFLICT OF INTEREST

The authors declare there is no conflict.