## Abstract

Current groundwater prediction models often exhibit low accuracy and complex parameter adjustment. To tackle these limitations, a novel prediction model, called improved Aquila optimizer bi-directional long-term and short-term memory (IAO-BiLSTM) network, is proposed. IAO-BiLSTM optimizes the hyperparameters of the BiLSTM network using an IAO algorithm. IAO incorporates three novel enhancements, including population initialization, population updating, and global best individual updating, to overcome the drawbacks of current optimization algorithms. Before making predictions, the challenge posed by the highly nonlinear and non-stationary characteristics of groundwater level signals was addressed through the application of a wavelet multi-scale analysis method. Using a landslide site in Zhejiang Province as an example, a monitoring system is established, and continuous wavelet transform, cross-wavelet transform, and wavelet coherence analysis are employed to perform multi-scale feature analysis on a 2-year dataset of rainfall and groundwater depth. The findings reveal that the groundwater depth of monitoring holes exhibits similar high energy resonating periods and phase relationships, strongly correlating with rainfall. Subsequently, IAO-BiLSTM is employed to predict groundwater depth, and its results are compared with seven popular machine learning regression models. The results demonstrate that IAO-BiLSTM achieves the highest accuracy, as evidenced by its root mean squared error of 0.25.

## HIGHLIGHTS

The wavelet multi-scale analysis method overcomes the limitation of the traditional qualitative analysis method.

The wavelet multi-scale analysis method successfully reveals the multi-scale characteristics and the most significant influencing factors.

An improved Aquila optimizer bi-directional long-term and short-term memory (IAO-BiLSTM) network can predict groundwater level reliably and effectively.

IAO-BiLSTM beats the current seven hot prediction models and showed good performance.

## INTRODUCTION

Groundwater changes can be an important contributing factor to landslides, especially in China, where 90% of these slope failures are linked to groundwater. When groundwater levels rise and fall rapidly, it compromises the shear strength of the slope rock and soil, intensifying the risk of landslides (Seifollahi-Aghmiuni *et al.* 2022). Therefore, it is of utmost importance to understand the dynamic patterns of groundwater levels in landslide-prone areas and to predict their trends efficiently and accurately.

As a complex nonlinear dynamic system, the change in the groundwater level is usually related to rainfall, reservoir water level, engineering activities, and the geological conditions of the landslide itself. It is of great significance to reveal the influencing factors of groundwater level change and to analyze the real-time monitoring and prediction of the groundwater depth potentially prone to landslides (Cao *et al.* 2020). Based on the groundwater monitoring data of the Shuping landslide in the Three Gorges Reservoir area of China, Zhao *et al.* (2016) analyzed the response of landslide groundwater, rainfall, and reservoir water level, and then constructed a classified regression tree model to predict the groundwater level. Based on the monitoring data of groundwater level, rainfall, and reservoir water level of the Baijiapu landslide in China from 2007 to 2013, Duan *et al.* (2019) qualitatively analyzed the characteristics, significant relationship, and lag relationship of groundwater level, concluding that rainfall is the most significant influence, which provides a basis for the prediction of groundwater level. Wei *et al.* (2020) conducted a qualitative analysis of the correlation between groundwater level, rainfall, and surface displacement in a deep landslide in China, utilizing monitoring data. They concluded that the movement of the landslide is largely dependent on the groundwater, which is, in turn, affected by rainfall. From this, they were able to predict the fluctuations in groundwater levels. Although monitoring data can be used to qualitatively assess the correlation between groundwater and its contributing factors in the aforementioned study, it is difficult to systematically quantify highly nonlinear and non-stationary signals. The lack of statistical analysis and rigor further impede the ability to provide sufficient support for predicting groundwater levels (Truong *et al.* 2021).

In recent years, researchers have increasingly employed advanced mathematical methods to investigate the dynamic changes and correlations of groundwater, among which wavelet analysis is one of the most commonly used approaches. Based on the wavelet theory, this method can analyze the characteristic and correlation relationships of two time series in the time–frequency domain. Wavelet analysis has gained wide acceptance and substantial utilization in diverse fields, including meteorology, hydrology, astronomy, and medicine (Kainthura & Sharma 2022; Su *et al.* 2023). In this study, a long-term and multi-point field monitoring approach was employed to monitor the dynamic changes of groundwater in slope areas, and the wavelet theory was used to conduct a multi-scale analysis of groundwater depth at landslide points (Sharma *et al.* 2021). By utilizing methods such as continuous wavelet transform (CWT), cross-wavelet transform (XWT), and wavelet coherence analysis (WTC), the response characteristics, periodic scale characteristics, dominant resonance period, significant periods, lag relationships, and multi-scale coherence between groundwater depth and influencing factors were revealed. The utilization of this approach enabled a comprehensive investigation of the multi-scale characteristics, evolution patterns, and influence relationships between groundwater depth and influencing factors. Such findings could potentially offer new research directions for studying the dynamic changes of groundwater level time series and serve as a foundation for selecting influential factors in predicting groundwater levels.

Groundwater levels are highly nonlinear and volatile, making accurate prediction difficult (Vadiati *et al.* 2022). Though deterministic models, such as water balance, hydrogeology analogy, analytical methods, numerical simulations, and physical simulations, can provide reliable predictions when the geological conditions are not complex and involve a single water layer, they are not well suited to mid-term and long-term forecasts (Nikvar Hassani *et al.* 2016). Likewise, traditional stochastic models like time series, gray models, random differential equations, and fuzzy mathematics present their difficulties, such as difficulty in model identification, limited ability to capture the nonlinear dependence of historical data, and a large number of parameters to be estimated (Wang *et al.* 2019). The stochastic model based on machine learning is far superior to other methods in terms of flexibility, stability, and accuracy, making it the most widely used and effective prediction technique for groundwater levels (Habibi *et al.* 2019; Li *et al.* 2023). Considering the influencing factors of rainfall and reservoir water level, Cao *et al.* (2020) leveraged the genetic algorithm-support vector machine (GA-SVM) to predict Tangjiao landslide's groundwater levels in the Three Gorges Reservoir area of China, demonstrating the GA-backpropagation (BP) neural networks as a reliable prediction method. Wei *et al.* (2020), while taking rainfall factors into account, adopted the particle swarm optimization (PSO)-SVM and the physical seepage model to assess a landslide in China and found PSO-SVM to yield higher accuracy than the physical seepage model. Furthermore, Huang *et al.* (2017) compared the prediction accuracy of PSO-SVM based on the chaos theory, PSO-SVM based on linear correlation analysis, and BP based on the chaos theory and discovered that the chaotic PSO-SVM model performed better than the other two due to the chaotic characteristics of groundwater level.

Optimization algorithms such as PSO and GA can be used to optimize the hyperparameters of machine learning models like BP and SVM and have been successfully applied to the prediction of landslide groundwater. However, these algorithms have certain drawbacks, such as a limited search space, susceptibility to local optima, overfitting, slow convergence, weak generalization performance, and limited learning ability for time series features (Bozorg-Haddad *et al.* 2017; Xiao *et al.* 2023), all of which hamper the accuracy of the model. Therefore, this study adopts an improved Aquila optimizer (IAO) algorithm to optimize the bi-directional long-term and short-term memory (IAO-BiLSTM) network model for the prediction of landslide groundwater. Compared with the Aquila optimizer (AO) algorithm, the IAO algorithm is improved in three aspects: population initialization based on Tent chaotic mapping, global optimal individual iteration based on an adaptive weight factor, and population iteration based on reverse differential evolution. It overcomes the shortcomings of the traditional intelligent optimization algorithm, such as premature convergence, susceptibility to local optima, and slow iteration, thereby enabling the prediction model to find the optimal super-parameters. In addition, the BiLSTM network is more efficient than recurrent neural networks (RNNs) in processing groundwater sequence data due to the ability to retain important information for long-term memory and to a certain extent forget other information. In recent years, BiLSTM has been widely used in groundwater prediction (Ghasemlounia *et al.* 2021), but there are still few studies on groundwater level measurement in landslide sites. To verify the performance of the algorithm and its feasibility and scientific validity in the field of landslide groundwater prediction, this study will compare the prediction of landslide groundwater depth based on a case study of a landslide in Zhejiang, China, using IAO-BiLSTM and seven currently commonly used prediction models, and then verify the effectiveness of the method.

## METHODS

### Wavelet theory

#### Continuous wavelet transform

*s*are the position and scale parameters, respectively, which determine the exact locality of the mother wavelet and its scaling factor, and is a cluster of wavelet functions obtained by stretching and translation. There are many types of mother wavelets, such as the Morlet wavelet, Haar wavelet, Daubechies wavelet, Mexican Hat wavelet, Meyer wavelet, and so on. Among them, the Morlet wavelet is widely used because of its easy-to-interpret output results; thus, this study uses the Morlet wavelet as the mother wavelet (Di

*et al.*2021).

#### Cross-wavelet transform

*X*and

*Y*is defined as Equation (2). In Equation (2), * is a complex conjugate, and the corresponding crossover wavelet power spectral density is . The larger its value is, the more evidence there is of a shared area of high energy between the two time series and a strong correlation between them.

#### Wavelet coherence analysis

*S*is the smoothing operator. The significance of the wavelet coherence spectrum is tested by the Monte Carlo method. The range of wavelet correlation value is 0–1, which can be interpreted as the local correlation coefficient in the time–frequency domain. The larger the value is, the more strongly the correlation is established (Grinsted

*et al.*2004).

### IAO-BiLSTM

*t*and , respectively; is the activation function of the cellular output of the moments ; and and are the new cellular and the cellular memory status, respectively.

BiLSTM is proposed based on the LSTM network; each memory block of BiLSTM contains two LSTM layers, the input time sequence at each moment, the two hidden layer states and obtained from the forward LSTM layer, and the backward LSTM layer which is opposite to the two time sequences, and then the two hidden layer states are connected to get the same outputs , the forward LSTM and the backward LSTM layers can get the past information and the future information of the input sequence, respectively. BiLSTM uncovers the relationship between the current data and the preceding as well as succeeding data points. This characteristic enhances the model's accuracy and improves data utilization efficiency. Compared to LSTM, BiLSTM is particularly suitable for handling data with significant temporal features, such as groundwater data.

The AO algorithm simulates the different hunting methods of different prey, namely, selecting the search space (*X*_{1}) by using vertical bending, exploring the search space (*X*_{2}) through the contour flight of short gliding attack, exploring the convergent search space (*X*_{3}) by the low-altitude flight of slow descent attack, and diving by walking and grabbing prey (*X*_{4}). Due to space limitations, the specific principles and computational steps of the AO algorithm can be found in the article of Abualigah *et al.* (2021).

*et al.*2022). In this study, an improved AO (IAO) algorithm was utilized to enhance the research methodology. IAO employs population initialization based on Tent chaotic mapping, global optimal individual iterative update based on adaptive weight primer, and population iterative update with reverse differential evolution to increase the diversity of the population, improve the activity range of the population, and enhance the optimization efficiency and robustness of the algorithm, thus minimizing the chances of premature convergence and local optimization. The implementation steps for optimizing the BiLSTM model using the IAO algorithm are outlined below, and the detailed process is depicted in Figure 2. The key mathematical notations used in Equations (8)–(14) are listed in Table 1.

Symbol . | Description . |
---|---|

The position of the (i + 1)th Aquila individual in the jth dimension after Tent chaotic mapping | |

The position of the ith Aquila individual in the same dimension | |

The population size | |

The problem dimension | |

The lower bounds of the tuned-parameter optimization problem | |

The upper bounds of the tuned-parameter optimization problem | |

The individual after the perturbation | |

The individual to be chaotically perturbed | |

The generated chaotic perturbation variable | |

The ith individual in the jth dimension of the population | |

The ith individual in the jth dimension of the inverse population | |

The lower bounds of the problem in the jth dimension | |

The upper bounds of the problem in the jth dimension | |

A random number in the interval [0,1] | |

The solution at the (t + 1)th iteration | |

The optimal solution at the tth iteration | |

The current iteration | |

The maximum number of iterations | |

The base of the natural logarithm |

Symbol . | Description . |
---|---|

The position of the (i + 1)th Aquila individual in the jth dimension after Tent chaotic mapping | |

The position of the ith Aquila individual in the same dimension | |

The population size | |

The problem dimension | |

The lower bounds of the tuned-parameter optimization problem | |

The upper bounds of the tuned-parameter optimization problem | |

The individual after the perturbation | |

The individual to be chaotically perturbed | |

The generated chaotic perturbation variable | |

The ith individual in the jth dimension of the population | |

The ith individual in the jth dimension of the inverse population | |

The lower bounds of the problem in the jth dimension | |

The upper bounds of the problem in the jth dimension | |

A random number in the interval [0,1] | |

The solution at the (t + 1)th iteration | |

The optimal solution at the tth iteration | |

The current iteration | |

The maximum number of iterations | |

The base of the natural logarithm |

**Step 1**: Initialize parameters, such as setting the population size to 30 and the maximum number of iterations to 100.

**Step 2**: Initialize the population based on the Tent chaotic mapping. The population is initialized using Equations (8)–(10). Equation (9) represents the transformation of the value , generated by Equation (8), into the value range of the variable being optimized. Specifically, in the

*j*-dimensional space, denotes the newly generated value. Additionally, Equation (10) introduces chaotic perturbation.

**Step 4**: Calculate the fitness of the initial population and the opposite population separately. *N* individuals with the lowest fitness are selected from the combined sets of both populations to form the optimal initial population.

**Step 5**: Calculate the fitness value of the optimal initial population and subsequently evaluate against predetermined criteria. If the criteria are met, the position of the Aquila is updated using the optimization process *X*_{1} or *X*_{2}. On the other hand, if the criteria are not met, the position of the Aquila is updated using the optimization process *X*_{3} or *X*_{4}. Additionally, the global optimal position and its corresponding fitness are updated accordingly.

**Step 6**: Generate new solutions based on the opposition-based differential evolution strategy after each individual updates their position. This involves applying the mutate, crossover, and select steps from the GA to the original solutions. Subsequently, new solutions are selected along with the best individual from the original solutions for the next iteration. Simultaneously, the global optimal position and its corresponding fitness are updated.

**Step 7**: Generate individual opposite solutions by applying Equation (12) based on the OBL strategy. The best individual from the original solutions is then selected to proceed to the next iteration along with these opposite solutions. Simultaneously, the global optimal position and its corresponding fitness are updated.

**Step 9**: Determine whether the current iteration count has reached the maximum predefined number of iterations. If it has, the optimal parameters of the model at the current iteration count are outputted. Otherwise, the algorithm proceeds to Step 5 and continues iterating until the maximum iteration count is reached.

**Step10**: Assign the optimal parameters produced in Step 9 to the BiLSTM model. Subsequently, the predictive factors are inputted, and the model is executed to generate the corresponding output predictions.

## STUDY AREA AND MONITORING SYSTEM

### Research background of the study area

The landslide strata are mainly composed of quaternary residual soil formation (*Q*), upper Jurassic Huangjian formation (*J*_{3n}), lower Ordovician Ningguo formation (*O*_{1n}), and lower Ordovician Yinzhubu formation (*O*_{1y}). The landslide area belongs to the subtropical monsoon climate zone with a prevailing northeasterly wind throughout the year. The rainy season extends from February to mid-July, particularly from May to June and early July, while the dry season lasts from late July to January the following year. From August to October, it is occasionally affected by typhoons, and there are heavy rains, which often lead to a drought in autumn. The average annual rainfall in the study area is 1,804 mm and the evaporation is 1,424 mm. Groundwater in landslide areas can be mainly divided into two types: pore water in loose rocks and fracture water in bedrock. The thickness of the water-bearing layer of pore water in loose rocks is about 2.0–6.1 m, and it is replenished by atmospheric precipitation and irrigation return flow and is discharged in the form of springs or lateral recharge. Fracture water in bedrock can be divided into weathering fracture water and structural fracture water according to the occurrence conditions of groundwater. The thickness of the water-bearing layer of weathering fracture water is relatively small and mostly within 10 m, while structural fracture water mainly occurs in the structural fractures of the Ordovician strata. Both types of fracture water are mainly replenished by atmospheric precipitation and high mountain condensation water and are discharged in the form of springs.

There were several cracks in the study area between August 2014 and November 2015, which led to landslides at the site in early 2016. Through detailed mechanism analysis, Sun *et al.* (2019) proved that the landslide is a deep one formed by the reactivation of the local ancient landslide under the influence of groundwater and construction activities. To ensure the stability of the groundwater depth in the area, a 305 m-long drainage hole was constructed from August 2016 to November 2016, which was able to lower the groundwater level in the landslide.

Groundwater in landslide areas mainly comes from atmospheric precipitation and high mountain condensation water and is also influenced by engineering activities such as drainage tunnels. As a highly nonlinear and non-stationary signal (Kousali *et al.* 2022), the dynamic changes of groundwater have complex relationships that require the selection of important influencing factors for groundwater depth prediction in conjunction with monitoring systems. Additionally, it is necessary to confirm whether rainfall can be a significant predictor for groundwater depth prediction.

### Monitoring system

Systematic monitoring was conducted in a landslide area from January 1, 2016 to December 31, 2017 to ensure the safety of construction. The monitoring content mainly includes three parts: rainfall, groundwater depth, and surface displacement (Figure 3(b)). The specific monitoring methods and instruments are as follows: (1) for rainfall monitoring, automatic tipping bucket rain gauge real-time monitoring is used to monitor the rainfall in the region. (2) For groundwater depth monitoring, five water level monitoring holes (W1–W5) are arranged in the landslide area, and then water level gauges are arranged in the water level monitoring holes to monitor the change of groundwater depth in the landslide. It should be noted that the monitored groundwater level refers to the groundwater depth. (3) For surface displacement, seven GPS stations (M1–M7) are arranged in the slope area, and then the displacement of the monitoring point is obtained by GPS satellite positioning measurement.

## WAVELET MULTI-SCALE ANALYSIS

### Analysis of dynamic monitoring results

To explore the dynamic monitoring results and then provide an analysis basis for the subsequent multi-scale analysis between rainfall and groundwater depth, the rainfall, surface displacement, surface displacement rate (SDR), and the groundwater depth of five water level monitoring holes (W1–W5) were selected to analyze the monitoring data from January 5, 2016 to December 31, 2017. The monitoring results of surface displacement and SDR are derived specifically from the M5 site, which is the area with the greatest landslide displacement during the monitoring period and can fully represent the landslide's movement (Wei *et al.* 2019).

In addition, it can be seen from Figure 4 that after the construction of the drainage tunnel, the groundwater depth of W1 changed most greatly, rising from 15.04 mm/day on July 2, 2016 to 29.07 mm/day on November 10, 2016. W1 is the nearest water level monitoring hole among the five groundwater depth monitoring holes, indicating the significant role of the drainage hole in drainage.

### Response change analysis

Based on the analysis of the wavelet coefficients' time–frequency characteristics, in conjunction with Figure 5(a), the red noise test at the 95% confidence level indicates that the rainfall during the study period exhibits prominent oscillation periods primarily at time scales of 10–25 and 365 days. The strong rainfall behavior occurred on the 250th day and 550th day in this area, which indicated that the scale of the 365 day cycle was significant. Furthermore, Figure 5(b)–5(f) shows that W1–W5 all have the main oscillation period time scale of 25–45 days. During the period of rainfall, the groundwater depth of W1–W5 has a strong signal, which indicates that groundwater depth is significantly affected by rainfall. The small scale covers the time–domain discontinuity and large span, while the large scale covers time–domain continuity and small span. It is worth noting that the periodic time scale, oscillation intensity, and response of the groundwater depth of W2–W5 are similar, indicating that the groundwater depth of W2–W5 is greatly affected by rainfall, while W1 weakens the response relationship due to the influence of the drainage tunnel. In summary, due to the limitations of topography and hydroclimatic conditions, the groundwater depth in the landslide site has no other main sources of supply except rainfall. Therefore, it can be preliminarily concluded that rainfall is the main driving factor leading to the significant change of groundwater depth during the study period.

### Time lag relation analysis

### Coherence analysis

Based on the wavelet multi-scale analysis system mentioned earlier, combined with the time–domain and frequency–domain multi-scale features analysis of the main oscillation period, main resonance period, significant period, time lag effect, coherence, and differences between groundwater depth and rainfall, it can be concluded that rainfall is the most important influencing and driving factor for groundwater depth prediction at the Qili landslide site. Therefore, this study selects rainfall as the input factor for predicting groundwater depth at the Qili landslide site.

## PREDICTION OF GROUNDWATER DEPTH

### Identification of lag period

Data collected from January 5, 2016 to December 25, 2017 are utilized, and the missing values of less than 5% in the data are replaced by the adjacent position mean processing method. Based on the calculation using the Spearman correlation coefficient, it is determined that the water level monitoring hole W3, situated at the center of the landslide, exhibits the highest correlation (−0.576) with the GPS station M5. Consequently, it is considered the most suitable choice for representing the hydrological behavior of the landslide. Hence, this study will focus on the prediction of the groundwater depth of the monitoring hole W3.

In Figure 8, *T* = 0 shows the correlation coefficient between the rainfall of the day and the groundwater depth of the water level detection hole W3–W5 on the same day, *T* = 1 represents the correlation coefficient between the rainfall of the day and the groundwater depth of the second-day water level detection hole W3–W5, *T* = 2 represents the correlation coefficient between the rainfall of the previous day and the groundwater depth of the third-day water level detection hole W3–W5, and so on. The analysis demonstrates that as *T* increases from 1 to 12, the correlation coefficient initially rises and then declines. As *T* approaches the lag days, the correlation coefficient consistently increases. When *T* equals the lag days, the correlation coefficient reaches its peak, indicating the lag days between groundwater depth and rainfall. As *T* gradually exceeds the lag days, the influence of rainfall on groundwater depth diminishes, leading to a decreasing correlation coefficient. The highest correlation coefficient of W2–W5 corresponds to *T*, which is 5, 4, 3, and 2 days, respectively. This indicates that the groundwater depth fluctuations at W2–W5 lag behind rainfall by 5, 4, 3, and 2 days, respectively. It is also important to note that the correlation coefficient between rainfall and groundwater depth has been weakened due to the significant impact of engineering activities such as excavation and the construction of the drainage tunnel on the slope.

### Prediction and analysis

To alleviate the impact of lag effects on the prediction of groundwater depth of W3, considering the existing lag relationship of 4 days, the rainfall data for *T* = 0–4 are utilized as the input factors for the prediction model. Subsequently, the IAO-BiLSTM model is employed to forecast the groundwater depth in the landslide area. In the process of optimization, the IAO algorithm optimizes four hyperparameters that have a great influence on the accuracy of feature information recognition in BiLSTM, namely, the number of hidden units in the first layer, the number of hidden units in the second layer, training, times and learning rate. To assess the plausibility of the model, the gray wolf optimization algorithm (GWO) is used to optimize the BiLSTM network (Liu & Yang 2022) and an improved circle sparrow search algorithm to optimize kernel extreme learning machine (improved circle sparrow search algorithm (ICSSA)-KELM) (Parida *et al.* 2021), two-layer BP (Yesilnacar & Sahinkaya 2012), multiple linear regression with ridge regression (MLRM), and evolutionary polynomial regression (EPR) (Doglioni *et al.* 2012), GA to optimize BP neural network, and PSO-SVM (Zhu *et al.* 2020) for comparative prediction and analysis. It is important to note that all models used in this experimental test belong to the field of machine learning and are regression models. Their utilization in this predictive case is both permissible and appropriate. In addition, to avoid computational randomness, each model is set with a population of 30 and an iteration of 100, and their average results after running 50 times are used as the prediction results (Ahmadianfar *et al.* 2021).

Model . | Time lag . | RMSE . | MAE . | NSE . | Precision improvement (%) . |
---|---|---|---|---|---|

IAO-BiLSTM | Before | 0.26 | 0.23 | 0.99 | 3.16 |

After | 0.25 | 0.21 | 0.99 | ||

GWO-BiLSTM | Before | 5.11 | 4.81 | −3.04 | 34.30 |

After | 3.36 | 2.64 | −0.75 | ||

ICSSA-KELM | Before | 3.23 | 2.34 | −0.61 | 14.77 |

After | 2.75 | 2.15 | −0.17 | ||

GA-BP | Before | 2.87 | 2.29 | −0.28 | 2.40 |

After | 2.80 | 2.14 | −0.21 | ||

Two-layer BP | Before | 2.53 | 2.09 | −7.54 | 13.67 |

After | 2.19 | 1.74 | −1.51 | ||

PSO-SVM | Before | 2.98 | 2.35 | −0.38 | 5.53 |

After | 2.82 | 2.27 | −0.23 | ||

MLRM | Before | 2.91 | 2.50 | −0.31 | 12.10 |

After | 2.55 | 2.20 | −0.01 | ||

EPR | Before | 2.31 | 1.19 | 0.15 | 6.02 |

After | 2.17 | 1.12 | 0.27 |

Model . | Time lag . | RMSE . | MAE . | NSE . | Precision improvement (%) . |
---|---|---|---|---|---|

IAO-BiLSTM | Before | 0.26 | 0.23 | 0.99 | 3.16 |

After | 0.25 | 0.21 | 0.99 | ||

GWO-BiLSTM | Before | 5.11 | 4.81 | −3.04 | 34.30 |

After | 3.36 | 2.64 | −0.75 | ||

ICSSA-KELM | Before | 3.23 | 2.34 | −0.61 | 14.77 |

After | 2.75 | 2.15 | −0.17 | ||

GA-BP | Before | 2.87 | 2.29 | −0.28 | 2.40 |

After | 2.80 | 2.14 | −0.21 | ||

Two-layer BP | Before | 2.53 | 2.09 | −7.54 | 13.67 |

After | 2.19 | 1.74 | −1.51 | ||

PSO-SVM | Before | 2.98 | 2.35 | −0.38 | 5.53 |

After | 2.82 | 2.27 | −0.23 | ||

MLRM | Before | 2.91 | 2.50 | −0.31 | 12.10 |

After | 2.55 | 2.20 | −0.01 | ||

EPR | Before | 2.31 | 1.19 | 0.15 | 6.02 |

After | 2.17 | 1.12 | 0.27 |

Time lag . | Fitness value . | Hyperparameters . | |||
---|---|---|---|---|---|

The number of hidden units in the first layer . | The number of hidden units in the second layer . | Training times . | Learning rate . | ||

Before | 0.0932 | 100 | 29.6711 | 43.2858 | 0.0100 |

After | 0.0827 | 23.1126 | 33.3668 | 36.7120 | 0.0076 |

Time lag . | Fitness value . | Hyperparameters . | |||
---|---|---|---|---|---|

The number of hidden units in the first layer . | The number of hidden units in the second layer . | Training times . | Learning rate . | ||

Before | 0.0932 | 100 | 29.6711 | 43.2858 | 0.0100 |

After | 0.0827 | 23.1126 | 33.3668 | 36.7120 | 0.0076 |

## DISCUSSION

To mitigate computational randomness and validate the effectiveness of IAO-BiLSTM in predicting groundwater depth, as well as to assess the scientific validity of the time lag effect identification and resolution strategy, additional predictions are conducted for water level monitoring holes within the landslide area. However, W1's groundwater depth correlation with rainfall is weak due to its proximity to the drainage tunnel, thus making it impossible to accurately predict the groundwater depth via rainfall data. Therefore, W1 is excluded here. Incorporating the time lag relationships between groundwater depth and rainfall (Figure 8), the IAO-BiLSTM model was utilized to predict the groundwater depth of W2, W4, and W5. Specific prediction results are presented in Table 4. Overall, the prediction performance of W2, W4, and W5 is relatively good, with RMSE values of 0.63, 0.22, and 0.15, respectively, which meet the prediction requirements. This further confirms the feasibility and applicability of the IAO-BiLSTM prediction model.

Monitoring hole number . | Time lag . | RMSE . | MAE . | NSE . | Fitness value . | Precision improvement (%) . |
---|---|---|---|---|---|---|

W2 | Before | 0.65 | 0.54 | 0.84 | 0.0507 | 3.96 |

After | 0.63 | 0.52 | 0.86 | 0.0424 | ||

W4 | Before | 0.26 | 0.23 | 0.96 | 0.0195 | 15.38 |

After | 0.22 | 0.17 | 0.99 | 0.0182 | ||

W5 | Before | 0.18 | 0.15 | 0.97 | 0.0283 | 13.52 |

After | 0.15 | 0.13 | 0.99 | 0.0221 |

Monitoring hole number . | Time lag . | RMSE . | MAE . | NSE . | Fitness value . | Precision improvement (%) . |
---|---|---|---|---|---|---|

W2 | Before | 0.65 | 0.54 | 0.84 | 0.0507 | 3.96 |

After | 0.63 | 0.52 | 0.86 | 0.0424 | ||

W4 | Before | 0.26 | 0.23 | 0.96 | 0.0195 | 15.38 |

After | 0.22 | 0.17 | 0.99 | 0.0182 | ||

W5 | Before | 0.18 | 0.15 | 0.97 | 0.0283 | 13.52 |

After | 0.15 | 0.13 | 0.99 | 0.0221 |

## CONCLUSIONS

Using rainfall and groundwater depth monitoring data from a landslide site in Zhejiang Province, China, over 2 years, this study applies a wavelet multi-scale analysis method to investigate and identify the multi-scale characteristics of groundwater and finds the most important influencing factors for predicting groundwater depth. Subsequently, the time lag effect was taken into consideration, and the IAO-BiLSTM model was applied to predict groundwater depth in the landslide site. The main conclusions are as follows: (1) the groundwater depth of the water level monitoring hole (W2–W5) possesses a remarkable similarity in an energy resonating period and a phase relationship and demonstrates a strong correlation with rainfall, as indicated by a correlation coefficient exceeding 0.8. Rainfall is the most important influence and driving factor in predicting groundwater depth within landslide sites. (2) A negative correlation between rainfall and the groundwater depth is observed, with the level lagging behind the rainfall. The Spelman correlation coefficient is applied to measure the lag period of the W2–W5 water level monitoring holes, which were 5, 4, 3, and 2 days, respectively. (3) The construction of a drainage tunnel effectively reduces the groundwater depth and SDR and improves the stability of the slope. In addition, the closer the water level monitoring hole is to the drainage hole, the greater the weakening of the correlation and response between the water level and rainfall. (4) Considering the time lag effect and after comparing the results of GWO-BiLSTM, ICSSA-KELM, GA-BP, two-layer BP, PSO-SVM, MLRM, and EPR, it can be seen that IAO-BiLSTM used in this study has higher accuracy, and its RMSE, MAE, and *R*^{2} are 0.25, 0.21, and 0.99, respectively. It is proven that IAO-BiLSTM can effectively predict the groundwater level of landslides and is a reliable prediction model. This study provides a new approach to revealing the dynamic response relationship between groundwater levels and their influencing factors and offers a novel method for predicting the dynamic changes in groundwater levels of landslides.

## ACKNOWLEDGEMENTS

The authors would like to thank the project of the National Natural Science Foundation of China for its support as well as the editors and reviewers for their useful comments and suggestions.

## FUNDING

This study was supported by the National Natural Science Foundation of China (Key Project) (No. 42230702).

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories: https://doi.org/10.5061/dryad.djh9w0w6d.

## CONFLICT OF INTEREST

The authors declare there is no conflict.