Abstract
Identifying the influence of heavy precipitation and ecological water replenishment (EWR) on groundwater resources is essential for groundwater resources management and risk prevention. This study developed a groundwater resource analysis and prediction model that integrates the water level fluctuation method, correlation analysis, and machine learning under the influence of heavy precipitation and EWR. The water level fluctuation method showed that, compared with January 1, 2021, the groundwater resources of the study area had increased by 4.46 × 10⁸ m³ on August 28. Compared with the small-flow EWR, heavy precipitation was the main contributor to the rise in the groundwater level. Correlation analysis found that elevation, specific yield, and permeability coefficient were positively correlated with groundwater resource recharge. Machine learning results showed that, among the water level prediction models of 35 monitoring wells, extreme gradient boosting (XGB) and random forest (RF) performed best in 30 wells and five wells, respectively. The predicted increase in groundwater storage deviated from the actual value by only 0.6 × 10⁷ m³ (prediction bias of 1.3%), indicating good model prediction performance under heavy precipitation. This study can help to better understand the trend of groundwater resources under heavy precipitation and EWR.
HIGHLIGHTS
Through hydrology, statistics, and machine learning, groundwater changes under the dual effects of heavy precipitation and ecological water replenishment are studied.
A machine learning model is developed to predict the groundwater level and storage under heavy precipitation scenarios.
The XGB and RF models accurately predicted the next-day groundwater change, with a prediction deviation of only 1.3%.
INTRODUCTION
Groundwater refers to the water stored in rock voids below the ground (Xing et al. 2010), accounting for nearly 30% of the global freshwater reserves (Majumdar et al. 2020). Groundwater is an important source of water for drinking (supplying nearly 50% of the drinking water in the world) (Gleeson et al. 2019; Mohapatra et al. 2021), agricultural irrigation, and industrial production (Dangar et al. 2021; Agarwal et al. 2023). Changes in groundwater resources (water level and water quality changes) affect human life, industrial and agricultural production, and building safety (Schinke et al. 2012; Pelletier et al. 2015; Liu et al. 2019; Schreiner-McGraw et al. 2019; D. F. Wang et al. 2020). Among various factors, heavy precipitation and ecological water replenishment (EWR) are important contributors to positive changes in groundwater resources. Specifically, heavy precipitation is induced by extreme climate, and EWR is the artificial transfer of water to water-deficient areas through water conservancy projects to replenish surface water or groundwater resources (Sun et al. 2023). The response of groundwater to heavy precipitation and EWR is nonlinear and complicated (Taylor et al. 2012; Yu & Lin 2015; Schreiner-McGraw et al. 2019; Pastore et al. 2020). The recharge of groundwater in different regions is closely related to geological structure, geographic location, and aquifer properties (van Roosmalen et al. 2007; Goodarzi et al. 2015; Fagbohun 2018; Ren et al. 2018; J. C. Wang et al. 2020). Consequently, under heavy precipitation and EWR, groundwater resource management and risk prevention measures are difficult to formulate.
Driven by the extreme climate of the North China Plain, frequent heavy precipitation events occurred in Beijing from June to August 2021. Meanwhile, the Yongding River in the territory received EWR from Guanting Reservoir. Existing studies have used numerical simulation and machine learning to analyze the impact of common precipitation and EWR (focusing on the impact of EWR) on groundwater (e.g., water level and water quality) in the Yongding River Basin (Beijing Section) (Zhang et al. 2018; Hu et al. 2019; Luo et al. 2019; Ji et al. 2021; Sun et al. 2021; Xu et al. 2022), providing references for understanding the groundwater change mechanism under human intervention. However, there is a relative lack of research on the response mechanism and changing trend of groundwater resources under the rare condition that heavy precipitation and EWR occur at the same time. Therefore, exploring the response mechanism and changing trend of groundwater under the dual influence of heavy precipitation and EWR using highly applicable, accurate, and efficient methods has high practical significance.
The changes in groundwater storage reflect the replenishment of groundwater (Li et al. 2015), which is a quantitative indicator of the response of groundwater to external conditions. In the water level fluctuation method, the change in groundwater storage is calculated according to the groundwater level variation (Healy & Cook 2002). This method has the advantages of simple practical application, fast calculation, and few constraints (Delottier et al. 2018; Gong et al. 2020; Z. Y. Zhang et al. 2020), and is widely used in calculating changes in groundwater storage (Crosbie et al. 2019; Yenehun et al. 2020). In this study, the calculation results of the water level fluctuation method were extrapolated by the Kriging interpolation analysis tool of ArcGIS, and changes in the groundwater storage of the entire study area were then obtained.
Heavy precipitation and EWR are the recharge sources of groundwater, but the recharge of groundwater may show differences across different regions, which is related to many factors (e.g., ground elevation, fault zone distribution, and aquifer nature). Correlation analysis can provide a basis for exploring the reasons for the differences in groundwater replenishment. Specifically, Spearman correlation analysis is favorable because it has no restrictions on the distribution characteristics of the analyzed variables (Artusi et al. 2002; Zhu et al. 2023). This study used this method to analyze the relationships between complex natural factors (e.g., ground elevation, fault zone distribution, aquifer properties) and groundwater recharge. This analysis has practical significance for understanding the differences in groundwater recharge in different regions under heavy precipitation and EWR.
Numerical models (e.g., MODFLOW) are commonly used to predict groundwater levels (Kayhomayoon et al. 2022; Lv et al. 2022), but they require a large amount of accurate data, some of which (e.g., the physical properties of the aquifer) can never be determined with absolute accuracy (Chen et al. 2020). In addition, establishing a numerical model takes considerable time and money (Jiang et al. 2022; Mohammed et al. 2023). In recent years, machine learning models have shown good performance in groundwater level prediction (Guzman et al. 2019; Wei et al. 2019; Mohapatra et al. 2021; Adnan et al. 2023; Zhou et al. 2023). Unlike numerical models, machine learning models need not consider the complex physical relationships between variables (Jing et al. 2023). Moreover, machine learning models are advantageous in terms of, for example, modeling time and prediction accuracy (Chen et al. 2020; Natarajan & Sudheer 2020; Müller et al. 2021). Artificial neural network (ANN), random forest (RF), and extreme gradient boosting (XGB) have been successfully applied to groundwater level prediction under various conditions around the world (Majumdar et al. 2020; Yadav et al. 2020; Osman et al. 2021). Sun et al. (2016) used an ANN model to predict the groundwater level of a swamp forest in Singapore and achieved higher accuracy for a one-day-ahead prediction than for a seven-day-ahead prediction. X. H. Wang et al. (2018) used an improved RF model to predict the groundwater level of the water source of the Dagu River in Qingdao and achieved an R² of 0.9581. Chen et al. (2020) compared the groundwater level prediction accuracy of numerical simulation and machine learning models and found that the machine learning models achieved higher prediction performance than the numerical simulation model and were simpler and faster to apply. El Bilali et al. (2021) compared the prediction performance of support vector regression (SVR), k-nearest neighbors (KNN), RF, and ANN for the groundwater level of seven groundwater monitoring wells in a semi-arid area (Province of Khemisset, Morocco); all models except KNN showed good prediction performance. So far, machine learning models have rarely been applied to groundwater resources under heavy precipitation and EWR.
This study integrated the water level fluctuation method, correlation analysis, and machine learning method to establish a groundwater response prediction model considering actual conditions of heavy precipitation and EWR. The objectives of this study were to (i) study the changes in the groundwater level and groundwater storage in Yongding River Basin (Beijing Section) under extreme conditions, (ii) discuss the reasons for the differences in groundwater recharge in different regions, and (iii) predict the groundwater level and groundwater storage for the next day under the condition of heavy precipitation. This study has a practical reference value for understanding groundwater recharge and change under heavy precipitation and EWR and has guiding significance for the advanced management and risk prevention of groundwater resources.
MATERIALS AND METHODOLOGY
Description of the study area
On January 1, 2021, the Beijing Municipal Government launched the Yongding River spring EWR plan, which transfers water from the Guanting Reservoir in the northwest of Beijing to the Yongding River. From January 1 to August 27, 2021, the average flow of EWR was approximately 4.687 m³/s, the replenishment lasted 239 days, and the total replenishment volume was approximately 9.68 × 10⁷ m³.
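The reported total follows from the average flow and the duration. The short check below is only an illustrative sketch: the function name is hypothetical, and it assumes the average flow applies uniformly over all 239 days.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def replenishment_volume(flow_m3_per_s: float, days: int) -> float:
    """Total volume (m^3) delivered at a constant average flow over whole days."""
    return flow_m3_per_s * SECONDS_PER_DAY * days


# Values stated in the text: 4.687 m^3/s over 239 days (January 1 to August 27, 2021).
volume = replenishment_volume(4.687, 239)
print(f"{volume:.3e} m^3")
```

The result agrees with the stated total of approximately 9.68 × 10⁷ m³ to within rounding.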
Data collection
The data collected in this study include daily water level data from 35 groundwater monitoring wells (Nos G1–G35) in the plain area of the Yongding River Basin (Beijing Section) from June 1 to August 27, 2021, monitoring well surface elevation, specific yield data, precipitation data over Beijing, and EWR data. The groundwater level data and monitoring well ground elevation data were obtained from the China National Groundwater Monitoring Project (http://jcgc.cigem.cn/wechat/, visited in August 2021). Aquifer-specific yield data were extracted from the literature (Yu et al. 2017; Hu et al. 2019). The precipitation and EWR flow data were obtained from the Beijing Water Affairs Bureau (http://nsbd.swj.beijing.gov.cn:8088/uacp/pageview/bjsw/main). The distances between the monitoring wells and the Yongding River fault zone and the Babaoshan fault zone were measured using the ranging function of ArcGIS (version 10.2).
Methods
Numerical methods
Water level fluctuation method
According to Equation (2), the change value (Q) of groundwater storage in the plain area of the Yongding River Basin (Beijing Section) from June 1 to August 27, 2021, was calculated. The number of grids in the study area was 59,675, and the size of a single grid was 200 m × 200 m. The specific yield distribution is shown in Supplementary Material, Figure S2. In this study, the ‘Spatial Analyst’ tool in ArcGIS (version 10.2) was used to implement ordinary Kriging interpolation.
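As a rough illustration of the grid-based calculation, the sketch below assumes the common water level fluctuation form, in which the storage change of each cell is the product of specific yield, water level change, and cell area, summed over all cells. Equation (2) is not reproduced in this section, so this form, the function name, and the toy values are assumptions rather than the authors' implementation.

```python
CELL_AREA_M2 = 200.0 * 200.0  # grid cell size stated in the text (200 m x 200 m)


def storage_change(specific_yield, level_change_m, cell_area_m2=CELL_AREA_M2):
    """Sum specific_yield * level_change * area over grid cells (result in m^3).

    Assumed form of the water level fluctuation calculation; both inputs are
    per-cell sequences, e.g. values interpolated onto the grid by Kriging.
    """
    if len(specific_yield) != len(level_change_m):
        raise ValueError("inputs must have one value per grid cell")
    return sum(sy * dh * cell_area_m2
               for sy, dh in zip(specific_yield, level_change_m))


# Toy example with three cells (hypothetical values, not data from the study).
toy_yield = [0.10, 0.15, 0.20]
toy_level_change = [1.0, 2.0, -0.5]  # metres; negative values mean a decline
toy_q = storage_change(toy_yield, toy_level_change)
```

In the study the same sum would run over all 59,675 cells, with a positive total indicating a net gain in storage.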
Spearman correlation analysis
The replenishment of groundwater resources may differ between regions because of differences in natural and anthropogenic factors. Natural factors mainly include the ground elevation (X1), distance from the monitoring well to the Yongding River fault zone (X2), distance from the monitoring well to the Babaoshan fault zone (X3), precipitation (X4), specific yield (X5), and permeability coefficient (X6). Anthropogenic factors are mainly groundwater exploitation and utilization (Xu et al. 2022). In this study, the data of groundwater exploitation were excluded, and only the influence of natural factors on the replenishment of groundwater resources was explored. Since the above variables are not normally distributed, the Pearson correlation method is not applicable (Hazra & Gogtay 2016). The Spearman correlation method makes no assumption about the distribution of the variables. Therefore, Spearman correlation analysis was conducted between the natural factors and both the groundwater level change (Y1) and the groundwater storage change (Y2).
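To make the rank-based nature of the method concrete, the following is a minimal sketch of a Spearman coefficient computed as the Pearson correlation of average ranks. The function names and the toy numbers are hypothetical, and the software the authors actually used for this analysis is not stated.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sx * sy)


# Toy data: distance to a fault zone (km) vs. water level rise (m), hypothetical.
distance = [1, 2, 3, 4, 5]
level_rise = [5.0, 3.9, 4.0, 1.2, 0.3]
rho = spearman_rho(distance, level_rise)
```

Because only the ordering of the values matters, any monotonic relationship yields a coefficient of ±1 regardless of how the variables are distributed.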
Machine learning methods
Construction of the prediction model
When predicting the water level, a large prediction step leads to a decrease in prediction accuracy (Sun et al. 2016). Therefore, the prediction step in this study was set to 1; that is, the groundwater level for the next day was predicted. The water level data, rainfall data, and EWR flow data for 88 days (from June 1 to August 27, 2021) were used to establish the prediction models of the 35 groundwater monitoring wells and to predict the groundwater level for the next day in the study area (i.e., the output variable of the model) under heavy rainfall and EWR conditions. The ‘train_test_split’ function of the scikit-learn library in Python (version No. 3.8.3) was used to divide the dataset into a training set with 70% of the data and a test set with 30% of the data.
First, the input variables of the machine learning model were determined. The correlation between the input variable and the output variable is very important for prediction accuracy (Huang et al. 2017; Amaranto et al. 2019; Wu et al. 2021). The input variables of the model were set by referring to the results of the Spearman correlation analysis; natural factors having a strong correlation with the groundwater level were selected as input variables. As the historical groundwater level significantly affects the prediction accuracy of the next-day groundwater level (Alizamir et al. 2018), the groundwater level data of day t and day t − 1 were also selected as input variables.
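The input and output layout described above can be sketched as follows. The helper name is hypothetical, and pairing the precipitation of day t with the water levels of days t − 1 and t is an assumption, since the text does not state which day's precipitation enters each sample.

```python
def build_samples(rain, gwl):
    """Build rows [rain(t), GWL(t-1), GWL(t)] with target GWL(t+1).

    rain and gwl are equal-length daily series for one monitoring well.
    """
    if len(rain) != len(gwl):
        raise ValueError("rain and gwl must cover the same days")
    features, targets = [], []
    # t needs both a previous day (t - 1) and a following day (t + 1).
    for t in range(1, len(gwl) - 1):
        features.append([rain[t], gwl[t - 1], gwl[t]])
        targets.append(gwl[t + 1])
    return features, targets


# Toy five-day series (hypothetical values).
toy_rain = [0, 5, 12, 0, 3]
toy_gwl = [20.0, 20.1, 20.4, 20.9, 21.0]
toy_X, toy_y = build_samples(toy_rain, toy_gwl)
```

Under this layout an 88-day record yields 86 samples per well, because the first and last days cannot serve as day t.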
Subsequently, a machine learning model was established, and the accuracy of its prediction was assessed. Numerous studies have shown that the RF and XGB models based on decision trees have better performance in predicting groundwater levels than ANN, SVM, ANFIS, and other models (Miro et al. 2021; Osman et al. 2021; Ruidas et al. 2021; Pham et al. 2022; Rao et al. 2022). Therefore, this study selected RF and XGB as the groundwater level prediction models. After selecting the input variables, the prediction model between the input variable and the output variable was established through the RF algorithm and XGB algorithm. In this study, the RF model and the XGB model were established for each monitoring well (35 in total), and the model with the highest performance was selected as the groundwater level prediction model for each monitoring well after comparison. In order to assess the predictive ability of the model clearly and comprehensively, the 35 monitoring wells were grouped under four categories according to the similarity of data changes through systematic cluster analysis. One representative monitoring well was selected from each category to evaluate the model prediction accuracy in detail, and the model was then extended to all monitoring wells. The machine learning models used in this study were all built with Python (version No. 3.8.3).
Additionally, the model predicted value was used to draw the change chart of groundwater storage on August 27. The water level fluctuation method was used to convert the groundwater level data predicted by the model into the change in groundwater storage. A graph of changes in the groundwater storage of the study area on August 27 relative to January 1 was mapped. The charts of the predicted and actual groundwater storage were then compared to further evaluate the model prediction accuracy.
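The comparison between predicted and actual storage change can be summarized as a relative bias. The sketch below uses the figures reported in this paper (an actual increase of 4.46 × 10⁸ m³ and a deviation of 0.6 × 10⁷ m³); the function name is hypothetical, and treating the deviation as an over-prediction is an assumption made only so that the example has a concrete predicted value.

```python
def prediction_bias(actual_change_m3: float, predicted_change_m3: float) -> float:
    """Absolute deviation of the predicted storage change, relative to the actual one."""
    return abs(predicted_change_m3 - actual_change_m3) / abs(actual_change_m3)


actual = 4.46e8      # m^3, increase in storage reported in the text
deviation = 0.6e7    # m^3, reported deviation (sign assumed positive here)
bias = prediction_bias(actual, actual + deviation)
print(f"{bias:.1%}")
```

The ratio reproduces the reported prediction bias of about 1.3%.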
Random forest
RF is a machine learning algorithm based on decision trees, which can be applied to classification and regression problems (Breiman 2001). RF comprises multiple decision trees, and no relationship exists between each decision tree in the forest. The final output of the model is jointly determined by each decision tree in the forest. When dealing with classification problems, each decision tree in the forest gives the final category, and the category of the test sample is determined by voting; when dealing with regression problems, the mean output of each decision tree is used as the final result (Breiman 2001). The performance of the RF model is closely related to parameters such as ntree (quantity of decision trees), mtry (number of variables selected on each decision tree node), and max depth (maximum number of divisions of decision tree nodes) (Biau & Scornet 2016; Brédy et al. 2020; Rahman et al. 2020). In this study, the ntree values of the RF model were set as 10, 20, 50, 100, and 200, mtry values were set as 1, 2, and 3, and max depth values were set as 3, 5, 7, and 9. The optimal parameter value was selected by the RF model algorithm after automatic calculation. The modeling steps are described in Section 3.3.1.
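One plausible way to carry out such an automatic parameter selection is an exhaustive grid search with cross-validation. The sketch below uses scikit-learn for this purpose; the mapping of ntree, mtry, and max depth to `n_estimators`, `max_features`, and `max_depth`, the three-fold cross-validation, and the function name are all assumptions, because the text does not describe the selection procedure in detail.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Candidate values listed in the text, mapped to scikit-learn names (assumed mapping).
PARAM_GRID = {
    "n_estimators": [10, 20, 50, 100, 200],  # ntree
    "max_features": [1, 2, 3],               # mtry
    "max_depth": [3, 5, 7, 9],               # max depth
}


def tune_random_forest(X, y, param_grid=PARAM_GRID, seed=0):
    """Grid-search an RF regressor on a 70/30 split; return model, params, test R^2."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    search = GridSearchCV(
        RandomForestRegressor(random_state=seed),
        param_grid,
        cv=3,            # assumed number of folds
        scoring="r2",
    )
    search.fit(X_train, y_train)
    # score() evaluates the refitted best model with the chosen scoring (R^2).
    return search.best_estimator_, search.best_params_, search.score(X_test, y_test)
```

Calling `tune_random_forest(X, y)` with three input columns per sample evaluates all 60 parameter combinations; with fewer than three input variables, the `max_features` candidates would need to be reduced accordingly.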
Extreme gradient boosting
XGB is one of the best-performing machine learning algorithms (for both classification and regression problems), with higher computation speed than other algorithms (Osman et al. 2021). As a powerful prediction tool, XGB is a representative boosting algorithm among ensemble methods. XGB builds multiple decision trees and combines their results to obtain better regression performance than a single-tree model (Fan et al. 2021). Unlike RF, the XGB model adds new trees sequentially on the basis of the existing trees and gradually integrates many trees to form a model with strong predictive performance. Moreover, the XGB model can prevent overfitting through regularization, and the sum of the weighted contributions of all decision tree predictions is provided as the final output (Zanotti et al. 2019; Park & Kim 2021). In this study, the ntree values of the XGB model were set as 1,000 and 2,000, and the remaining parameters (e.g., max depth) were automatically selected by XGB. The modeling steps are described in Section 3.3.1.
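The additive idea behind boosting can be illustrated with a deliberately simplified sketch: each new one-split tree (a stump) is fitted to the residuals of the current ensemble, and its scaled prediction is added to the running total. This is plain gradient boosting for squared error on a single input variable, not the XGBoost library, and it omits XGB's regularization terms; all names are hypothetical.

```python
def fit_stump(x, residuals):
    """Least-squares regression stump (one split) on a single feature."""
    candidates = sorted(set(x))[:-1]  # drop the maximum so the right side is never empty
    if not candidates:
        raise ValueError("need at least two distinct feature values")
    best = None
    for threshold in candidates:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially to the residuals of the current ensemble."""
    base = sum(y) / len(y)
    prediction = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        prediction = [
            p + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, p in zip(x, prediction)
        ]
    return base, learning_rate, stumps


def predict(model, xi):
    """Sum the base value and the scaled contributions of all stumps."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (left if xi <= threshold else right)
        for threshold, left, right in stumps
    )
```

In RF, by contrast, the trees are grown independently and their outputs are averaged; here each tree depends on the errors left by all the trees before it.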
Assessment of model performance
RESULTS AND DISCUSSION
Analysis of groundwater resource changes
Analysis of the groundwater level change
Before frequent heavy precipitation
With reference to January 1, the groundwater level across most of the study area had declined by June 30 (Figure 3(a)), except in the central part of the study area, where it rebounded slightly. The groundwater level increased by 0.04–1.86 m, with an average of 0.84 m, in Haidian District, Xicheng District, Chaoyang District, Fengtai District, and Daxing District, representing 11 monitoring wells. The area of groundwater level rise covered approximately 419 km², accounting for 17.5% of the total area of the study area; the water level declined by 0.01–4.49 m, with an average of 1.20 m, in Changping District, Haidian District, Mentougou District, Shijingshan District, Chaoyang District, Fengtai District, and Daxing District, representing 24 monitoring wells. The area of the groundwater level decline covered approximately 1,971 km², accounting for 82.5% of the total area of the study area.
From January 1 to June 30 (low precipitation in six months, about 140 mm in total), the flow of EWR was approximately 4.687 m³/s, and the amount of water replenishment was approximately 7.33 × 10⁷ m³. However, only 17.5% of the study area showed a slight increase in the groundwater level (Figure 3(a)), which indicates that the continuous small-flow water replenishment method has a relatively insignificant contribution to the recovery of the groundwater level. The small recovery of the groundwater level in the central part of the study area may be related to the small amount of precipitation (about 76.9 mm) in Beijing during June 1–30. Groundwater mining for agricultural irrigation in spring may be the reason for the decline of groundwater levels in the study area (Xu et al. 2022). The plain area of the Yongding River Basin (Beijing Section) includes a 14 km² area of winter wheat plantation, which is often irrigated from May to June (L. M. Wang et al. 2018).
Frequent heavy precipitation period
With reference to January 1, the groundwater level showed relative increases in most parts of the study area on July 31 (Figure 3(b)). The water level increased by 0.04–4.33 m (average of 1.29 m) in Changping District, Haidian District, Mentougou District, Shijingshan District, Xicheng District, Chaoyang District, Fengtai District, Fangshan District, and Daxing District, representing 25 monitoring wells. The area of groundwater level rise covered 2,362 km², accounting for 98.8% of the total area of the study area (2,390 km²); the water level declined (mainly concentrated in Daxing District) by 0.02–2.11 m (average of 0.75 m) in Haidian District, Chaoyang District, and Daxing District, representing ten monitoring wells. The area of groundwater level decline covered approximately 28 km², accounting for 1.2% of the total area of the study area. A comparison of the groundwater level changes on June 30 and July 31 shows that the groundwater level rose sharply during the frequent heavy precipitation period and that groundwater recharge was prominent.
After frequent heavy precipitation
With reference to January 1, the overall groundwater level of the study area increased by 0.09–6.19 m (average of 1.92 m) on August 27 (Figure 3(c)). The area of the groundwater level rise covered approximately 2,390 km², accounting for 100% of the total area (2,390 km²). On average, the groundwater level increased by 0.45 m from June 30 to July 31 and by 0.63 m from July 31 to August 27. The comparison shows that the groundwater level recovered less during the frequent heavy precipitation period (July) than after it (August); that is, the response of the groundwater level to the frequent heavy precipitation lags behind the precipitation itself (a hysteresis effect) (Qi et al. 2018). The increase in the groundwater level of the study area was the largest in the northwest and gradually decreased from northwest to southeast (Figures 3(b) and 3(c)). The existence of the Babaoshan fault zone in the northwest (Figure 1) and the difference in regional aquifer permeability may be the reasons for this phenomenon. The precipitation recharges the groundwater in the plain area through the Babaoshan fault zone (Wang et al. 2010), and monitoring wells (recharge points) farther from the fault zone receive less replenishment. From the northwest to the southeast, the aquifer sediments in the study area change from coarse to fine, and the permeability decreases. Therefore, the groundwater recharged by precipitation is more prominent in the northwest than in the southeast. In addition, the karst groundwater in the mountainous area preferentially recharges the piedmont groundwater in the northwest, which is also an important reason for the difference in the increase in the groundwater level of the study area.
Analysis of groundwater resource storage
Changes in groundwater storage were more distinct in Changping District, Haidian District, Shijingshan District, Mentougou District, Fengtai District, and Xicheng District than in other districts (Figure 4). These six districts cover an area of approximately 804 km², accounting for only 33.6% of the total study area (2,390 km²), but the increase in groundwater resources amounted to 2.92 × 10⁸ m³, accounting for 65.5% of the total increase (4.46 × 10⁸ m³). These six districts, which constitute the ‘western suburban groundwater reservoir’ in the plain area of Beijing (Li et al. 2010), are located in the upper part of the Yongding River alluvial fan, close to the Babaoshan fault zone, and have good, water-rich aquifers (consisting of single and double layers of sand and pebbles). Under the action of frequent heavy precipitation, they are more likely to store a large amount of groundwater resources.
The increases in groundwater storage in Dongcheng District, Chaoyang District, Fangshan District, Daxing District, and Tongzhou District amounted to 1.54 × 10⁸ m³, accounting for 34.5% of the total increase (4.46 × 10⁸ m³), and the increases were mainly concentrated in Daxing District (7.7 × 10⁷ m³) and Chaoyang District (5.4 × 10⁷ m³). Daxing District has a relatively small area of impervious roads, which is conducive to groundwater recharge by precipitation and surface runoff infiltration. Chaoyang District has a high level of urbanization, and impervious infrastructure such as buildings and roads reduces the recharge efficiency of precipitation and surface runoff (Han et al. 2017; Chen et al. 2018; L. L. Zhang et al. 2020). However, districts close to the water-rich area on the west side (Haidian District, Xicheng District, and Dongcheng District) received lateral recharge from the groundwater on the west side.
Factors influencing the groundwater resource recharge
The distance between the recharge point and the Yongding River fault zone (X2) showed no distinct correlation with the groundwater level change (Y1) or the groundwater storage change (Y2). Before the heavy precipitation in 2021 (January to June), the flow of EWR was 4.687 m³/s. The EWR mode with continuous small flow did not significantly promote the recovery of the groundwater level on either side of the Yongding River fault zone (Figure 3(a)). During the heavy precipitation period from June to August 2021, the groundwater level in the study area rebounded greatly, but the groundwater recharge was not significantly related to the distance between the recharge point and the Yongding River fault zone (Figures 3(b) and 3(c)). Under heavy precipitation, precipitation recharges the groundwater in the plain area through the Babaoshan fault zone, while the Yongding River fault zone may have mainly acted as a flood discharge channel. Therefore, the phenomenon of precipitation recharging groundwater in the plain area through the Yongding River fault zone was not distinct.
The distance between the recharge point and the Babaoshan fault zone (X3) was significantly negatively correlated with the groundwater level change (Y1) and groundwater storage change (Y2). The Babaoshan fault zone is distributed in the northwest of the study area, and its strike is from northeast to southwest (Figure 1). During heavy precipitation, the fault zone can hold precipitation and surface runoff, recharging the groundwater on both sides of the fault zone (Wang et al. 2010). Moreover, aquifers on both sides of the fault zone are rich in water, and groundwater replenishment is prominent. With the increasing distance between the recharge point and the Babaoshan fault zone, groundwater recharge becomes relatively insignificant.
Precipitation (X4) was significantly positively correlated with the groundwater level change (Y1). The heavy precipitation from June to August 2021 was the key factor in the groundwater level rise. Precipitation recharges the Quaternary groundwater in the plain not only through runoff infiltration but also through the Babaoshan fault zone in the northwest of the region (Gudmundsson 2000; Apaydin 2010). However, no significant correlation was found between precipitation (X4) and groundwater storage change (Y2). As shown in Equation (2), the change in the water level and the specific yield jointly determine the change in groundwater storage. Because both the rise in the groundwater level and the specific yield of the aquifer vary across the study area (Supplementary Material, Figure S2 and Figure 3), the change in groundwater storage in an area with high precipitation may be smaller or larger than that in an area with low precipitation. Consequently, the calculated change in groundwater storage did not show any significant correlation with precipitation.
Specific yield (X5) showed a significant positive correlation with the groundwater storage change (Y2). Areas with large aquifer-specific yields showed relatively large changes in groundwater storage (Supplementary Material, Figure S2 and Figure 4). A large specific yield indicates that the aquifer particles are coarse and uniform in size, which is beneficial to the storage of groundwater resources (Healy & Cook 2002). No significant correlation was found between specific yield (X5) and the groundwater level change (Y1). Precipitation is the main source of groundwater recharge in the study area. In this study, the amount of precipitation corresponding to a large specific yield is uncertain, resulting in the lack of a significant correlation between specific yield and groundwater level changes.
The aquifer permeability coefficient (X6) showed a significant positive correlation with the groundwater level change (Y1) and groundwater storage change (Y2). In the study area, aquifer sediments gradually become finer from northwest to southeast, implying lower permeability. Accordingly, the infiltration of precipitation and surface runoff decreased gradually. During precipitation, aquifers with higher permeability will generate preferential flow, further promoting recharge (Hu et al. 2017).
In summary, heavy precipitation from June to August served as the main source of groundwater recharge in the study area during the study period. Ground elevation, aquifer-specific yield, and permeability coefficient showed a significant positive correlation with the recharge of groundwater resources. Heavy precipitation recharges the groundwater in the plain area through the Babaoshan fault zone. The distance from the recharge point to the Babaoshan fault zone is significantly negatively correlated with the groundwater recharge.
Analysis of groundwater resource prediction under heavy precipitation
In this study, the RF and XGB models were established for each monitoring well (35 in total) under the heavy precipitation from June to August, and the model with the best performance was selected as the groundwater level prediction model for each monitoring well. Referring to the results of the correlation analysis (Figure 5), precipitation, which was strongly correlated with the groundwater level, was selected as one of the input variables. EWR was excluded because its contribution to the recovery of the groundwater level was relatively insignificant (Section 3.1.1). Therefore, this study assumes that precipitation was the most important factor causing changes in groundwater levels and that the impact of the surface water system (Yongding River) on groundwater was negligible. The input variables of each model were selected from combinations of precipitation, the groundwater level on day t − 1, and the groundwater level on day t. The results of the systematic cluster analysis are shown in Supplementary Material, Table S2: the 35 monitoring wells were grouped into four categories (I, II, III, and IV) according to the changes in water levels, and G9, G16, G20, and G32, respectively, were selected as the representative monitoring wells.
Evaluation of the groundwater level model
Well | Model | Input variables | Ratio of test set to training set | R² | RMSE |
---|---|---|---|---|---|
G9 | XGB | Rain, GWL (t − 1), GWL (t) | 3:7 | 0.993 | 0.037 |
G16 | XGB | Rain, GWL (t − 1), GWL (t) | 3:7 | 0.851 | 0.117 |
G20 | RF | Rain, GWL (t) | 3:7 | 0.724 | 0.105 |
G32 | XGB | Rain, GWL (t) | 3:7 | 0.977 | 0.028 |
Note: R² and RMSE values represent the model's performance on the overall dataset.
The predicted groundwater levels on day t + 1 for the 35 monitoring wells are shown in Supplementary Material, Table S3. Overall, the R² of 30 models was greater than 0.9, and the R² of 33 models was greater than 0.8. The machine learning models (XGB and RF) developed in this study therefore achieved high groundwater level prediction accuracy under heavy precipitation conditions. Among the water level prediction models of the 35 monitoring wells, XGB performed best in 30 wells (G1–G3, G5–G9, G13–G19, G21–G35), and RF performed best in five wells (G4, G10–G12, G20); that is, XGB outperformed RF in most wells. The R² values of the groundwater level models for G3, G16, G17, G20, and G27 remained below 0.9, at 0.873, 0.851, 0.759, 0.724, and 0.818, respectively. The groundwater level data of these five monitoring wells contain a few outliers, including abrupt cliff-type jumps. The low R² of these models is due to these outliers, which may be attributable to the failure of water level monitoring instruments (Gribovszki et al. 2013).
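The two evaluation metrics and the per-well choice between candidate models can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes the model with the highest R² is the one selected (the study says only that the best-performing model was chosen), and the function names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted levels."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r2(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def select_best(
    observed: Sequence[float], predictions: Dict[str, Sequence[float]]
) -> Tuple[str, float, float]:
    """Return (model name, R2, RMSE) of the candidate with the highest R2.

    Assumption: R2 is the selection criterion; the study does not specify it.
    """
    scores = {
        name: (r2(observed, pred), rmse(observed, pred))
        for name, pred in predictions.items()
    }
    best = max(scores, key=lambda name: scores[name][0])
    return best, scores[best][0], scores[best][1]
```

Here `predictions` would map a label such as "XGB" or "RF" to that model's predicted groundwater levels for one well, so calling `select_best` once per well gives the kind of per-well assignment reported above.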
Prediction of groundwater storage
CONCLUSIONS
In this study, a coupled model integrating the water level fluctuation method, correlation analysis, and machine learning was proposed to analyze the changing trend of groundwater under the influence of heavy precipitation and EWR. The results of the groundwater level analysis showed that EWR with a continuous small flow (4.687 m³/s) did not significantly contribute to the recovery of the groundwater level in the study area, whereas heavy precipitation did. The calculation results of the water level fluctuation method showed that the groundwater storage increased by 4.46 × 10⁸ m³ after heavy precipitation. Correlation analysis found that ground elevation, aquifer specific yield, and the permeability coefficient showed a significant positive correlation with groundwater resource replenishment. After heavy precipitation, the distance from the recharge point to the Babaoshan fault zone was significantly negatively correlated with the recharge of groundwater resources, indicating that the fault zone may serve as a migration channel for precipitation. The performance assessment of the machine learning models showed that XGB had better predictive performance than RF in most cases. The groundwater resource prediction model developed based on XGB and RF had a prediction error of 0.6 × 10⁷ m³ (prediction bias of 1.3%), indicating relatively good model performance.
The more groundwater monitoring wells that are included in the Kriging interpolation, the more faithfully the interpolated results reflect the actual regional groundwater changes. Therefore, we recommend that the local government increase the number of groundwater monitoring wells in areas with large changes in groundwater level and storage, so that groundwater changes under extreme conditions (e.g., heavy precipitation and EWR) can be captured more accurately. In addition, we recommend using groundwater level data covering a longer time range to establish the groundwater prediction model based on XGB and RF, so that the changing trend in groundwater under extreme conditions can be obtained in advance and management decisions can be made in time. This study can help to better understand the changing trend in groundwater resources under the influence of heavy precipitation and ecological replenishment, and it provides guidance for the forward-looking management and risk prevention of groundwater resources.
ACKNOWLEDGEMENTS
This work was supported by the Beijing Municipal Science and Technology Commission Project (Z191100006919001), the Fundamental Research Business Special Project of the Central Public Welfare Scientific Research Institutes (JY-2013YQ06072101), and the National Key R&D Plan of the In Situ Real Time Online Monitoring Technology and Equipment for Typical Organic Pollutants in Groundwater (SQ2022YFC3700182). The authors are grateful for the valuable comments and suggestions given by the editors and the anonymous reviewers.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.