The early warning and prediction of saltwater intrusion are crucial for the protection and management of estuarine and marine ecosystems and for water supply safety. To provide accurate and stable salinity predictions, this study proposes an integrated deep learning method based on long short-term memory (LSTM) networks, gated recurrent units (GRUs), and convolutional neural networks (CNNs). Taking the Modaomen Waterway as the research area, an hourly saltwater intrusion prediction model is constructed with prediction periods of 6, 12, and 24 h. The model is trained and validated on upstream flow data, downstream tide data, and antecedent salinity data from three monitoring stations during 2020–2022. Results show that the proposed model provides satisfactory results at all stations and for all prediction periods. Comparison among the four models demonstrates that the integrated model performs better in saltwater intrusion prediction, achieving peak Nash–Sutcliffe efficiency improvements of 65.4% and error reductions of up to 54.9%. As the prediction period extends, the accuracy of the predictions decreases. By enhancing the precision and reliability of salinity forecasts, this research aids in the development of effective mitigation strategies to counteract the adverse effects of saltwater intrusion.

  • Proposes an integrated deep learning model for salinity prediction.

  • Demonstrates high accuracy and stability with a limited dataset.

  • Adapts to various stations, aiding in water resource management.

  • Provides reliable predictions to support water management decisions.

In the past decade, the phenomenon of saltwater intrusion during the dry season in the Pearl River Delta has become increasingly severe (Hu et al. 2024a). Saltwater intrusion is the process by which saline water from the sea encroaches into freshwater estuaries and rivers, primarily due to the reduction in freshwater inflows (Hoitink & Jay 2016). This escalating issue has led to significant implications for industrial layout, domestic water consumption, and agricultural irrigation in coastal areas (Tran et al. 2024). The intrusion of saltwater into freshwater systems not only disrupts the availability and quality of water resources but also poses a serious challenge to the sustainable development of regional economies (Hu et al. 2024b). Consequently, saltwater intrusion disasters have emerged as a critical factor restricting the development of water resources and impeding economic sustainability in affected regions. Effective simulation and accurate prediction of saltwater intrusion processes are key technical issues for saltwater intrusion prevention and water safety assurance in coastal estuarine areas (Tang et al. 2020; Tong et al. 2024).

Saltwater intrusion prediction models can be broadly classified into two main categories (Weng et al. 2024): numerical hydrodynamic models (Ji et al. 2007; Veerapaga et al. 2019; Binh et al. 2020) and data-driven models (Rohmer & Brisset 2017; Lu et al. 2021; Yin et al. 2022). Each type offers distinct advantages and faces its own challenges, making it suitable for different applications and scenarios. Numerical hydrodynamic models have long been used to predict saltwater intrusion because they can simulate the complex physical processes governing estuarine dynamics. Since the 1960s, one-dimensional models (Krvavica et al. 2017; Martínez-Aranda et al. 2020) have been proposed for saltwater intrusion modeling. With the continually rising demand for model accuracy, one-dimensional models have gradually been replaced by two-dimensional models (Abarca & Clement 2009; Yan et al. 2024) and three-dimensional models (Banks et al. 2024; Kihm et al. 2024). These models, which are based on the principles of fluid mechanics and hydrodynamics, such as the Navier–Stokes equations, can provide detailed spatial and temporal distributions of salinity. Their primary advantage lies in their robustness and accuracy in representing physical phenomena, given sufficient input data and calibration. They allow for the incorporation of various factors such as tidal movements, river discharges, and meteorological conditions. However, these models require detailed input data, such as riverbed topography and meteorological data, which may not always be available. Furthermore, one-dimensional hydraulic models struggle to represent the three-dimensional nature of salt intrusion processes, while two- and three-dimensional models are too computationally demanding to run on operational timescales (Wullems et al. 2023). Additionally, model calibration and validation are time-consuming and require significant expertise, which can limit practical application in regions with limited data availability and technical capacity (Wang & Ge 2025).

Because of these limitations, data-driven models (Yin et al. 2022, 2024; Deleersnyder et al. 2024) have gained popularity in recent years as an alternative approach. They rely primarily on machine learning techniques such as tree-based methods and kernel algorithms, while deep learning architectures remain comparatively underutilized apart from isolated applications of long short-term memory (LSTM) networks. Tran et al. (2022) tested the performance of five algorithms (simple linear, K-nearest neighbors, random forest, support vector machine, and LSTM) for predicting saltwater intrusion in the Vietnamese Mekong Delta. Nguyen et al. (2021) established a novel framework for monitoring salinity intrusion using remote sensing and machine learning. Leveraging advancements in machine learning and data science, these models rely on historical data to identify patterns and make predictions about future salinity levels (Zhou et al. 2020; Lal & Datta 2021). In general, the historical data used in these models include not only salinity but also salinity-related variables (Tian et al. 2024), such as river flow, tide, wind, and temperature. The primary advantage of data-driven models is their ability to process large datasets and provide rapid predictions without the need for detailed physical input parameters. Once trained, data-driven models have been reported to capture non-linear systems successfully, with a runtime of milliseconds to seconds per time step (Hauswirth et al. 2021). They are often more flexible and can be updated easily with new data, making them suitable for real-time monitoring and forecasting (Weng et al. 2024). However, the accuracy of these models depends heavily on the quality and quantity of available data, and they may not perform well in scenarios with limited historical records. Furthermore, previous studies (He et al. 2019; Sun et al. 2020) generally focused on daily predictions of saltwater intrusion, which do not provide the detailed information necessary for effective water resource management. Some researchers (Xu et al. 2024; Li et al. 2025) have used data-driven models to predict the maximum salinity of the next day, but their data are at a daily scale and cover a short time span, so both higher-frequency dynamics and longer-period trends may be missed. Others have used LSTM, gated recurrent units (GRUs), convolutional neural networks (CNNs), and other models to study saltwater intrusion at the Modaomen Estuary and found that runoff had the greatest impact (Tian et al. 2024). However, saltwater intrusion is affected by the semidiurnal cycle of the tide, and daily data may smooth out the tide-driven fluctuations within a day, which makes such data unsuitable for hour-level early warning. The performance of different deep learning methods also varies significantly. Therefore, it is crucial to develop and use algorithms that deliver stable and reliable predictions. Stability is essential not only for accurate predictions but also for real-time decision-making and risk management in dynamic and complex environments.

To address these shortcomings, this study aims to contribute a robust and efficient tool for predicting saltwater intrusion that can support water safety management and decision-making in coastal areas. The objectives of this paper are threefold: (1) to develop an hourly saltwater intrusion prediction model using a deep learning method and a limited dataset; (2) to propose an innovative deep learning algorithm that provides stable predictions; and (3) to analyze the model's performance over different forecast periods. The remainder of this paper is organized as follows: Section 2 provides an overview of the study area and details the methodology for constructing the deep learning model. Section 3 presents the results of the model's application. Section 4 offers a discussion of the findings. Finally, Section 5 concludes the paper.

Study area and available data

The Pearl River network is primarily composed of the Xijiang, Dongjiang, and Beijiang Rivers, which flow into the South China Sea through eight outlets: the Yamen, Hutiaomen, Jitimen, Modaomen, Hengmen, Hongqimen, Jiaomen, and Humen waterways. Of these, the Modaomen waterway (Figure 1) discharges the largest volume, accounting for about 28.3% of the total water from the Pearl River. This waterway is crucial for supplying fresh water to the cities of Jiangmen, Zhongshan, Zhuhai, and Macao. In recent years, however, the region has been plagued by frequent and severe saltwater intrusion, particularly during dry seasons, due to global climate change and extensive human activities. Consequently, this study focuses on the Modaomen waterway, developing a saltwater intrusion prediction model based on an integrated deep learning model to aid in water resource management and emergency decision-making.
Figure 1

Distribution map of monitoring stations within the study area.

Figure 2

Process of model integration.

As a crucial source of fresh water for the surrounding cities, the Modaomen waterway has many water intakes distributed along both banks. Because of their decisive role in freshwater supply, salinity is monitored at each water intake. In this research, salinity data from the Guangchang (GC), Pinggang (PG), and Zhuzhoutou (ZZT) stations were collected. Previous literature indicates that saltwater intrusion is influenced by the coupling effects of external sea tides, upstream flows, and estuary morphology, which serve as indicators for saltwater intrusion prediction. The flows at the upstream Shijiao station (SJ) and Gaoyao station are highly correlated with the flow in the Modaomen waterway, making them suitable indicators for upstream flow. Additionally, sea tide data from Denglongshan station were collected. All data span October 2020 to June 2022, with a time step of 1 h. In this study, 70% of the dataset was used for training and the remaining 30% for validation. The locations of all stations are depicted in Figure 1.

Deep learning model construction

The main framework of the deep learning model is designed to predict the salinity at water intakes several hours into the future by utilizing relevant factors. The input to the model includes upstream river flow data, downstream tide data, and salinity monitoring data from three stations in the Modaomen Waterway collected between 2020 and 2022. The output is the predicted salinity data for the three water intakes during the prediction period. Following this framework, several deep learning models are used, and an integrated deep learning model is proposed, as shown in Figure 2.
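The mapping from past observations to future salinity described above can be illustrated with a sliding-window sketch. The function name `make_windows`, the row layout `[flow, tide, salinity]`, and the `lookback` length are assumptions made for illustration; the study does not state how many past hours are fed to the model.

```python
def make_windows(series, lookback, horizon):
    """Pair each block of `lookback` past rows with the next `horizon` targets.

    series: time-ordered rows, each [flow, tide, salinity] (hypothetical layout).
    Returns (inputs, targets): inputs[k] holds `lookback` consecutive rows and
    targets[k] holds the salinity values of the following `horizon` hours.
    """
    inputs, targets = [], []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        past = series[start:start + lookback]
        future = series[start + lookback:start + lookback + horizon]
        inputs.append(past)
        targets.append([row[-1] for row in future])  # salinity is the last column
    return inputs, targets


# Toy hourly records with made-up numbers: [upstream flow, tide level, salinity].
records = [[1000 + t, t % 3, 10 * t] for t in range(10)]
inputs, targets = make_windows(records, lookback=3, horizon=2)
```

With hourly data, `horizon` would be set to 6, 12, or 24 to match the three prediction periods.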

LSTM model

LSTM networks (Gers et al. 2000) are a special kind of recurrent neural network (RNN), which are widely used in water-related prediction (Deng et al. 2022; Yin et al. 2023). They use gate units to control the logic of data updates or discards, overcoming the drawbacks of RNNs such as excessive influence of weights and the tendency for gradient vanishing or explosion. LSTM networks can converge better and faster, effectively improving prediction accuracy. LSTM has three gates: forget gate, input gate, and output gate, determining the information to be remembered or forgotten at each time step. The input gate decides how much new information is added to the cell, the forget gate controls whether the information at each time step will be forgotten, and the output gate decides whether there is any information output at each time step. The saltwater intrusion prediction model can fully utilize its memory units to capture and learn long-term dependencies in sequence data. In this study, the LSTM model is defined as part of a Sequential model. The input layer is an LSTM layer with 100 neurons and uses the ReLU activation function. A fully connected layer is added after the LSTM layer, with the number of output features the same as the number of input features.
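The roles of the three gates can be made concrete with a single time step of a one-unit LSTM cell. This is a minimal sketch of the standard LSTM update with a tanh candidate and output, whereas the layer used in this study applies the ReLU activation; the function `lstm_step` and all weight values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell (standard tanh formulation).

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(name):
        wx, wh, b = w[name]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state that is kept
    i = sigmoid(pre("input"))         # share of the candidate that is written
    g = math.tanh(pre("candidate"))   # candidate new content
    o = sigmoid(pre("output"))        # share of the cell state that is exposed
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state (output)
    return h, c


# Illustrative weights only: all zeros, so every gate is half open.
zero = {name: (0.0, 0.0, 0.0) for name in ("forget", "input", "candidate", "output")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero)

# A strongly positive forget bias and a strongly negative input bias
# make the cell carry its previous state forward almost unchanged.
keep = dict(zero, forget=(0.0, 0.0, 50.0), input=(0.0, 0.0, -50.0))
h_keep, c_keep = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=keep)
```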

CNN model

CNNs are a type of feedforward neural network mainly used for processing images and sequence data (Afrin et al. 2024; Lin & Wang 2024). They consist of convolutional layers and subsampling layers, which significantly reduces the number of parameters needed to train the network. They exploit local correlation to reduce the data volume while retaining useful information, which helps capture significant peaks. In this study, the CNN model is constructed in the same way as the LSTM model, as part of a sequential model. The input layer is a one-dimensional convolutional layer (Conv1D) with 64 filters and a kernel size of 2, using the ReLU activation function. After the Conv1D layer, a pooling layer (MaxPooling1D) is added to reduce the dimensionality of the features, and the features are then flattened into a one-dimensional vector. Finally, a fully connected layer is added.
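The effect of a size-2 convolution followed by max pooling on a short sequence can be sketched in plain Python. A single filter is shown instead of 64, and the pool size of 2 is an assumption, since the study does not report it; the helper names and the example kernel are illustrative.

```python
def conv1d_relu(seq, kernel, bias=0.0):
    """Slide `kernel` over `seq` without padding and apply ReLU to each sum."""
    k = len(kernel)
    out = []
    for t in range(len(seq) - k + 1):
        s = sum(kernel[j] * seq[t + j] for j in range(k)) + bias
        out.append(max(0.0, s))
    return out


def max_pool1d(seq, pool_size=2):
    """Keep the maximum of each non-overlapping block; a short tail is dropped."""
    n_blocks = len(seq) // pool_size
    return [max(seq[p * pool_size:(p + 1) * pool_size]) for p in range(n_blocks)]


# A kernel of [-1, 1] responds to hour-to-hour rises, so peaks in the
# rising limb survive the ReLU and the pooling step.
salinity = [1, 3, 2, 5, 4]
features = conv1d_relu(salinity, kernel=[-1, 1])
pooled = max_pool1d(features)
```

A sequence of length T therefore yields T - 1 convolution outputs and, after pooling, half as many values per filter, which are then flattened and passed to the fully connected layer.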

GRU model

GRU is another type of RNN (Zhang et al. 2021; Yuan & Chen 2022), similar to LSTM, used for processing time series data. A GRU uses an update gate and a reset gate and has no separate output gate or cell state, so it has fewer parameters than an LSTM; this makes it faster to compute and can make it less prone to overfitting while still handling long-term dependencies. In this study, the GRU model is constructed in the same way as the LSTM model, as part of a sequential model. The input layer is a GRU layer with 100 neurons and uses the ReLU activation function. A fully connected layer is added after the GRU layer.

Model integration

The goal of model integration is to enhance prediction performance by integrating the outputs of multiple independently trained models. This study uses the concatenate function in Keras to achieve model integration, as it joins model outputs along a specified axis and so retains the information extracted by each model. First, an input layer is created to receive data, and its output is fed to the LSTM, GRU, and CNN models, respectively. Then, the outputs of these three models are concatenated using the concatenate layer, integrating the features extracted by each model. After concatenation, a dense layer with a number of features equal to the output time steps is added, producing the final stacked output. This means that the prediction at each time step is a vector whose size is determined by the number of features, and this vector is repeated across the time series for the number of output time steps. The dense layer uses the ReLU activation function. The Model class in the Keras library is used to construct the integrated model. The initially created input layer serves as the input of the integrated model, and the stacked output obtained from the above steps serves as its output. The hyperparameters of the LSTM, GRU, and CNN models remain unchanged. Finally, the integrated model is compiled using a weighted loss function to achieve the best prediction results.
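The data flow of the integration step, concatenation of the three branch outputs followed by a dense layer with ReLU, can be traced with a small plain-Python sketch. This is not the Keras code used in the study; the function names, branch outputs, weights, and biases below are made-up values chosen only to show the arithmetic.

```python
def dense_relu(vec, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    for row in weights:
        if len(row) != len(vec):
            raise ValueError("each weight row must match the input length")
    return [max(0.0, sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def integrate(lstm_out, gru_out, cnn_out, weights, biases):
    """Concatenate the branch outputs along the feature axis, then apply the dense layer."""
    merged = list(lstm_out) + list(gru_out) + list(cnn_out)
    return dense_relu(merged, weights, biases)


# Hypothetical branch outputs for one sample.
lstm_out = [1.0, 2.0]
gru_out = [3.0]
cnn_out = [-4.0, 0.5]

# Two output units, so two weight rows of length 2 + 1 + 2 = 5.
weights = [[1, 0, 0, 0, 0],
           [0, 0, 1, 1, 0]]
biases = [0.5, 0.0]
prediction = integrate(lstm_out, gru_out, cnn_out, weights, biases)
```

Because the dense layer sees all branch features side by side, its weights decide how much each branch contributes to each output time step.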

LSTM–GRU–CNN model optimization and evaluation

This study trains the LSTM–GRU–CNN integrated model on the training set to capture time series relationships and patterns in the data. Loss is an important indicator during training: it measures the model's prediction error, with a smaller loss indicating a better fit to the data. Training loss measures the gap between the model's predictions and the actual targets on the training data. Testing loss is the same measure on an independent test dataset and is used to assess the model's generalization capability. Generally, the closer both losses are to 0, the better the model performs on both training and test data, indicating high accuracy and reliability. For the integrated model, the Adam optimization algorithm is used to minimize the loss function. Mean squared logarithmic error (MSLE) serves as the loss function for both training and validation. It is calculated as follows:
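$$\mathrm{MSLE} = \frac{1}{n}\sum_{i=1}^{n}\left[\ln\left(1+y_i\right)-\ln\left(1+\hat{y}_i\right)\right]^2 \tag{1}$$

where $n$ is the training sample size; $y_i$ represents the actual values; and $\hat{y}_i$ represents the predicted values.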
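As a quick numerical check, MSLE can be computed directly from its usual definition, the mean of the squared differences between log(1 + actual) and log(1 + predicted). The function name `msle` is illustrative, and the sketch assumes non-negative inputs, which holds for salinity.

```python
import math


def msle(actual, predicted):
    """Mean squared logarithmic error for equally long, non-negative sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        total += (math.log1p(y) - math.log1p(y_hat)) ** 2
    return total / len(actual)


perfect = msle([0.0, 9.0], [0.0, 9.0])
one_log_unit = msle([math.e - 1.0], [0.0])  # log terms differ by exactly one unit
```

Because the error is taken on a logarithmic scale, a given absolute miss counts for less at high salinity than at low salinity.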

Besides the MSLE, various metrics (the Nash–Sutcliffe efficiency (NSE) coefficient, root mean square error, accuracy, precision, and recall) are calculated to evaluate the model's performance during the validation process.

NSE coefficient
The NSE coefficient is a widely used measure for evaluating the performance of hydrological models. As shown in Equation (2), NSE ranges over (−∞, 1], with values closer to 1 indicating better model prediction performance.

$$\mathrm{NSE} = 1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2} \tag{2}$$

where $y_i$ is the observed value at time $i$; $\hat{y}_i$ is the model predicted value at time $i$; $\bar{y}$ is the average of all observed values; and $n$ represents the total number of observations.
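The behavior of NSE at its reference points can be checked with a short sketch based on the standard definition: one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean. The function name `nse` is illustrative.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe efficiency of `predicted` against `observed`."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0.0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / variance


perfect = nse([1, 2, 3], [1, 2, 3])     # exact match
mean_only = nse([1, 2, 3], [2, 2, 2])   # always predicting the observed mean
reversed_ = nse([1, 2, 3], [3, 2, 1])   # worse than predicting the mean
```

A value of 0 therefore means the model is no better than the mean of the observations, and negative values mean it is worse.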

Root mean square error
The root mean square error (RMSE) is the square root of the average of the squared differences between predicted values and actual values. RMSE ranges over [0, +∞), with values closer to 0 indicating better model prediction performance, meaning that the difference between predicted values and actual observations is smaller. Compared to mean squared error, RMSE is expressed in the same units as the data and is therefore more representative of the average deviation from the true data.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2} \tag{3}$$

where $n$ is the number of training samples; $y_i$ is the measured true value; and $\hat{y}_i$ is the predicted value.
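A minimal sketch of the RMSE calculation, taken as the square root of the mean squared difference between observed and predicted values, is given below. The function name `rmse` is illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


constant_miss = rmse([0, 0, 0, 0], [2, 2, 2, 2])  # every prediction is off by 2
exact = rmse([1.0, 5.0], [1.0, 5.0])
```

When every prediction is off by the same amount, RMSE equals that amount, which is why it reads directly as a typical error in mg/L.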

Accuracy
Accuracy refers to the ratio of correctly predicted samples to the total number of predicted samples. It does not consider whether the predicted samples are positive or negative but reflects the overall performance of the model algorithm. Accuracy ranges over [0, 1], with values closer to 1 indicating better model prediction performance.

$$\mathrm{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN} \tag{4}$$

where TP is the number of correctly predicted positive samples; TN is the number of correctly predicted negative samples; FP is the number of false positives; and FN is the number of false negatives.

Precision
Precision refers to the ratio of correctly predicted positive samples to the total number of samples predicted as positive. In other words, it measures how many of the predicted positive samples are truly positive. Precision focuses solely on positive samples. For instance, with a salinity threshold of 250, samples with salinity greater than 250 are considered positive. Precision ranges over [0, 1], with values closer to 1 indicating better model prediction performance.

$$\mathrm{Precision} = \frac{TP}{TP+FP} \tag{5}$$

Recall
Recall refers to the ratio of correctly predicted positive samples to the total number of actual positive samples. In other words, it measures how many of the actual positive samples are correctly identified by the model. Recall ranges over [0, 1], with values closer to 1 indicating better model performance in identifying positive samples.

$$\mathrm{Recall} = \frac{TP}{TP+FN} \tag{6}$$
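The three event-based metrics can be derived from paired observed and predicted salinity values by applying the threshold of 250 to both series and counting the four outcomes. The sketch below assumes "greater than 250" marks a positive sample, as stated above; the function name `event_metrics` and the sample values are made up.

```python
def event_metrics(observed, predicted, threshold=250):
    """Return (accuracy, precision, recall) for threshold-exceedance events."""
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        actual_event = obs > threshold
        predicted_event = pred > threshold
        if actual_event and predicted_event:
            tp += 1
        elif actual_event:
            fn += 1          # event occurred but was not predicted
        elif predicted_event:
            fp += 1          # event predicted but did not occur
        else:
            tn += 1
    total = tp + tn + fp + fn
    nan = float("nan")       # returned when a ratio has an empty denominator
    accuracy = (tp + tn) / total if total else nan
    precision = tp / (tp + fp) if tp + fp else nan
    recall = tp / (tp + fn) if tp + fn else nan
    return accuracy, precision, recall


observed = [100, 300, 400, 200, 260]
predicted = [120, 280, 240, 270, 300]
accuracy, precision, recall = event_metrics(observed, predicted)
```

In this toy series there are two hits, one missed event, one false alarm, and one correct non-event.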

Based on the evaluation results, the hyperparameters of the model are tuned. This includes adjusting the model structure, learning rate, batch size, and other hyperparameters. The hyperparameter ranges are chosen so that accuracy stays above 0.90, ensuring that the model performs well across various conditions. The tuning ranges are shown in Table 1.

Table 1

Hyperparameter tuning

| Hyperparameter type | Range |
| --- | --- |
| Epochs | [30, 100] |
| Batch_size | [50, 200] |
| CNN filters | [33, 200] |
| LSTM neurons | (25, 256) |
| GRU neurons | (25, 256) |
| L2 regularization parameter | (0.001, 2.000) |

In the current research, models with different prediction periods are established and trained. The prediction period refers to the lead time of the model output, here 6, 12, and 24 h. The following results focus on the prediction performance of the model for these three prediction periods.

Training and validation results

The original dataset was divided into training and validation sets, comprising 60 and 40% of the data, respectively. Loss results were calculated for both sets in each epoch. The evolution of loss results serves as a fundamental indicator of the model's performance and reliability. Models with different prediction periods resulted in varying training and validation losses. Figure 3 illustrates the training and validation losses for prediction periods of 6, 12, and 24 h during the training and validation processes. The integrated model's loss was observed to gradually decrease and stabilize throughout the process. The convergence of loss across all prediction periods indicates the model's ability to adapt to the training data and perform effectively in prediction tasks.
Figure 3

Training and validation losses of the integrated model at (a) GC_6 h, (b) PG_6 h, (c) ZZT_6 h, (d) GC_12 h, (e) PG_12 h, (f) ZZT_12 h, (g) GC_24 h, (h) PG_24 h, and (i) ZZT_24 h.

For further analysis of model accuracy, RMSE and NSE are listed in Table 2. As shown in Table 2, RMSE and NSE indicate that the model performs well with the 6-h prediction period, while the prediction performance decreases for the 12- and 24-h periods.

Table 2

Integrated model prediction results at 6, 12, and 24-h periods

| Station | RMSE 6 h (mg/L) | RMSE 12 h (mg/L) | RMSE 24 h (mg/L) | NSE 6 h | NSE 12 h | NSE 24 h |
| --- | --- | --- | --- | --- | --- | --- |
| GC | 605.14 | 639.88 | 625.33 | 0.91 | 0.90 | 0.90 |
| PG | 108.63 | 117.08 | 141.20 | 0.91 | 0.91 | 0.86 |
| ZZT | 51.56 | 59.91 | 73.49 | 0.94 | 0.92 | 0.88 |

Comparison of integrated model results with different prediction periods

To illustrate the model's performance in predicting different salinity levels, prediction results at three stations are depicted in Figure 4. This figure compares the observed and predicted salinity values at the GC, PG, and ZZT stations. The closer the predicted values are to the observed values, the better the prediction accuracy. The x-axis represents the model's prediction results for relatively low, median, and high values in the dataset, while the y-axis represents the salinity values. The prediction accuracy for the 6-h prediction period is superior to that of the 12- and 24-h prediction periods, with accuracy decreasing as the prediction period increases. The prediction accuracy across different sites is similar.
Figure 4

Comparison of observed and predicted values at different pumping stations under various prediction periods: (a) GC_6 h, (b) PG_6 h, (c) ZZT_6 h, (d) GC_12 h, (e) PG_12 h, (f) ZZT_12 h, (g) GC_24 h, (h) PG_24 h, and (i) ZZT_24 h.

Prediction of saltwater intrusion events

Predicting salinity values is the primary task of the current model. However, predicting the occurrence of saltwater intrusion events is equally important for risk prevention. A saltwater intrusion event is defined as the salinity at the estuary exceeding 250. Using the observed and predicted values, the number of instances where salinity at the three stations exceeded 250 was counted, and the predicted number of saltwater intrusion events was compared to the observed number. Based on the prediction results, precision, accuracy, and recall are calculated and shown in Table 3. The closer these metrics are to 1, the better the prediction performance. As shown in Table 3, the model's scores for saltwater intrusion events are relatively high, especially for the 6-h prediction period, with accuracy values of 0.96, 0.97, and 0.98 at GC, PG, and ZZT, respectively. Accuracy declines only slightly as the prediction period increases.

Table 3

Comparison of observed and predicted frequency of salt intrusion at the three stations

| Station | Precision 6 h | Precision 12 h | Precision 24 h | Accuracy 6 h | Accuracy 12 h | Accuracy 24 h | Recall 6 h | Recall 12 h | Recall 24 h |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GC | 0.92 | 0.93 | 0.90 | 0.96 | 0.93 | 0.91 | 0.93 | 0.94 | 0.92 |
| PG | 0.89 | 0.86 | 0.85 | 0.97 | 0.96 | 0.95 | 0.93 | 0.92 | 0.86 |
| ZZT | 0.94 | 0.90 | 0.91 | 0.98 | 0.98 | 0.98 | 0.95 | 0.94 | 0.94 |

Performance of the LSTM, GRU, and CNN models

To illustrate the performance improvement, results from the LSTM, GRU, and CNN models are presented in Table 4. Comparison with Tables 2 and 3 shows that the integrated model performs better in salinity prediction across almost all metrics and periods than the individual LSTM, GRU, and CNN models.

Table 4

Prediction performance of different models

| Metric | Period | LSTM GC | LSTM PG | LSTM ZZT | GRU GC | GRU PG | GRU ZZT | CNN GC | CNN PG | CNN ZZT |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Precision | 6 h | 0.73 | 0.8 | 0.78 | 0.58 | 0.85 | 0.87 | 0.79 | 0.92 | 0.82 |
| Precision | 12 h | 0.62 | 0.75 | 0.8 | 0.54 | 0.77 | 0.81 | 0.91 | 0.86 | 0.9 |
| Precision | 24 h | 0.61 | 0.78 | 0.96 | 0.63 | 0.68 | 0.82 | 0.86 | 0.87 | 0.9 |
| Accuracy | 6 h | 0.81 | 0.96 | 0.96 | 0.74 | 0.9 | 0.98 | 0.86 | 0.96 | 0.97 |
| Accuracy | 12 h | 0.68 | 0.94 | 0.97 | 0.7 | 0.54 | 0.98 | 0.92 | 0.96 | 0.98 |
| Accuracy | 24 h | 0.67 | 0.92 | 0.97 | 0.83 | 0.83 | 0.97 | 0.91 | 0.96 | 0.98 |
| NSE | 6 h | 0.76 | 0.87 | 0.71 | 0.79 | 0.88 | 0.77 | 0.84 | 0.9 | 0.79 |
| NSE | 12 h | 0.74 | 0.83 | 0.67 | 0.81 | 0.85 | 0.71 | 0.89 | 0.91 | 0.92 |
| NSE | 24 h | 0.76 | 0.52 | 0.58 | 0.86 | 0.81 | 0.73 | 0.87 | 0.9 | 0.84 |
| RMSE (mg/L) | 6 h | 738.44 | 128.63 | 114.36 | 922.42 | 117.14 | 102.72 | 740.88 | 111.63 | 96.67 |
| RMSE (mg/L) | 12 h | 797.63 | 151.86 | 122.63 | 752.86 | 129.16 | 122.3 | 675.1 | 99.36 | 72.2 |
| RMSE (mg/L) | 24 h | 753.61 | 254.54 | 153.18 | 692.02 | 152.07 | 113.55 | 693.33 | 114.09 | 87.16 |

Recall (some cells are missing in the source, so the values are listed in their original order without column assignment): 6 h: 0.96, 0.75, 0.94, 0.88, 0.97, 0.9, 0.85; 12 h: 0.96, 0.77, 0.92, 0.78, 0.94, 0.9, 0.87; 24 h: 0.72, 0.65, 0.99, 0.91, 0.83, 0.97, 0.89, 0.81.

To emphasize the capacity of the proposed model, the improvement ratios are depicted in a heatmap (Figure 5), which quantitatively compares the proposed model with the LSTM, GRU, and CNN baselines across five evaluation metrics: precision, accuracy, recall, NSE, and RMSE. The color intensity corresponds to the percentage change in each metric relative to each baseline model, enabling systematic visual analysis of performance differences. The integrated model outperformed the individual models at all three pumping stations on every metric except recall. The inherent tradeoff between precision and recall, often influenced by classification thresholds, makes it challenging for models to excel in both metrics simultaneously. At GC, compared to LSTM, the integrated model increased NSE by 18, 21, and 21% for 6-, 12-, and 24-h forecasts, respectively, while reducing RMSE by 18.1, 19.8, and 17.0%. Against GRU, it achieved a 34.4% error reduction in 6-h forecasts, though 24-h NSE improvements were smaller. For CNN, NSE gains remained below 9%. At PG, the integrated model enhanced the 24-h NSE of LSTM by 65.4% (from a baseline of 0.52) with a 44.5% error reduction, outperformed GRU in 12-h forecasts (7.1% NSE increase, 9.3% error decrease), and consistently reduced errors compared to CNN. At ZZT, it improved the 12-h NSE of GRU by 35.6%, reduced the RMSE of LSTM by 54.9% (6-h) and 52.0% (24-h), and maintained over 46% error reduction against CNN despite smaller NSE gains, demonstrating its superior precision and stability across diverse operational scenarios.
Figure 5

Heatmap for comparison between integrated model and baseline models.
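The improvement ratios can be reproduced from the tabulated values if they are read as relative changes with respect to the baseline model. That reading is an assumption, but it recovers the PG 24-h figures quoted in the text from the values in Tables 2 and 4; the function names are illustrative.

```python
def pct_gain(baseline, integrated):
    """Relative increase over the baseline, in percent (for NSE-type scores)."""
    return 100.0 * (integrated - baseline) / baseline


def pct_reduction(baseline, integrated):
    """Relative decrease from the baseline, in percent (for RMSE-type errors)."""
    return 100.0 * (baseline - integrated) / baseline


# PG station, 24-h period: LSTM baseline (Table 4) vs integrated model (Table 2).
nse_gain_pg = round(pct_gain(0.52, 0.86), 1)
rmse_cut_pg = round(pct_reduction(254.54, 141.20), 1)

# ZZT station, 6-h period: LSTM RMSE (Table 4) vs integrated model (Table 2).
rmse_cut_zzt = round(pct_reduction(114.36, 51.56), 1)
```

Since the tables report rounded values, ratios recomputed this way can differ by a few percentage points from figures derived from unrounded results.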

By integrating the temporal memory of LSTM networks, the gating mechanisms of GRU, and the local feature extraction capabilities of CNN, the integrated model achieves prediction advantages suited to the operational characteristics of different pumping stations. The largest gains were observed relative to the standalone LSTM, with an average NSE improvement of 25.2%, coupled with short-to-medium-term error reductions exceeding 18% at GC station and 65% at PG station. For GRU, improvements were concentrated at short-to-medium horizons, exemplified by a 51% error reduction in 12-h forecasts at ZZT station. Regarding CNN, the integrated model exhibited limited NSE improvements (<20%) at most stations except PG, primarily due to CNN's inherently higher baseline accuracy. Comparative analysis consistently shows that the gains of the integrated model over LSTM and GRU (performance gain range: 15–65%) are larger than those over CNN. While station-specific data variations influenced the magnitude of improvement across metrics, the results support the integrated model's ability to enhance prediction accuracy through complementary architectures, and thus its practical utility in operational prediction tasks.

The integrated model performs particularly well in medium-to-long-term forecasting (12–24 h). Specifically, it achieves a 65% higher NSE than LSTM for the 24-h forecast at PG station and a 29% improvement over GRU for the 12-h forecast at ZZT station, showing significant advantages compared to standalone models. However, the improvement was limited (generally <20%) for the short-term (6-h) forecast. The baseline models differ markedly across sites: LSTM leaves the most room for improvement, GRU is the least stable at the medium time scale, and CNN is relatively stable. Overall, the integrated model achieves the highest or near-highest scores in precision, accuracy, and NSE across the three sites and periods, highlighting its robustness and effectiveness in predicting salinity levels compared to the individual LSTM, GRU, and CNN models.

To further demonstrate the stability of the models in predicting salinity, the performance variations at the three stations over the different prediction periods are shown as a boxplot in Figure 6. In terms of precision, accuracy, and NSE, the proposed model shows significant advantages, with the smallest range between the maximum and minimum values, the highest minimum value, and the highest median. For recall, the proposed model performs slightly worse and ranks second, behind the CNN model. Overall, the boxplot indicates that the proposed model consistently delivers stable and satisfactory prediction results across all stations and prediction periods.
Figure 6

Boxplot for model performances in salinity predictions.


Taking the Modaomen Waterway as the research area, this study proposes an integrated deep-learning model for salinity prediction built on the LSTM, CNN, and GRU algorithms. Upstream flow data, downstream tide data, and salinity data from the GC, PG, and ZZT stations during 2020–2022 are used as inputs for model training and validation. Model performance is evaluated with RMSE, NSE, precision, accuracy, and recall. The results characterize the performance of the LSTM–GRU–CNN integrated model for different prediction periods, and the integrated model is compared with the standalone LSTM, GRU, and CNN models. However, several issues still need to be discussed.
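For readers who wish to reproduce the evaluation indicators, the following is a minimal sketch of RMSE and NSE in their standard forms, together with precision, recall, and accuracy computed on threshold exceedance. The exceedance framing, the 250 mg/L threshold, and all sample values are assumptions made for illustration; the text above does not specify how the classification indicators were derived.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error, in the units of the data (e.g. mg/L)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))


def exceedance_scores(obs, sim, threshold):
    """Precision, recall, and accuracy for 'salinity exceeds threshold' events."""
    o = np.asarray(obs, float) > threshold
    s = np.asarray(sim, float) > threshold
    tp = int(np.sum(o & s))    # exceedance observed and predicted
    fp = int(np.sum(~o & s))   # predicted but not observed
    fn = int(np.sum(o & ~s))   # observed but not predicted
    tn = int(np.sum(~o & ~s))  # neither observed nor predicted
    return {
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
        "recall": tp / (tp + fn) if tp + fn else float("nan"),
        "accuracy": (tp + tn) / o.size,
    }


# Hypothetical hourly salinity values in mg/L.
obs = [100.0, 300.0, 500.0, 200.0]
sim = [120.0, 280.0, 450.0, 260.0]
print(rmse(obs, sim), nse(obs, sim))
print(exceedance_scores(obs, sim, threshold=250.0))
```

RMSE and NSE score the magnitude of the predicted salinity, whereas the three classification indicators only score whether an exceedance is detected, so a model can rank differently on the two groups of indicators.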

Salinity prediction with deep learning model

In saltwater intrusion prediction with deep learning models, many studies incorporate a wide range of input factors to enhance prediction accuracy. Tian et al. (2024) employed LSTM and CNN to predict the severity of saltwater intrusion with runoff, maximum tidal range, and wind as input variables. Weng et al. (2024) chose various environmental components, including antecedent chlorinity, upstream discharge, tidal level, and wind vector, to drive a clustering method for salinity prediction. More input factors give the deep learning model more features to learn from, but they also introduce noise. More inputs therefore do not necessarily yield a better model. Our research demonstrates that by selecting a more focused subset of input variables – specifically upstream runoff, downstream tide, and antecedent salinity – we can still achieve robust prediction results. Our results are also consistent with Tian et al. (2024), who found that the largest contributor to saltwater intrusion was runoff (40%), followed by maximum tidal range, wind speed, and wind direction, contributing 25, 20, and 15%, respectively. In other words, limited data comprising upstream runoff, downstream tide, and antecedent salinity can also produce acceptable prediction results.
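The three selected inputs can be arranged into supervised samples with a sliding window, as sketched below. The 24-h lookback length, the function name `make_windows`, and the synthetic series are assumptions for illustration; the study's actual preprocessing is not described in this section.

```python
import numpy as np


def make_windows(flow, tide, salinity, lookback, horizon):
    """Build inputs of shape (samples, lookback, 3) and salinity targets.

    Each sample holds `lookback` consecutive hourly rows of
    (flow, tide, salinity); its target is the salinity `horizon`
    hours after the last row of that window.
    """
    features = np.column_stack([flow, tide, salinity]).astype(float)
    n = len(features) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this lookback and horizon")
    X = np.stack([features[i:i + lookback] for i in range(n)])
    first_target = lookback + horizon - 1
    y = features[first_target:first_target + n, 2]
    return X, y


# Synthetic 48-h series, for illustration only.
hours = 48
flow = np.linspace(2000.0, 3000.0, hours)            # upstream flow
tide = np.sin(np.arange(hours) * 2 * np.pi / 12.42)  # downstream tide
salinity = np.arange(hours, dtype=float)             # antecedent salinity
X, y = make_windows(flow, tide, salinity, lookback=24, horizon=6)
print(X.shape, y.shape)
```

Changing `horizon` to 12 or 24 yields the targets for the longer prediction periods from the same input series.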

Besides the proper selection of input variables, the proposed integrated deep learning model effectively captures the essential dynamics of saltwater intrusion, showing that it performs well despite the reduced input complexity. In the current research, the precision of the 24-h-ahead predictions ranges from 0.85 to 0.91, whereas Weng et al. (2024) report 0.75 for their proposed model. Weng et al. (2024) also tested Extreme Gradient Boosting and time-series K-means. Combined with the results of LSTM, GRU, and CNN in the current research, this suggests that a single or traditional algorithm struggles to produce satisfactory salinity predictions. The proposed model in this research not only simplifies the data acquisition process but also reduces computational costs and enhances the model's efficiency. The success of the proposed model with limited data inputs underscores the potential for deep learning techniques to extract meaningful patterns and relationships from smaller datasets. By leveraging the inherent feature extraction capabilities of deep learning, the proposed model can identify and learn from the critical interactions among the selected variables. This finding is particularly significant for regions where extensive environmental monitoring data are scarce or challenging to obtain.

Tradeoff between prediction period and reliability

The prediction period, or forecast horizon, is a critical factor in the performance of the predictive model. Our integrated deep-learning model demonstrates that as the prediction period extends, the accuracy of the predictions diminishes. This is consistent with the inherent challenge of long-term forecasting, where uncertainties accumulate over time and the model's ability to capture and extrapolate underlying patterns weakens. The results illustrate a clear tradeoff between the length of the prediction period and the accuracy of the predictions. For shorter prediction periods (e.g., up to 6 h), the model maintains high accuracy, with RMSE of 51.56–604.14 mg/L, NSE of 0.91–0.94, and accuracy of 0.96–0.98. As the prediction period extends to 24 h, the model's performance declines, with RMSE of 73.49–625.33 mg/L, NSE of 0.86–0.90, and accuracy of 0.91–0.98. Although models with prediction periods exceeding 24 h are not included in the current study, previous research (Tian et al. 2024) shows that the RMSE doubles and accuracy declines sharply when the prediction period increases from 24 to 48 h.

In practical applications, the reliability of predictions is paramount. For decision-making purposes, especially in managing water resources and mitigating the impacts of saltwater intrusion, stakeholders must weigh the tradeoffs between prediction period and accuracy. Short-term predictions, while more accurate, may offer limited foresight, whereas longer-term predictions, despite lower accuracy, provide a broader temporal outlook. To address this, it is crucial to implement a robust validation framework that continuously assesses model performance and updates predictions based on the latest available data. Additionally, integrating ensemble forecasting techniques, where multiple models with varying strengths are combined, can enhance the overall reliability of predictions.
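One simple form of the ensemble forecasting mentioned above is a weighted average in which each model's weight is inversely proportional to its validation RMSE. The sketch below illustrates that generic technique only; it is not the integration scheme of the proposed LSTM–GRU–CNN model, and the function names and numbers are hypothetical.

```python
import numpy as np


def inverse_rmse_weights(val_obs, val_preds):
    """Weight each model by 1/RMSE on a validation set, normalized to sum to 1."""
    obs = np.asarray(val_obs, float)
    inv = {}
    for name, pred in val_preds.items():
        err = np.sqrt(np.mean((np.asarray(pred, float) - obs) ** 2))
        inv[name] = 1.0 / max(err, 1e-12)  # guard against a zero RMSE
    total = sum(inv.values())
    return {name: value / total for name, value in inv.items()}


def combine(preds, weights):
    """Weighted average of the individual model forecasts."""
    return sum(weights[name] * np.asarray(p, float) for name, p in preds.items())


# Hypothetical validation data in mg/L.
val_obs = np.array([100.0, 200.0, 300.0])
val_preds = {
    "lstm": val_obs + 30.0,  # RMSE 30
    "gru": val_obs - 10.0,   # RMSE 10
    "cnn": val_obs + 15.0,   # RMSE 15
}
weights = inverse_rmse_weights(val_obs, val_preds)
ensemble = combine(val_preds, weights)
print(weights)
print(ensemble)
```

Recomputing the weights whenever new observations arrive is one way to combine this with the continuous validation described above.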

Furthermore, additional research should focus on refining the model to improve its long-term prediction capabilities. This could involve: (1) Incorporating real-time data and leveraging advanced data assimilation techniques to continuously update the model. (2) Combining deep learning models with traditional physical models to better capture the underlying physical processes governing saltwater intrusion. (3) Developing methods to quantify and communicate the uncertainty in predictions, providing stakeholders with a clearer understanding of the confidence levels associated with different prediction periods.
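As an illustration of point (3), prediction intervals can be derived from the empirical quantiles of validation residuals, estimated separately for each prediction period. This is a minimal sketch of one common approach, not a method used in this study; the function name, the 90% coverage level, and the synthetic residuals are assumptions.

```python
import numpy as np


def residual_interval(val_obs, val_pred, new_pred, coverage=0.9):
    """Empirical prediction interval built from validation residuals.

    The residual quantiles are added to each new point forecast.
    The lower bound is clipped at zero because salinity cannot be negative.
    """
    residuals = np.asarray(val_obs, float) - np.asarray(val_pred, float)
    lo, hi = np.quantile(residuals, [(1 - coverage) / 2, (1 + coverage) / 2])
    new_pred = np.asarray(new_pred, float)
    return np.maximum(new_pred + lo, 0.0), new_pred + hi


# Synthetic validation data: larger errors at the longer horizon.
rng = np.random.default_rng(0)
val_pred = np.full(500, 400.0)
val_obs_6h = val_pred + rng.normal(0.0, 50.0, size=500)
val_obs_24h = val_pred + rng.normal(0.0, 80.0, size=500)

lo6, hi6 = residual_interval(val_obs_6h, val_pred, [300.0])
lo24, hi24 = residual_interval(val_obs_24h, val_pred, [300.0])
print("6-h interval:", lo6[0], hi6[0])
print("24-h interval:", lo24[0], hi24[0])
```

Reporting the interval width alongside each forecast gives stakeholders a direct view of how confidence decreases as the prediction period lengthens.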

This study proposes an ensemble of deep learning models – LSTM, CNN, and GRU – for predicting saltwater intrusion. The proposed model is used for salinity prediction at the GC, PG, and ZZT stations in the Modaomen waterway. The performance of the proposed model is analyzed and discussed, leading to the following main conclusions:

(1) By using upstream flow, downstream tide, and antecedent salinity as input data, an hourly saltwater intrusion prediction model is developed using deep learning methods. This model provides salinity predictions for the next 6, 12, and 24 h.

(2) An integrated deep-learning model based on LSTM, GRU, and CNN is proposed. Comparisons among the four models demonstrate that the integrated model performs better in saltwater intrusion prediction, particularly in terms of precision, accuracy, NSE, and stability.

(3) The integrated model performs best in the medium- and long-term forecasts (12–24 h). It achieves 65% higher accuracy than LSTM for the 24-h forecast at PG station and a 29% improvement over GRU for the 12-h forecast at ZZT station, a significant advantage over the standalone models. However, the improvement was limited (generally <20%) for the short-term (6-h) forecast. The standalone models also differ clearly across sites: LSTM leaves the most room for improvement, GRU is the least stable at the medium time scale, and CNN is relatively stable.

(4) The prediction period is a critical factor in saltwater intrusion prediction. There is a clear tradeoff between the length of the prediction period and the accuracy of the predictions. As the prediction period extends, the accuracy of the predictions decreases.

This research was funded by the National Natural Science Foundation of China (52109018, 12202150).

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abarca E. & Clement T. P. (2009) A novel approach for characterizing the mixing zone of a saltwater wedge, Geophysical Research Letters, 36, L06402. doi:10.1029/2008GL036995.

Banks E. W., Noorduijn S., Post V. E. A., Munday T., Sorensen C., Cahill K., Jolly P., Ellis J., Werner A. D. & Batelaan O. (2024) Island hydrogeology in the tropics: constraining a 3D variable-density groundwater flow and solute transport model with geophysics, Journal of Hydrology, 635, 131037. doi:10.1016/j.jhydrol.2024.131037.

Binh D. V., Kantoush S. A., Saber M., Mai N. P., Maskey S., Phong D. T. & Sumi T. (2020) Long-term alterations of flow regimes of the Mekong River and adaptation strategies for the Vietnamese Mekong Delta, Journal of Hydrology-Regional Studies, 32, 100742.

Deleersnyder W., Dudal D. & Hermans T. (2024) A multidimensional AI-trained correction to the 1D approximate model for Airborne TDEM sensing, Computers & Geosciences, 188, 105602. doi:10.1016/j.cageo.2024.105602.

Deng H. Q., Chen W. J. & Huang G. R. (2022) Deep insight into daily runoff forecasting based on a CNN-LSTM model, Natural Hazards, 113 (3), 1675–1696. doi:10.1007/s11069-022-05363-2.

Gers F. A., Schmidhuber J. & Cummins F. (2000) Learning to Forget: continual Prediction with LSTM, Neural Computation, 12 (10), 2451–2471. doi:10.1162/089976600300015015.

Hauswirth S. M., Bierkens M. F. P., Beijk V. & Wanders N. (2021) The potential of data driven approaches for quantifying hydrological extremes, Advances in Water Resources, 155, 104017. doi:10.1016/j.advwatres.2021.104017.

He Y. H., Chen S., Huang R. Z., Chen X. H. & Cong P. T. (2019) Impact of upstream runoff and tidal level on the chlorinity of an estuary in a river network: a case study of Modaomen estuary in the Pearl River Delta, China, Journal of Hydroinformatics, 21 (2), 359–370. doi:10.2166/hydro.2018.210.

Hoitink A. J. F. & Jay D. A. (2016) Tidal river dynamics: implications for deltas, Reviews of Geophysics, 54 (1), 240–272. doi:10.1002/2015RG000507.

Hu H. J., Chen G. D., Lin R., Huang X., Wei Z. D. & Chen G. H. (2024a) An observation study of the combined river discharge and sea level impact on the duration of saltwater intrusion in Pearl River estuary-Modaomen waterway, Natural Hazards, 120 (1), 409–428. doi:10.1007/s11069-023-06146-z.

Hu S. K., Deng Z. H., Liu B. J., Hu M. C., Xu B. Y. & Yu X. (2024b) Impact of tidal dynamics and typhoon-induced inundation on saltwater intrusion in coastal farms, Science of The Total Environment, 915, 170109.

Ji Z. G., Hu G. D., Shen J. A. & Wan Y. S. (2007) Three-dimensional modeling of hydrodynamic processes in the St. Lucie Estuary, Estuarine Coastal and Shelf Science, 73 (1–2), 188–200. doi:10.1016/j.ecss.2006.12.016.

Krvavica N., Kozar I., Travas V. & Ozanic N. (2017) Numerical modelling of two-layer shallow water flow in microtidal salt-wedge estuaries: finite volume solver and field validation, Journal of Hydrology and Hydromechanics, 65 (1), 49–59. doi:10.1515/johh-2016-0039.

Lin P. H. & Wang N. Y. (2024) A data-driven approach for regional-scale fine-resolution disaster impact prediction under tropical cyclones, Natural Hazards, 120 (8), 7461–7479. doi:10.1007/s11069-024-06527-y.

Lu P. Y., Lin K. R., Xu C. Y., Lan T., Liu Z. Y. & He Y. H. (2021) An integrated framework of input determination for ensemble forecasts of monthly estuarine saltwater intrusion, Journal of Hydrology, 598, 126225.

Martínez-Aranda S., Ramos-Pérez A. & García-Navarro P. (2020) A 1D shallow-flow model for two-layer flows based on FORCE scheme with wet-dry treatment, Journal of Hydroinformatics, 22 (5), 1015–1037. doi:10.2166/hydro.2020.002.

Nguyen T. G., Tran N. A., Vu P. L., Nguyen Q.-H., Nguyen H. D. & Bui Q.-T. (2021) Salinity intrusion prediction using remote sensing and machine learning in data-limited regions: a case study in Vietnam's Mekong Delta, Geoderma Regional, 27, e00424. doi:10.1016/j.geodrs.2021.e00424.

Rohmer J. & Brisset N. (2017) Short-term forecasting of saltwater occurrence at La Comte River (French Guiana) using a kernel-based support vector machine, Environmental Earth Sciences, 76 (6), 1–16. doi:10.1007/s12665-017-6553-5.

Sun Z. H., Fan J. W., Yan X. & Xie C. S. (2020) Analysis of critical river discharge for saltwater intrusion control in the upper South Branch of the Yangtze River Estuary, Journal of Geographical Sciences, 30 (5), 823–842. doi:10.1007/s11442-020-1757-0.

Tang G. P., Yang M. Z., Chen X. H., Jiang T., Chen T., Chen X. H. & Fang H. (2020) A new idea for predicting and managing seawater intrusion in coastal channels of the Pearl River, China, Journal of Hydrology, 590, 125454.

Tian Q., Gao H., Tian Y., Wang Q., Guo L. & Chai Q. (2024) Attribution analysis and forecast of salinity intrusion in the Modaomen estuary of the Pearl River Delta, Frontiers in Marine Science, 11, 1407690.

Tran T. T., Pham N. H., Pham Q. B., Pham T. L., Ngo X. Q., Nguyen D. L., Nguyen P. N. & Veettil B. K. (2022) Performances of different machine learning algorithms for predicting saltwater intrusion in the Vietnamese Mekong delta using limited input data: a study from Ham Luong River, Water Resources, 49 (3), 391–401. doi:10.1134/S0097807822030198.

Veerapaga N., Azhikodan G., Shintani T., Iwamoto N. & Yokoyama K. (2019) A three-dimensional environmental hydrodynamic model, fantom-refined: validation and application for saltwater intrusion in a meso-macrotidal estuary, Ocean Modelling, 141, 101425. doi:10.1016/j.ocemod.2019.101425.

Weng P., Tian Y., Zhou H., Zheng Y. & Jiang Y. (2024) Saltwater intrusion early warning in Pearl river Delta based on the temporal clustering method, Journal of Environmental Management, 349, 119443. doi:10.1016/j.jenvman.2023.119443.

Wullems B. J. M., Brauer C. C., Baart F. & Weerts A. H. (2023) Forecasting estuarine salt intrusion in the Rhine-Meuse delta using an LSTM model, Hydrology and Earth System Sciences, 27 (20), 3823–3850. doi:10.5194/hess-27-3823-2023.

Xu Y., Lin K., Hu C., Chen X., Zhang J., Xiao M. & Xu C. Y. (2024) Uncovering the dynamic drivers of floods through interpretable deep learning, Earth's Future, 12 (10), e2024EF004751.

Yan M., Solorzano-Rivas S. C., Werner A. D. & Lu C. H. (2024) Analytical estimation of sea-level rise impacts on the freshwater lenses of elliptical islands with sloping shorelines, Journal of Hydrology, 629, 13051.

Yin J. N., Tsai F. T. C. & Lu C. H. (2022) Bi-objective extraction-injection optimization modeling for saltwater intrusion control considering surrogate model uncertainty, Water Resources Management, 36 (15), 6017–6042. doi:10.1007/s11269-022-03340-9.

Yin H. C., Wu Q., Yin S. X., Dong S. N., Dai Z. X. & Soltanian M. R. (2023) Predicting mine water inrush accidents based on water level anomalies of borehole groups using long short-term memory and isolation forest, Journal of Hydrology, 616, 128813.

Yuan R. & Chen J. (2022) A hybrid deep learning method for landslide susceptibility analysis with the application of InSAR data, Natural Hazards, 114 (2), 1393–1426. doi:10.1007/s11069-022-05430-8.

Zhang Y. G., Tang J., He Z. Y., Tan J. K. & Li C. (2021) A novel displacement prediction method using gated recurrent unit model with time series analysis in the Erdaohe landslide, Natural Hazards, 105 (1), 783–813. doi:10.1007/s11069-020-04337-6.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC 4.0), which permits copying, adaptation and redistribution for non-commercial purposes, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc/4.0/).