Spur dikes are pivotal elements in river training, serving to mitigate the dynamic alterations induced by river degradation and aggradation. Traditionally, scour prediction models have relied on regression techniques, but the advent of soft computing and machine learning has offered opportunities for enhanced accuracy. This study focuses on the development of hybrid machine-learning models, including eXtreme Gradient Boosting (XGBoost), random forest (RF), convolutional neural network–long short-term memory, and artificial neural network, optimized using genetic algorithms to predict both temporal scour depth variation and maximum scour depth around the initial spur dike in a series. The analysis reveals strong associations between scour depth and various parameters such as non-dimensional time, spacing, channel width, time-averaged velocity, and densimetric Froude number. The models are established through an iterative process involving four predictor combinations. Results demonstrate XGBoost as the top-performing model, consistently exhibiting superior performance with R2 of 0.99, root mean square error (RMSE) of 0.012, and mean absolute error of 0.008 during training, and R2 of 0.96, RMSE of 0.044, and Kling–Gupta efficiency of 0.98 during testing for predicting temporal scour depth. For non-dimensional maximum scour depth, it reached R2 of 0.99 and RMSE of 0.005 in training, with R2 > 0.91 across all combinations during testing. Although RF showcases commendable accuracy, it slightly lags in precision compared to XGBoost.

  • This study utilizes eXtreme Gradient Boosting (XGBoost), random forest, convolutional neural network–long short-term memory, and artificial neural network, optimized with genetic algorithm, for predicting scour around spur dikes.

  • It highlights the importance of non-dimensional initial parameters in scour prediction.

  • XGBoost achieves R2 values over 0.91, indicating superior performance.

  • Enhanced scour depth predictions support more effective spur dike designs.

  • Cutting-edge applications of ML in hydraulic engineering are demonstrated.

Spur dikes represent crucial components within river training systems, designed to address the dynamic alterations induced by river degradation and aggradation (Kothyari & Ranga Raju 2001). These structures are strategically placed either perpendicular or at an angle to the riverbank, extending from one end into the river while securely anchored at the other to protect against erosion and regulate various hydraulic processes (Duan et al. 2009; Choufu et al. 2019; Pourshahbaz et al. 2022; Saikumar et al. 2022). Spur dikes serve multifaceted purposes, including diverting flow and safeguarding riverbanks from flood events, establishing their pivotal role in hydraulic engineering (Kuhnle et al. 2002; Shah et al. 2023). The construction of spur dikes introduces complexities in the flow path, causing shifts in hydrostatic pressure upstream and downstream (Zhang et al. 2009; Yazdi et al. 2010; Nayyer et al. 2019; Pandey et al. 2019; Farshad et al. 2022). Spur dikes have proven highly effective in various river management projects, playing a crucial role in stabilizing riverbanks and managing sediment transport (Noret et al. 2013; Bigham 2020). Their key applications lie in riverbank protection and in managing river degradation and aggradation, preventing the excessive bank erosion that threatens both infrastructure and ecosystems. Real-world case studies further highlight the importance of spur dikes. For instance, along the Mississippi River in the USA, spur dikes are strategically placed to prevent bank erosion while maintaining navigation channels (Klingeman et al. 1984; Barnett 2017). Their controlled positioning manages sediment movement, ensuring riverbank stability and reducing the frequency and cost of dredging operations. In the Rhine River in Europe, they regulate sediment transport, preventing unwanted accumulation in key areas, thus maintaining navigability and protecting riverbanks from erosion (Habersack et al. 2016; Havinga 2020). In China's Yellow River, where high sediment loads cause severe aggradation, spur dikes guide sediment-laden waters into predetermined areas, lowering the risk of catastrophic floods (Peng et al. 2010). In Pakistan's Indus River, where seasonal monsoons and snowmelt lead to severe erosion, the Water and Power Development Authority implemented spur dikes to reduce bank erosion, safeguarding agricultural lands and infrastructure from flood damage (Atta-Ur-Rahman & Shaw 2015). Additionally, in the Nile River, Egypt, where aggradation has threatened irrigation systems, spur dikes manage sediment flow, preventing blockages in irrigation channels and ensuring a stable water supply for agriculture. These examples demonstrate the versatility and critical role of spur dikes in river engineering, not only for stabilizing riverbanks and managing sediment transport but also for mitigating flood risks and supporting agricultural productivity through improved flow management. At the same time, the flow alteration introduced by spur dikes initiates the formation of intricate vortex zones, particularly marked by substantial vortices at the head of the dike, which constitute the principal local scour mechanism. Local scour poses a substantial threat to the structural integrity of these river training elements, potentially leading to catastrophic failures (Pandey et al. 2018; Gu et al. 2023).

Researchers have employed various techniques to reduce the scour depth around spur dikes (Garde et al. 1961; Gu et al. 2020; Guguloth & Pandey 2023b). These techniques can be broadly categorized into direct and indirect methods. Direct methods involve the use of construction materials, such as revetments and riprap, positioned on spill slopes to resist erosion and directly protect structures against flow attack (Lauchlan & Melville 2001; Yılmaz 2014; Gupta et al. 2023). In contrast, indirect methods modify flow patterns through the implementation of specialized structures, such as protective spur dikes, guide banks, or collars, to induce a reduction in local scour (Karami et al. 2011; Pandey et al. 2016; Delavari et al. 2022; Guguloth & Pandey 2023a). Recent attention has been directed toward the use of a protective spur dike as an indirect method. Given that spur dikes are typically constructed consecutively, the upstream-most spur dike (SD1), often referred to as 'the first spur dike,' experiences the most destructive flow influence (Pandey 2014). As a result, reinforcing this initial dike becomes imperative in an effort to diminish local scour depth. The introduction of a protective spur dike upstream, especially in a series of parallel spur dikes, alters the flow direction significantly, leading to a substantial reduction in scour depth around the main spur dike (Gu et al. 2023). This reduction is particularly crucial for safeguarding SD1, which directly faces the oncoming flow (Shah et al. 2023). Because of the temporal variation of scour and the need to reduce the maximum scour depth around SD1, understanding and optimizing the main parameters of a protective spur dike, specifically its length and the spacing between protected spur dikes, is of paramount importance (Pandey et al. 2016; Delavari et al. 2022).
In recent years, numerous researchers have conducted both laboratory and field experiments aimed at mitigating scour around spur dikes through various methods, while also examining how scour varies over time (Iqbal et al. 2022; Aung et al. 2023; Gu et al. 2023; Tabassum et al. 2024). Nayyer et al. (2019) explored the flow dynamics near spur dikes of three distinct shapes, namely I, T, and L shapes, using experimental methods and numerical simulations via FLOW-3D software. Their investigation analyzed three combinations, namely (ILI), (TLI), and (LTT), revealing that the (LTT) series was most effective in reducing factors such as flow velocity, shear stress, pressure, and turbulence near the spur dikes.

Recent studies have focused on applying AI models to predict the temporal variation in scour depth and the maximum scour depth around hydraulic structures (Sreedhara et al. 2021; Pandey et al. 2022; Guguloth et al. 2024). Pandey et al. (2020) employed genetic algorithms to estimate time-dependent scour depths around circular bridge piers. Similarly, Azamathulla & Wu (2011) utilized various soft computing techniques to predict scour depths beneath river pipelines, showing good agreement with observed data points. Pandey et al. (2022) developed three machine-learning techniques, namely Gradient Boosting Decision Tree, Kernel Ridge Regression, and Cascaded Forward Neural Network, for predicting the depth of the scour hole around a spur dike. Aamir & Ahmad (2019) employed artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) models to predict the non-dimensional maximum scour depth under wall jets downstream of a rigid apron, showing superior performance compared to previously proposed empirical and regression methods. Ahmadianfar et al. (2021) developed an artificial intelligence approach for predicting wave-induced local scour depth near circular piles.

The objective of this study is to develop novel hybrid machine-learning models aimed at predicting the temporal variation and maximum scour depth around the SD1 within a series of spur dikes. The study intends to explore four machine-learning models, including eXtreme Gradient Boosting (XGBoost), random forest (RF), convolutional neural network–long short-term memory (CNN–LSTM), and ANN for analyzing the temporal variation of scour depth and non-dimensional maximum scour depth around the SD1 in a series of spur dikes. Utilizing a dataset comprising 465 data points on temporal scour depth variation and 27 experimental data points on maximum scour depth, the study will compare and evaluate the performances of these machine-learning models using graphical tools and statistical metrics. Overall, the study aims to contribute a new approach to quantifying the temporal variation of scour depth around the SD1 within a series of spur dikes through the application of advanced machine-learning techniques.

Data collection

Various significant factors influence the evolution of scour at multiple spur dikes, including (i) flow characteristics: flow depth (y), channel width (B), and velocity (U); (ii) sediment characteristics: the median particle size of the sediment bed (D50) and the geometric standard deviation of the sediment (σg); and (iii) spur dike geometry: the lateral length of the spur dike (L), its orientation angle (θ), and the spacing (S) between dikes. Equation (1) expresses the functional relationship between the key variable, specifically the maximum equilibrium scour depth (dsemax), and the influencing parameters.
dsemax = f(y, B, U, D50, σg, L, θ, S)    (1)
The laboratory experiments involved altering the length of the spur dike, the spacing between the series of spur dikes, and the flow velocity and depth. Using the Buckingham Pi theorem, we derived non-dimensional parameters, taking the length of the spur dike, the flow depth, and the acceleration due to gravity as repeating variables. Equation (2) expresses the resulting non-dimensional parameters, enabling us to study the variation of non-dimensional scour depth over non-dimensional time. For estimating the non-dimensional maximum equilibrium scour depth around SD1, we considered the spacing between the series of spur dikes, the flow velocity, and the length of the spur dikes. Based on these varied parameters, we selected the non-dimensional parameters for predicting the maximum equilibrium scour depth.
dst/dsemax = f(t*, S/L, B/L, U/Uc)    (2)
dsemax/L = f(S/L, B/L, U/Uc, Fr)    (3)
where dsemax represents the maximum depth of scour in the equilibrium condition, dst is the scour depth at a specific time, t* is the non-dimensional time, Uc is the critical velocity, Fr is the densimetric Froude number, Fr = U/√(((ρs − ρ)/ρ) g D50), in which ρs denotes the density of the sediment bed particles, ρ is the density of the water, and g represents the acceleration due to gravity.
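For concreteness, the densimetric Froude number defined above can be computed directly from the measured quantities. The sketch below is illustrative; the function name and the sample values (drawn from the experimental ranges reported in this paper) are ours, not part of the original study.

```python
import math

def densimetric_froude(U, d50, rho_s=2650.0, rho=1000.0, g=9.81):
    """Densimetric Froude number Fr = U / sqrt(((rho_s - rho)/rho) * g * d50).

    U     : time-averaged approach velocity (m/s)
    d50   : median sediment size (m)
    rho_s : sediment particle density (kg/m^3); 2650 matches a relative density of 2.65
    rho   : water density (kg/m^3)
    g     : acceleration due to gravity (m/s^2)
    """
    delta = (rho_s - rho) / rho            # submerged relative density
    return U / math.sqrt(delta * g * d50)

# Example with values from the experimental ranges: U = 0.2407 m/s, D50 = 0.32 mm
Fr = densimetric_froude(0.2407, 0.32e-3)
```

Such a helper makes it straightforward to tabulate Fr for every run when assembling the model input matrix.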

Experimental work

The experimental study was carried out in a precisely controlled rectangular hydraulic flume designed to replicate natural flow conditions. The flume had a total length of 10.30 m, a width of 0.8 m, and a height of 0.5 m, with a longitudinal slope of 0.0004 to simulate natural riverbed gradients. The working section was strategically placed 4 m downstream from the inlet section to ensure a fully developed flow before the measurements were taken. The test section itself measured 2.3 m in length, 0.8 m in width, and 0.5 m in depth, providing ample space for accurate flow and scour observations. The flume bed was filled with non-cohesive, uniform sediments characterized by a median particle diameter (D50) of 0.32 mm and a geometric standard deviation (σg) of 1.31, indicative of sediment homogeneity according to the classification by Dey et al. (1995). The relative density of the sediment particles was recorded as 2.65, meaning the sediment was considerably denser than the water, promoting realistic scour processes under controlled conditions.

Three impermeable spur dikes were installed at predetermined intervals within the test section. These dikes were designed with a uniform thickness of 4 mm and varied lateral lengths of 0.15, 0.12, and 0.10 m, all standing at a height of 0.55 m. They were aligned perpendicular to the primary flow direction to create intentional flow disturbances and induce local scour. The experiments were conducted under three different flow discharges of 23.11, 19.26, and 15.4 l/s, corresponding to average flow velocities of 24.07, 20.6, and 16.05 cm/s, respectively. All trials maintained clear-water conditions (U/Uc < 1), where the approach velocity (U) remained below the critical velocity (Uc) for sediment movement, ensuring that the scour observed was due solely to the interaction between the flow and the spur dikes. Three distinct spacing configurations between the spur dikes, referred to as L, 2L, and 3L, were tested. These varying configurations were critical in studying the influence of dike spacing on flow interference, sediment transport, and the formation of scour holes.

Table 1 provides comprehensive statistical data for both the input parameters (flow rate and dike configuration) and output parameters (scour depth and velocity distribution). Figure 1 illustrates the experimental setup, offering a clear view of the physical layout and measurement stations. To develop a predictive model for temporal scour patterns, the experimental data were divided into two datasets: 70% of the data was randomly selected for model training, while the remaining 30% was reserved for validation and testing. A total of 33 laboratory experiments, specifically designed to capture time-dependent scour development, were conducted, resulting in 480 individual datasets that meticulously captured the temporal evolution of scour around the spur dikes. To ensure that sediment particles remained undisturbed during the initial flow setup, the flume was gradually filled with water, allowing for the controlled visualization of the scour process. The approach flow velocity was systematically increased until the target average velocity was achieved, with careful monitoring to maintain the appropriate flow depth within the flume. The initial observations revealed that erosion began at the nose of the SD1, triggered by the downward component of the flow and the formation of horseshoe vortices. These vortices intensified the erosive forces acting on the bed, causing a localized scour hole to form around the base of SD1. This process continued until the scour hole reached an equilibrium depth, beyond which further erosion ceased. A high-precision point gauge was employed to monitor the depth of the scour at regular time intervals, particularly on the upstream side of the spur dikes, where the erosive forces were most concentrated. After each experimental run, the water was gradually drained from the flume to preserve the scour holes and sediment patterns around the spur dikes for detailed post-experiment analysis.
This careful procedure ensured that the final scour configurations remained intact and undisturbed, allowing for accurate measurements of the scour profiles.
Table 1

Statistical metrics of the experimental data

| Statistical parameter | U/Uc | S/L | B/L | – | Fr | – | – |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Minimum | 0.60 | – | 5.33 | 0.0041 | 0.147 | 0.064 | 0.129 |
| Maximum | 0.90 | – | 8.0 | – | 0.221 | 0.6 | 1.0 |
| Mean | 0.753 | 1.995 | 6.67 | 0.289 | 0.184 | 0.336 | 0.741 |
| Standard deviation | 0.122 | 0.821 | 1.077 | 0.326 | 0.030 | 0.178 | 0.233 |
| Skewness | −0.045 | 0.0084 | −0.021 | 0.947 | 0.006 | −0.213 | −0.653 |
| Kurtosis | −1.49 | −1.520 | −1.467 | −0.527 | −1.60 | −1.388 | −0.636 |
Figure 1

Experimental setup.


Throughout the early stages of each experimental trial, the scour process exhibited similar patterns across all three spur dikes. However, as the experiments progressed, sediment that had been eroded from the scour hole at the base of SD1 was transported downstream, forming a ridge of deposited material. This ridge-shaped sediment formation subsequently migrated downstream, progressively filling the scour hole at the second spur dike (SD2). Initially, this process of sediment transport between dikes occurred rapidly, but after 2–3 h, the rate of transfer slowed considerably. This reduction in sediment transport is a clear indication of the protective, or ‘shielding,’ effect exerted by SD1 on the downstream spur dikes. Essentially, the presence of SD1 reduced the erosive forces acting on SD2, thereby limiting further scour development at SD2. Additionally, the third spur dike exhibited unique scour patterns, where sediment deposits were observed both upstream and downstream of its position. These deposits were particularly concentrated at the junction between the spur dike and the flume walls, further illustrating the complex interaction between flow structures and sediment transport in the vicinity of spur dikes.

Artificial neural networks

The key hyperparameters (Pannakkong et al. 2022) that impact the performance of the algorithm are presented in Table 2.

Table 2

Hyperparameter information on ANN

| Hyperparameter | Inference |
| --- | --- |
| Learning rate | Controls the size of the step taken during optimization and affects the speed and quality of convergence during training. Too high a learning rate can cause the optimization to overshoot or diverge, while too low a learning rate results in slow convergence |
| Number of hidden layers | Determines the depth of the network and its capacity to learn complex relationships in the data. Adding more hidden layers can increase the model's capacity to capture intricate patterns but also increases the risk of overfitting |
| Number of neurons | More neurons provide a higher capacity to learn complex patterns but also increase the computational complexity of the model |
| Batch size | Larger batch sizes can lead to faster convergence but may require more memory, while smaller batch sizes may introduce more noise into the gradient estimates but can sometimes generalize better |
| Epochs | Increasing the number of epochs can improve the model's performance up to a certain point but may also increase the risk of overfitting. Properly tuning the number of epochs is crucial for achieving the right balance between underfitting and overfitting |
| Optimizer | Impacts the convergence speed and quality of the trained model. Choosing an appropriate optimizer aids the network's ability to navigate the loss landscape and identify optimal parameter values |
| Regularization parameters | Control overfitting by penalizing large weights in the network. Lasso- and Ridge-type regularization improve generalization when handling noisy data |
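To illustrate how the hyperparameters of Table 2 map onto an ANN implementation, the sketch below trains scikit-learn's MLPRegressor on synthetic stand-in data; the architecture, values, and data are illustrative assumptions, not the GA-tuned configuration or the experimental dataset of this study.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Synthetic stand-in for four non-dimensional predictors and a scour-like response.
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] - 0.1 * X[:, 2] + 0.05 * rng.standard_normal(200)

X_scaled = StandardScaler().fit_transform(X)

# Each argument mirrors a hyperparameter in Table 2; the values are illustrative.
ann = MLPRegressor(
    hidden_layer_sizes=(16, 8),   # number of hidden layers and neurons per layer
    learning_rate_init=0.01,      # learning rate
    batch_size=32,                # batch size
    max_iter=500,                 # epochs (optimization iterations)
    solver="adam",                # optimizer
    alpha=1e-4,                   # L2 (Ridge-type) regularization parameter
    random_state=0,
)
ann.fit(X_scaled, y)
r2 = ann.score(X_scaled, y)       # training-period coefficient of determination
```

In a GA-tuned workflow, each of these arguments becomes a gene whose value is selected by the optimization loop described later.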

Random forest

RF (Pham et al. 2021), an ensemble bagging algorithm, is extensively utilized for both classification and regression tasks due to its robustness and versatility across various applications. As a non-parametric method, it operates without making assumptions about the data distribution, making it suitable for diverse datasets (Liu et al. 2020). However, a single decision tree with only one split may not yield reliable estimates. To address this limitation, RF builds multiple decision trees of varying complexity during training, each comprising a root, internal nodes, and leaves, and aggregates the predictions of the individual trees to generate the final prediction.

This ensemble structure ensures diversity and improves the generalization ability of the algorithm, both of which are crucial for reducing overfitting. Its effectiveness stems from the incorporation of randomness through two primary mechanisms: bootstrapping, which draws random samples of the training data with replacement, and the consideration of random subsets of features at each split within the trees. The key hyperparameters (Rehman et al. 2022) that impact the performance of the algorithm are presented in Table 3.

Table 3

Hyperparameter information on RF

| Hyperparameter | Inference |
| --- | --- |
| Number of estimators (n_estimators) | Increasing the number of trees generally improves predictive performance but may lead to longer training times and higher memory consumption |
| Maximum number of features (max_features) | Determining the optimal number of features can help control overfitting. A smaller value can increase model diversity and reduce correlation among trees |
| Maximum depth of each tree (max_depth) | Constraining tree depth can prevent overfitting but may also lead to underfitting if set too low |
| Minimum number of samples required to split an internal node (min_samples_split) | Increasing this value can make the model more robust to noise but may also lead to underfitting |
| Minimum number of samples required at a leaf node (min_samples_leaf) | Similar to min_samples_split, increasing this value can prevent overfitting but may also lead to underfitting |
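The bagging mechanism and the hyperparameters of Table 3 can be sketched with scikit-learn's RandomForestRegressor; the data and parameter values below are illustrative stand-ins, not the study's dataset or its GA-tuned settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in data: four predictors and a nonlinear response.
X = rng.uniform(size=(300, 4))
y = np.log1p(5 * X[:, 0]) + 0.4 * X[:, 1] * X[:, 2] + 0.02 * rng.standard_normal(300)

# Arguments correspond row-by-row to the hyperparameters in Table 3.
rf = RandomForestRegressor(
    n_estimators=200,       # number of trees in the ensemble
    max_features="sqrt",    # random subset of features considered at each split
    max_depth=None,         # unconstrained tree depth
    min_samples_split=2,
    min_samples_leaf=1,
    oob_score=True,         # out-of-bag R^2, a by-product of bootstrapping
    random_state=0,
)
rf.fit(X, y)
oob = rf.oob_score_         # generalization estimate without a separate test set
```

The out-of-bag score is a convenient check on generalization before running a formal train/test split.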

eXtreme Gradient Boosting

XGBoost (Vogeti et al. 2022) is an ensemble boosting algorithm that constructs decision trees as base learners, selecting splits based on similarity scores, and combines them into an additive model with the aim of minimizing a loss function. The trees are constrained by the number of leaves, nodes, splits, or layers. Additive trees are introduced without replacing the existing trees, using a gradient descent procedure to minimize the loss. At each node, the separation that yields the highest loss reduction is selected: the tree begins at node 'i', which is divided into left and right branches according to the candidate separation criteria, the resulting loss reduction is calculated, and the split with the highest loss reduction is preferred.
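The additive scheme described above can be made concrete with a minimal gradient-boosting loop for squared loss, where each new tree is fitted to the current residuals. This is a stripped-down illustration of the principle on synthetic data, not XGBoost's regularized, similarity-score-based implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 3))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.standard_normal(300)

# Boosting for squared loss: each tree is fitted to the negative gradient
# (the residual) and added to the ensemble without replacing earlier trees.
learning_rate = 0.1
pred = np.full_like(y, y.mean())
trees = []
for _ in range(100):
    residual = y - pred                      # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residual)
    trees.append(tree)
    pred += learning_rate * tree.predict(X)  # gradient-descent step in function space

train_rmse = np.sqrt(np.mean((y - pred) ** 2))
```

XGBoost refines this loop with second-order gradients, regularized similarity scores for split selection, and shrinkage/subsampling controls.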

Convolutional neural network–long short-term memory

CNN–LSTM (Khorram & Jehbez 2023) is a hybrid algorithm that utilizes the potential of both CNN and LSTM architectures to establish relationships between input and output variables. Initially, the input data matrix is passed through the convolution layer, which slides filter matrices over the input, computing element-wise products and sums, resulting in a convolved feature map. This map is dimensionally reduced using pooling operations that retain the most pertinent information from the input layer. The pooled feature map is then converted into a one-dimensional vector in the flattening layer, and the diverse features learned by the convolution, pooling, and flattening layers are transformed into a dense vector. This vector is passed as input to the LSTM, where the information flows through its three gating units (the input, forget, and output gates). The key hyperparameters (Lilhore et al. 2023) that impact the performance of the algorithm are presented in Table 4.

Table 4

Hyperparameter information on CNN–LSTM

| Hyperparameter | Inference |
| --- | --- |
| Number of kernels | Kernels (filters) aid in efficient feature extraction from the given input data |
| Pooling | Pooling layers truncate the spatial dimensions of the input data within the network |
| LSTM layer nodes | Hidden-layer nodes that assist in the transmission of data between the gating units |
| Number of neurons | The performance of a model is significantly influenced by the number of neurons in a dense layer. Overfitting occurs when very high neuron counts are used |
| Batch size | Sets the number of samples transmitted to the network at once. Very large batch sizes may fail to capture the intricate patterns of the data |
| Learning rate | Characterizes the size of the changes made to the weights in order to minimize the loss function. Lower learning rates are often preferable as they allow better feature extraction, resulting in improved model performance |
| Epochs | An epoch refers to a single pass through the entire training dataset. A higher number of epochs can improve simulation accuracy but increases the overall computation time |
| Dropout | Randomly deactivates a fraction of neurons during training, which reduces overfitting and improves generalization |

Hyperparameter tuning using genetic algorithm

To enhance the predictive accuracy of our machine-learning models – RF, XGBoost, ANN, and CNN–LSTM – we employed a systematic approach for hyperparameter tuning. Hyperparameters play a critical role in the performance of machine-learning models, and optimizing them is essential for achieving robust and accurate predictions. In this study, we utilized the genetic algorithm (GA) to efficiently search the hyperparameter space and identify optimal configurations for each model. GA is a metaheuristic optimization technique inspired by the process of natural selection, which mimics the evolutionary principles of selection, crossover, and mutation. This algorithm proves particularly effective in exploring complex and nonlinear parameter spaces, making it well-suited for enhancing the performance of machine-learning models.

The hyperparameters considered for tuning varied across the different models that were presented in the preceding sections. Our objective was to find the set of hyperparameters that maximizes the predictive performance of each model. The GA implementation involved the generation of initial populations of hyperparameter sets, subsequent evaluation of each set using a fitness function based on model performance metrics, and the iterative evolution of populations through genetic operations. This process continued until convergence to an optimal set of hyperparameters.
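The GA loop described above (population initialization, fitness evaluation, selection, crossover, mutation, iteration to convergence) can be sketched as follows. The hyperparameter space, the fitness stand-in, and the operator choices are illustrative assumptions; in the study, the fitness of each candidate was a model-performance metric obtained by training and evaluating that candidate.

```python
import random

# Illustrative hyperparameter space; genes are (n_estimators, max_depth, learning_rate).
SPACE = {
    "n_estimators": list(range(50, 501, 50)),
    "max_depth": list(range(2, 11)),
    "learning_rate": [0.01, 0.03, 0.05, 0.1, 0.2, 0.3],
}

def fitness(ind):
    # Stand-in for a validation metric (e.g., negative RMSE); in practice this
    # would train a model with ind's hyperparameters and score it. The stand-in
    # has a known optimum at max_depth=6, learning_rate=0.1, n_estimators=300.
    return -((ind["max_depth"] - 6) ** 2
             + 100 * (ind["learning_rate"] - 0.1) ** 2
             + ((ind["n_estimators"] - 300) / 100) ** 2)

def random_individual():
    return {k: random.choice(v) for k, v in SPACE.items()}

def crossover(a, b):
    # Uniform crossover: each gene is inherited from one of the two parents.
    return {k: random.choice([a[k], b[k]]) for k in SPACE}

def mutate(ind, rate=0.2):
    # Each gene is resampled from the space with a small probability.
    return {k: (random.choice(SPACE[k]) if random.random() < rate else v)
            for k, v in ind.items()}

def evolve(pop_size=20, generations=30, seed=0):
    random.seed(seed)
    pop = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]            # truncation selection (elitist)
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Because the top half of each generation is carried over unchanged, the best fitness is non-decreasing across generations, which is the convergence behavior relied on in the text.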

Statistical performance evaluation

The study employed six key statistical performance evaluation metrics to rigorously assess the predictive capabilities of the machine-learning models. The metrics include R2 (coefficient of determination), providing insights into the proportion of variation in the observed data captured by the model; Kling–Gupta efficiency (KGE), offering a comprehensive measure of goodness-of-fit by evaluating the model's ability to reproduce observed variability, correlation, and bias; root mean square error (RMSE), quantifying the model's average prediction error; mean absolute error (MAE), representing the average absolute difference between observed and predicted values; mean absolute percentage error (MAPE), expressing prediction accuracy as a percentage of the observed values; and percentage bias (PBIAS), indicating the model's tendency to systematically overestimate or underestimate the observed values. These metrics collectively provided a robust framework for assessing the models' performance across different combinations of predictors during both training and testing periods, enabling a comprehensive understanding of their accuracy, precision, and generalization capabilities.

Mathematical formulations of these indices are defined as follows:

  • (1) coefficient of determination (R2):
    $R^2=\dfrac{\left[\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})\right]^2}{\sum_{i=1}^{n}(X_i-\bar{X})^2\,\sum_{i=1}^{n}(Y_i-\bar{Y})^2}$ (17)
  • (2) KGE:
    $\mathrm{KGE}=1-\sqrt{(r-1)^2+\left(\dfrac{\sigma_s}{\sigma_o}-1\right)^2+\left(\dfrac{\mu_s}{\mu_o}-1\right)^2}$ (18)
  • (3) RMSE:
    $\mathrm{RMSE}=\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(X_i-Y_i)^2}$ (19)
  • (4) MAE:
    $\mathrm{MAE}=\dfrac{1}{n}\sum_{i=1}^{n}\left|X_i-Y_i\right|$ (20)
  • (5) MAPE, %:
    $\mathrm{MAPE}=\dfrac{100}{n}\sum_{i=1}^{n}\left|\dfrac{X_i-Y_i}{X_i}\right|$ (21)
  • (6) PBIAS:
    $\mathrm{PBIAS}=100\times\dfrac{\sum_{i=1}^{n}(X_i-Y_i)}{\sum_{i=1}^{n}X_i}$ (22)
    where $X_i$ denotes the observed non-dimensional scour depth, $Y_i$ represents the corresponding predicted non-dimensional scour depth, $\bar{X}$ and $\bar{Y}$ indicate the averages of the observed and predicted values, and $n$ is the total number of observations. $\sigma_s$ and $\sigma_o$ stand for the standard deviations of the simulated and observed time series, while $\mu_s$ and $\mu_o$ represent the means of the simulated and observed time series, respectively; $r$ is the linear correlation coefficient between them.
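Given these definitions, all six indices can be computed directly with NumPy. This is a sketch under standard conventions: in particular, R2 is implemented here as the squared Pearson correlation, which is one common convention and may differ in detail from the exact form used in the paper.

```python
import numpy as np

def metrics(obs, sim):
    """Compute the six evaluation indices (R2, KGE, RMSE, MAE, MAPE, PBIAS)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    r = np.corrcoef(obs, sim)[0, 1]                 # linear correlation
    r2 = r ** 2                                     # coefficient of determination
    kge = 1 - np.sqrt((r - 1) ** 2
                      + (sim.std() / obs.std() - 1) ** 2
                      + (sim.mean() / obs.mean() - 1) ** 2)  # Kling-Gupta efficiency
    rmse = np.sqrt(np.mean((obs - sim) ** 2))       # root mean square error
    mae = np.mean(np.abs(obs - sim))                # mean absolute error
    mape = 100 * np.mean(np.abs((obs - sim) / obs)) # mean absolute % error
    pbias = 100 * np.sum(obs - sim) / np.sum(obs)   # percentage bias
    return {"R2": r2, "KGE": kge, "RMSE": rmse,
            "MAE": mae, "MAPE": mape, "PBIAS": pbias}
```

A perfect prediction yields R2 = 1, KGE = 1, and zero for the error and bias terms, which is a quick sanity check for any implementation of these indices.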

Combination of input parameters

This study focused on predicting the temporal variation of scour depth (dst/dsemax) by employing non-dimensional time, non-dimensional spacing, channel width, and time-averaged velocity as predictors. A preliminary correlation analysis between dst/dsemax and the four predictors revealed noteworthy associations, with non-dimensional time exhibiting a particularly strong correlation (>0.8), followed by the remaining three predictors in descending order. Motivated by these findings, we established four machine-learning models – RF, XGBoost, ANN, and CNN–LSTM. Each model was constructed by considering all four predictors simultaneously (referred to as Combo-1) and subsequently removing the least correlated remaining predictor in three additional combinations (Combo-2, Combo-3, and Combo-4). This iterative approach aimed to differentiate the impact of individual predictors on the predictive performance of the models. Similarly, we extended our predictive analysis to the non-dimensional maximum scour depth (dse/L), utilizing non-dimensional spacing, channel width, time-averaged velocity, and densimetric Froude number as predictors. Correlation analysis underscored robust associations, particularly between dse/L and two of the predictors (>0.95), while the remaining two exhibited insignificant correlations. Subsequently, machine-learning models were developed for dse/L using the same iterative approach with four predictor combinations, highlighting the influential factors governing maximum scour depth dynamics. The data were split into 75% for training and 25% for testing, and the hyperparameters of the ML models were optimized using GA. The performance of these models in both training and testing periods was then rigorously assessed through graphical representations and a set of six performance evaluation indicators, providing a comprehensive understanding of the predictive capabilities and generalization of the models.
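The predictor-elimination scheme and the 75/25 split described above can be sketched as follows, using synthetic stand-in data; the predictor labels, their correlation ordering, and the sample size are illustrative assumptions, not the paper's exact symbols or data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in dataset: 200 samples of the four predictors and a target
# (values are random placeholders for the experimental measurements).
X = rng.random((200, 4))
y = rng.random(200)

# Predictors ordered from most to least correlated with the target;
# each successive combo drops the least correlated remaining predictor.
order = ["nd_time", "nd_spacing", "channel_width", "velocity"]
combos = {f"Combo-{i}": order[: 4 - (i - 1)] for i in range(1, 5)}
# Combo-1 uses all four predictors, Combo-4 only the strongest one.

# 75% / 25% train-test split on shuffled indices.
idx = rng.permutation(len(y))
n_train = int(0.75 * len(y))
train_idx, test_idx = idx[:n_train], idx[n_train:]
```

Each model is then fitted once per combo on the training indices and scored on the held-out quarter, which is how the Combo-1 to Combo-4 comparison tables are produced.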

Prediction of temporal variation of scour depth (dst/dsemax)

The performance of the machine-learning models – RF, XGBoost, ANN, and CNN–LSTM – in predicting the temporal variation of scour depth during the training and testing periods is presented as scatter plots in Figures 2 and 3, respectively. During the training period, the scatter plots of observed versus predicted values for all four combinations (Combo-1 to Combo-4) indicate better performance of RF, XGBoost, and ANN, with RF and XGBoost attaining R2 > 0.93 across all combinations. Notably, XGBoost exhibits superior performance among the models. However, all four models exhibit a decline in R2 from Combo-1 to Combo-4, emphasizing the significance of each predictor for the temporal variation of scour depth. Conversely, CNN–LSTM displays comparatively lower performance with R2 < 0.8, and its decline from Combo-1 to Combo-4 is particularly pronounced compared to the other models. Examining the scatter plots reveals an overprediction trend for all models in all four combinations, especially at lower magnitudes, where the observed values are less than 0.4. The overprediction errors exceed 20%, with CNN–LSTM demonstrating a higher discrepancy than the other models.
Figure 2

Scatterplots of observed versus predicted temporal variation of scour depth for four combinations of predictors during the training period.

Figure 3

Scatterplots of observed versus predicted temporal variation of scour depth for four combinations of predictors during the testing period.


Furthermore, the transferability of accuracy from the training to the testing period is notable. RF, XGBoost, and ANN exhibit high transferability, with XGBoost demonstrating the most consistent performance across all four combinations. In contrast, CNN–LSTM shows poor results during the testing period, indicating a limited ability to generalize its predictions. It is noteworthy that XGBoost maintains stable performance in all four combinations, while the performance of ANN, although strong in Combo-1, experiences a decline in Combo-2 to Combo-4. These graphical illustrations provide valuable insights into the predictive capabilities and generalization performance of the machine-learning models in forecasting the temporal variation of scour depth, emphasizing the role of predictor combinations and the varying effectiveness of different models in capturing the underlying dynamics.

The comprehensive statistical evaluation of model performance for predicting the temporal variation of scour depth reveals notable distinctions among the different combinations and models. The statistical performance evaluation measures computed for the predictions are tabulated in Tables 5 and 6 for the training and testing periods, respectively. In Combo-1, XGBoost demonstrates exceptional predictive accuracy with R2 = 0.997, the lowest RMSE of 0.012, and MAE of 0.008, showcasing its superior ability to capture the intricate dynamics of scour depth variation. This outperformance is evident when compared to other models, such as RF with R2 = 0.985 and ANN with R2 = 0.976. Moving to Combo-2, XGBoost maintains its strong performance, outperforming the other models with R2 = 0.966 and RMSE of 0.039, reinforcing its robustness and reliability. As the number of predictors decreases in Combo-3, XGBoost continues to perform well with R2 = 0.947 and the lowest RMSE of 0.049, again outperforming RF with R2 = 0.939 and ANN with R2 = 0.907. In Combo-4, while XGBoost experiences a slight decrease in performance, it remains the top-performing model with R2 = 0.945 and the lowest RMSE of 0.05, reinforcing its superiority in capturing the temporal variation of scour depth.

Table 5

Statistical performance evaluation measures of predicted values of temporal variation of scour depth during training period

MODEL R2 KGE RMSE MAE MAPE PBIAS
RF_Combo-1 0.985 0.921 0.029 0.023 3.692 0.182 
XGBoost_Combo-1 0.997 0.996 0.012 0.008 1.225 0.023 
ANN_Combo-1 0.976 0.97 0.033 0.025 4.047 0.104 
CNN–LSTM_Combo-1 0.746 0.677 0.111 0.089 13.809 3.122 
RF_Combo-2 0.959 0.965 0.043 0.034 5.402 0.165 
XGBoost_Combo-2 0.966 0.971 0.039 0.028 4.479 0.03 
ANN_Combo-2 0.888 0.899 0.071 0.057 8.989 −0.019 
CNN–LSTM_Combo-2 0.718 0.695 0.116 0.094 14.487 2.95 
RF_Combo-3 0.939 0.953 0.053 0.042 6.732 0.147 
XGBoost_Combo-3 0.947 0.958 0.049 0.038 6.084 −0.024 
ANN_Combo-3 0.907 0.928 0.065 0.05 8.099 0.038 
CNN–LSTM_Combo-3 0.792 0.775 0.1 0.079 11.752 −2.9 
RF_Combo-4 0.944 0.961 0.051 0.04 6.374 0.211 
XGBoost_Combo-4 0.945 0.956 0.05 0.039 6.239 −0.009 
ANN_Combo-4 0.883 0.901 0.073 0.055 8.928 
CNN–LSTM_Combo-4 0.44 −0.657 0.171 0.136 18.356 −1.459 
Table 6

Statistical performance evaluation measures of predicted values of temporal variation of scour depth during testing period

MODEL R2 KGE RMSE MAE MAPE PBIAS
RF_Combo-1 0.926 0.873 0.061 0.044 6.998 0.128
XGBoost_Combo-1 0.959 0.979 0.044 0.032 4.921 −0.349
ANN_Combo-1 0.952 0.946 0.048 0.035 5.493 −0.554
CNN–LSTM_Combo-1 0.667 0.669 0.125 0.099 15.243 2.185
RF_Combo-2 0.899 0.923 0.069 0.052 8.513 −1.08
XGBoost_Combo-2 0.934 0.931 0.056 0.039 6.254 −1.323
ANN_Combo-2 0.856 0.818 0.083 0.063 10.156 −1.397
CNN–LSTM_Combo-2 0.673 0.677 0.125 0.104 15.884 3.188
RF_Combo-3 0.898 0.936 0.069 0.05 8.262 −0.405
XGBoost_Combo-3 0.916 0.952 0.062 0.045 7.405 −0.058
ANN_Combo-3 0.891 0.876 0.072 0.054 8.796 −1.001
CNN–LSTM_Combo-3 0.771 0.741 0.107 0.08 11.76 −3.203
RF_Combo-4 0.913 0.955 0.064 0.047 7.912 0.492
XGBoost_Combo-4 0.914 0.955 0.063 0.046 7.78 0.268
ANN_Combo-4 0.885 0.898 0.073 0.056 9.095 −0.374
CNN–LSTM_Combo-4 0.343 −0.445 0.179 0.14 18.652 −2.402

RF consistently demonstrates solid performance across all combinations, with R2 ranging from 0.939 to 0.985, indicating commendable accuracy; however, it falls slightly short of XGBoost on the chosen performance evaluation measures. Similarly, ANN initially performs well in Combo-1 (R2 = 0.976) but experiences a decline in subsequent combinations, with R2 dropping to 0.883 in Combo-4. This sensitivity to changes in predictor combinations underscores the importance of model robustness. CNN–LSTM, on the other hand, consistently lags behind the other models, demonstrating a decline in performance across predictor combinations and notably poor results in Combo-4 (R2 = 0.44).

The evaluation of model performance during the testing period provides insights into their generalization capabilities. In Combo-1, during testing, XGBoost once again exhibits superior predictive accuracy with R2 = 0.959, the lowest RMSE of 0.044, and MAE of 0.032. This reinforces XGBoost's robustness, as it outperforms other models, including RF, which shows R2 = 0.926, and ANN with R2 = 0.952. Moreover, XGBoost achieves the highest KGE at 0.979, indicating its excellence in capturing both the variability and pattern of the observed data during the testing period. However, it is worth noting that CNN–LSTM continues to struggle, with R2 = 0.667, and KGE at 0.669, indicating limitations in capturing the temporal variation of scour depth during the testing period. The performance in the remaining combinations follows a similar pattern to their training period results. In summary, during both training and testing periods, XGBoost consistently outperforms other models across all combinations, showcasing its superior predictive capabilities, stability, and adaptability. Its ability to achieve the highest R2 values, lowest RMSE, and competitive values for other metrics in both periods underscores its robustness in capturing the complex dynamics of scour depth variation. RF consistently performs well, demonstrating commendable accuracy in both training and testing. However, it falls slightly short of XGBoost in terms of R2, RMSE, and KGE. ANN exhibits good performance during training but experiences a decline during testing, highlighting potential challenges in generalization. CNN–LSTM consistently lags behind other models, struggling to capture the intricate patterns of scour depth variation, particularly during testing.

The analysis of relative deviations in the predictions, plotted against the observed dst/dsemax during the training and testing periods, is presented in Figures 4 and 5, respectively. These figures provide insights into the error distribution at different magnitudes of dst/dsemax. At lower magnitudes (<0.2), where the relative deviation exceeds 100%, all models exhibit an overestimation bias. However, when the observed values exceed 0.2, XGBoost stands out as the best-performing model, with a relative deviation below 30%. In contrast, CNN–LSTM shows an increasing deviation from Combo-1 to Combo-4, exceeding 100% for a substantial number of points in Combo-4. This pattern persists during the testing period, with XGBoost again performing better and CNN–LSTM struggling to capture the variability in dst/dsemax.
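Relative deviation as used here is the signed percentage error with respect to the observed value, which is why small observed magnitudes inflate the deviation even for modest absolute errors. A minimal sketch with illustrative numbers:

```python
import numpy as np

def relative_deviation(obs, pred):
    """Signed relative deviation (%) of predictions from observations."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 100 * (pred - obs) / obs

# A small constant additive error inflates the relative deviation
# dramatically at low observed magnitudes, mirroring the overestimation
# reported for small observed scour depths.
obs = np.array([0.1, 0.3, 0.8])
pred = obs + 0.05                     # same absolute error everywhere
rd = relative_deviation(obs, pred)    # 50.0%, ~16.7%, 6.25%
```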
Figure 4

Scattered relative deviation for the predicted values of temporal variation of scour depth during the training period.

Figure 5

Scattered relative deviation for the predicted values of temporal variation of scour depth during the testing period.


Prediction of non-dimensional maximum scour depth (dse/L)

The scatter plots of observed versus predicted non-dimensional maximum scour depth during the training and testing periods are presented in Figures 6 and 7, respectively. The performance of the models closely mirrors their performance in predicting the temporal variation of scour depth. During the training period, XGBoost outperformed the other models, followed by RF and ANN, while CNN–LSTM performed even more poorly. Although XGBoost and RF demonstrated better performance in Combo-1, RF, XGBoost, and ANN show similar performance in Combo-4. During the testing period, the patterns resemble those of the training period, with XGBoost remaining the best model. Notably, the R2 of CNN–LSTM in Combo-3 and Combo-4 is much higher than in Combo-1 and Combo-2. This is because R2 measures only the agreement in pattern between observed and predicted values while neglecting biases; in Combo-3 and Combo-4, the CNN–LSTM predictions are clearly biased throughout.
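The point that R2 neglects bias is easy to demonstrate numerically: predictions shifted by a constant retain a perfect pattern match (R2 = 1) while PBIAS exposes the offset. A small illustrative example:

```python
import numpy as np

obs = np.array([0.2, 0.4, 0.6, 0.8])
pred = obs + 0.5                                  # completely biased predictions

r2 = np.corrcoef(obs, pred)[0, 1] ** 2            # 1.0: pattern matches perfectly
pbias = 100 * np.sum(obs - pred) / np.sum(obs)    # -100%: severe systematic bias
```

This is why bias-sensitive indices such as KGE and PBIAS are reported alongside R2 throughout the tables.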
Figure 6

Scatterplots of observed versus predicted non-dimensional maximum scour depth for four combinations of predictors during the training period.

Figure 7

Scatterplots of observed versus predicted non-dimensional maximum scour depth for four combinations of predictors during the testing period.


The performance assessment metrics computed for the observed versus predicted non-dimensional maximum scour depth are tabulated in Tables 7 and 8 for the training and testing periods, respectively. During the training period, XGBoost consistently outshone the other models across the predictor combinations, achieving exceptional R2 values: 0.999 in Combo-1, 0.99 in Combo-2, and above 0.94 in the remaining combinations. XGBoost demonstrated minimal RMSE and MAE, showcasing its robust fit to the observed non-dimensional maximum scour depth. RF also performed commendably, maintaining good accuracy and stability, although slightly below XGBoost in terms of precision. In contrast, both ANN and CNN–LSTM struggled to generalize effectively, with CNN–LSTM consistently exhibiting challenges in capturing the dynamics of scour depth.

Table 7

Statistical performance evaluation measures of predicted values of non-dimensional maximum scour depth during training period

MODEL R2 KGE RMSE MAE MAPE PBIAS
RF_Combo-1 0.972 0.945 0.027 0.022 11.346 −0.481
XGBoost_Combo-1 0.999 0.997 0.005 0.004 1.649 0.073
ANN_Combo-1 0.929 0.811 0.047 0.036 15.493 −1.864
CNN–LSTM_Combo-1 0.026 −2.483 0.159 0.127 37.067 1.177
RF_Combo-2 0.987 0.974 0.019 0.015 6.421 1.028
XGBoost_Combo-2 0.99 0.989 0.016 0.013 5.132 0.841
ANN_Combo-2 0.929 0.811 0.047 0.036 15.624 −1.637
CNN–LSTM_Combo-2 0.054 −3.698 0.19 0.167 70.497 44.735
RF_Combo-3 0.941 0.923 0.039 0.032 16.009 −0.599
XGBoost_Combo-3 0.941 0.927 0.039 0.032 15.98 −0.613
ANN_Combo-3 0.94 0.923 0.04 0.032 15.991 −0.73
CNN–LSTM_Combo-3 0.04 −20.331 0.216 0.191 94.196 71.751
RF_Combo-4 0.941 0.894 0.04 0.033 16.075 −1.103
XGBoost_Combo-4 0.941 0.927 0.039 0.032 15.98 −0.613
ANN_Combo-4 0.929 0.878 0.044 0.035 16.382 −1.277
CNN–LSTM_Combo-4 0.145 −2.39 0.431 0.404 53.923 −53.631
Table 8

Statistical performance evaluation measures of predicted values of non-dimensional maximum scour depth during testing period

MODEL R2 KGE RMSE MAE MAPE PBIAS
RF_Combo-1 0.965 0.856 0.044 0.038 23.07 0.579 
XGBoost_Combo-1 0.99 0.878 0.029 0.024 12.904 1.385 
ANN_Combo-1 0.942 0.693 0.066 0.059 28.878 −6.191 
CNN–LSTM_Combo-1 0.131 −2.419 0.245 0.227 62.05 −25.85 
RF_Combo-2 0.99 0.938 0.023 0.017 8.015 2.421 
XGBoost_Combo-2 0.992 0.961 0.02 0.015 8.131 1.628 
ANN_Combo-2 0.942 0.718 0.066 0.059 28.212 −7.819 
CNN–LSTM_Combo-2 0.102 −4.694 0.216 0.188 77.453 7.852 
RF_Combo-3 0.943 0.857 0.052 0.044 27.328 0.034 
XGBoost_Combo-3 0.943 0.86 0.052 0.044 27.36 −0.072 
ANN_Combo-3 0.944 0.854 0.052 0.045 27.513 −0.763 
CNN–LSTM_Combo-3 0.763 −7.059 0.198 0.162 75.76 35.575 
RF_Combo-4 0.943 0.822 0.054 0.046 27.588 −1.744 
XGBoost_Combo-4 0.943 0.86 0.052 0.044 27.36 −0.072 
ANN_Combo-4 0.942 0.789 0.058 0.053 28.417 −4.807 
CNN–LSTM_Combo-4 0.762 0.102 0.489 0.476 66.364 −63.471 

The testing period further confirmed XGBoost's superiority, with R2 values above 0.94 across all combinations (exceeding 0.99 in Combo-1 and Combo-2) and low RMSE. RF also performed well during testing, closely following XGBoost. ANN showed reasonable accuracy but experienced a decline in precision during testing. CNN–LSTM continued to face challenges, with lower R2 values and higher error metrics, indicating limitations in capturing non-dimensional maximum scour depth dynamics. The comprehensive analysis across the training and testing periods underscores XGBoost as the preferred model for predicting non-dimensional maximum scour depth. Its consistent excellence, stability, and adaptability make it a robust choice for hydraulic engineering applications. While RF also demonstrated commendable performance, the marginal edge of XGBoost in accuracy and stability positions it as the most effective model for this specific prediction task. The challenges faced by ANN and CNN–LSTM in capturing the dynamics persist during both training and testing periods, emphasizing the critical importance of model selection for accurate predictions in hydraulic engineering scenarios.

The relative deviations in the predictions, plotted against the observed dse/L during the training and testing periods, are presented in Figures 8 and 9, respectively. During the training period, the relative deviation is quite high at low observed magnitudes, exceeding 100% for all models. When the observed dse/L exceeds 0.2, the relative deviation of XGBoost, RF, and ANN is quite low, whereas for CNN–LSTM it remains high, ranging from 30 to 100%. In Combo-1, XGBoost outperforms the other models with minimal deviation, and its performance declines as the number of predictors decreases. During the testing period, patterns similar to the training period were observed. The performance discrepancy of the CNN–LSTM model relative to RF, XGBoost, and ANN on a small dataset could be attributed to several factors. Unlike RF, XGBoost, and ANN, CNN–LSTM is a hybrid model that combines CNN and LSTM architectures, which introduces additional complexity and demands more data to capture the intricate temporal variations and spatial patterns associated with scour depth. Additionally, the CNN–LSTM model may require a larger dataset to effectively learn the hierarchical features and dependencies within the input data, especially considering the spatial hierarchies of the convolutional layers and the temporal dependencies modeled by the LSTM. In scenarios with limited data, the CNN–LSTM model might struggle to generalize well, resulting in suboptimal predictions.
Figure 8

Scattered relative deviation for the predicted values of non-dimensional maximum scour depth during the training period.

Figure 9

Scattered relative deviation for the predicted values of non-dimensional maximum scour depth during the testing period.


In this study, the channel shape is rectangular, though in practice channels can vary in shape or be compound. The experiments used uniform sand with a median size of 0.32 mm and a geometric standard deviation of 1.31, conducted under clear-water conditions with steady, uniform flow. The water depth was consistently maintained at 12 cm for all tests. Scour due to contraction effects was neglected when the obstructed width (b) was 20% or less of the total channel width (B). Additionally, all models, including RF and XGBoost, consistently overpredicted scour depth at lower values, especially when the observed non-dimensional scour depth was below 0.4, suggesting that the predictors struggled to capture the complexity of scour processes at low depths. CNN–LSTM faced further limitations due to the dataset size, as its complex architecture combining convolutional layers with LSTM units requires larger datasets to accurately capture temporal and spatial patterns, leading to higher errors with limited data.

Future research to enhance scour depth prediction accuracy could focus on expanding datasets to include diverse field data, combining physical-based models with machine learning for improved predictions. Incorporating unsteady flow conditions, refining feature selection, and adding more hydrodynamic variables like turbulence and sediment gradation could also improve model accuracy and generalization. These efforts would lead to more accurate and practical scour depth predictions for real-world applications.

In the present study, the temporal variation of scour depth and the non-dimensional maximum scour depth were predicted by leveraging machine-learning models. The investigation began with a preliminary correlation analysis, revealing strong associations between scour depth and non-dimensional time, non-dimensional spacing, channel width, time-averaged velocity, and densimetric Froude number. Motivated by these correlations, four machine-learning models – RF, XGBoost, ANN, and CNN–LSTM – were established using an iterative approach involving four predictor combinations (Combo-1 to Combo-4). The following conclusions were drawn from the study:

  • For temporal variation of scour depth, XGBoost achieved the best performance across all predictor combinations. In Combo-1, XGBoost achieved an R² of 0.997 with the lowest RMSE of 0.012 and MAE of 0.008 during training, and R² of 0.959, RMSE of 0.044, and KGE of 0.979 during testing. RF followed closely with R² of 0.985 in training and R² of 0.926 in testing, but XGBoost's ability to capture intricate dynamics made it superior.

  • For non-dimensional maximum scour depth, XGBoost again outperformed the other models. In Combo-1, it achieved R² = 0.999 and RMSE = 0.005 during training, and maintained R² > 0.91 across all combinations during testing, demonstrating its robustness in capturing maximum scour depth dynamics. RF also performed well but slightly lagged behind with R² values consistently lower than XGBoost by a small margin.

  • Model performance declined with fewer predictors, as evidenced by the decrease in R² and KGE from Combo-1 to Combo-4 across all models. The R² of XGBoost-based predictions dropped from 0.997 in Combo-1 to 0.945 in Combo-4 for temporal variation prediction, while RF experienced a similar drop from 0.985 to 0.944.

  • ANN initially performed well but experienced a decline in subsequent combinations, highlighting the importance of model robustness. Conversely, CNN–LSTM consistently lagged behind, displaying a decline in performance across predictor combinations, with particularly poor results in Combo-4 for the prediction of both dst/dsemax and dse/L.

  • In the analysis of relative deviations, XGBoost demonstrated its superiority, achieving a relative deviation below 30% when the observed dst/dsemax and dse/L exceeded 0.2, while all models exhibited overestimation biases at lower magnitudes.

In conclusion, XGBoost stands out as the preferred model for predicting scour depth dynamics, showcasing robustness, stability, and superior performance. The study offers valuable insights into the impact of predictor combinations on model performance and emphasizes the critical role of model selection in achieving accurate predictions of dst/dsemax and dse/L.

Some or all data, models, or code that support the findings of this study are available from the corresponding author upon reasonable request.

The authors declare there is no conflict of interest.

Aamir
M.
&
Ahmad
Z.
(
2019
)
Estimation of maximum scour depth downstream of an apron under submerged wall jets
,
Journal of Hydroinformatics
,
21
(
4
),
523
540
.
Ahmadianfar
I.
,
Jamei
M.
&
Chu
X.
(
2021
)
Prediction of local scour around circular piles under waves using a novel artificial intelligence approach
,
Marine Georesources & Geotechnology
,
39
(
1
),
44
55
.
Atta-Ur-Rahman
&
Shaw
R.
(
2015
)
Flood risk and reduction approaches in Pakistan
,
Disaster Risk Reduction Approaches in Pakistan
, pp.
77
100
.
Azamathulla
H. M.
&
Wu
F.-C.
(
2011
)
Support vector machine approach for longitudinal dispersion coefficients in natural streams
,
Applied Soft Computing
,
11
(
2
),
2902
2905
.
Barnett
J. F.
(
2017
)
Beyond Control: The Mississippi River's New Channel to the Gulf of Mexico
.
Univ. Press of Mississippi: Oxford, MS, USA, 2017
.
Choufu
L.
,
Abbasi
S.
,
Pourshahbaz
H.
,
Taghvaei
P.
&
Tfwala
S.
(
2019
)
Investigation of flow, erosion, and sedimentation pattern around varied groynes under different hydraulic and geometric conditions: A numerical study
,
Water (Switzerland)
,
11
(
2
), 235.
https://doi.org/10.3390/w11020235
.
Delavari
E.
,
Saadat
M.
&
Basirat
S.
(
2022
)
Scour depth prediction around bridge abutment protected by spur dike using soft computing tools and regression methods
,
Journal of Hydraulic Structures
7
(
4
),
10
25
.
https://doi.org/10.22055/jhs.2022.38733.1192
.
Dey, S., Bose, S. K. & Sastry, G. L. (1995) Clear water scour at circular piers: a model. Journal of Hydraulic Engineering, 121 (12), 869–876
.
Duan
J. G.
,
He
L.
,
Fu
X.
&
Wang
Q.
(
2009
)
Mean flow and turbulence around experimental spur dike
,
Advances in Water Resources
,
32
(
12
),
1717
1725
.
https://doi.org/10.1016/j.advwatres.2009.09.004
.
Farshad
R.
,
Kashefipour
S. M.
,
Ghomeshi
M.
&
Oliveto
G.
(
2022
)
Temporal scour variations at permeable and angled spur dikes under steady and unsteady flows
,
Water (Switzerland)
,
14
(
20
), 3310.
https://doi.org/10.3390/w14203310
.
Garde
R. J.
,
Subramanya
K.
&
Nambudripad
K. D.
(
1961
)
Study of scour around spur-dikes
,
Journal of The Hydraulics Division
,
87
(
6
),
23
37
.
Gu
Z.
,
Cao
X.
,
Gu
Q.
&
Lu
W. Z.
(
2020
)
Exploring proper spacing threshold of non-submerged spur dikes with ipsilateral layout
,
Water (Switzerland)
,
12
(
1
),
1
13
.
https://doi.org/10.3390/w12010172
.
Gu
Z.
,
Cao
X.
,
Cao
M.
&
Lu
W.
(
2023
)
Integrative study on flow characteristics and impact of non-Submerged double spur dikes on the river system
,
International Journal of Environmental Research and Public Health
,
20
(
5
),
4262
.
Guguloth
S.
&
Pandey
M.
(
2023a
)
A review of literature on the scour process under different jets conditions
,
Journal of Irrigation and Drainage Engineering
,
149
(
10
),
4023022
.
Guguloth
S.
&
Pandey
M.
(
2023b
)
Accuracy evaluation of scour depth equations under the submerged vertical jet
,
AQUA-Water Infrastructure, Ecosystems and Society
,
72 (4), 557–575. doi: https://doi.org/10.2166/aqua.2023.015
.
Guguloth
S.
,
Pandey
M.
&
Pal
M.
(
2024
)
Application of hybrid AI models for accurate prediction of scour depths under submerged circular vertical jet
,
Journal of Hydrologic Engineering
,
29
(
3
),
4024010
.
Gupta
L. K.
,
Pandey
M.
,
Raj
P. A.
&
Shukla
A. K.
(
2023
)
Fine sediment intrusion and its consequences for river ecosystems: A review
,
Journal of Hazardous, Toxic, and Radioactive Waste
,
27
(
1
),
4022036
.
Habersack
H.
,
Hein
T.
,
Stanica
A.
,
Liska
I.
,
Mair
R.
,
Jäger
E.
,
Hauer
C.
&
Bradley
C.
(
2016
)
Challenges of river basin management: Current status of, and prospects for, the river Danube from a river engineering perspective
,
Science of The Total Environment
,
543
,
828
845
.
Iqbal
S.
,
Pasha
G. A.
,
Ghani
U.
,
Ahmed
A.
,
Farooq
R.
&
Haider
R.
(
2022
)
Investigation of flow dynamics around a combination of different head shapes of spur dikes
,
Tehnički Vjesnik
,
29
(
6
),
2111
2120
.
Karami
H.
,
Ardeshir
A.
,
Behzadian
K.
&
Ghodsian
M.
(
2011
)
Protective spur dike for scour mitigation of existing spur dikes
,
Journal of Hydraulic Research
,
49
(
6
),
809
813
.
Khorram
S.
&
Jehbez
N.
(
2023
)
A hybrid CNN
LSTM approach for monthly reservoir inflow forecasting
,
Water Resources Management
, 37,
4097
4121
.
Klingeman P. C., Kehe S. M. & Owusu Y. A.-B. (1984) Streambank Erosion Protection and Channel Scour Manipulation Using Rockfill Dikes and Gabions, Vol. 98. Corvallis, OR, USA: Water Resources Research Institute, Oregon State University.
Kothyari U. C. & Ranga Raju K. G. (2001) Scour around spur dikes and bridge abutments, Journal of Hydraulic Research, 39 (4), 367–374.
Kuhnle R. A., Alonso C. V. & Shields F. D. (2002) Local scour associated with angled spur dikes, (December), 1087–1093.
Lauchlan C. S. & Melville B. W. (2001) Riprap protection at bridge piers, Journal of Hydraulic Engineering, 127 (5), 412–418.
Lilhore U. K., Dalal S., Faujdar N., Margala M., Chakrabarti P., Chakrabarti T., Simaiya S., Kumar P., Thangaraju P. & Velmurugan H. (2023) Hybrid CNN–LSTM model with efficient hyperparameter tuning for prediction of Parkinson's disease, Scientific Reports, 13 (1), 14605.
Liu F., Xu W., Lu J., Zhang G., Gretton A. & Sutherland D. J. (2020) Learning deep kernels for non-parametric two-sample tests, International Conference on Machine Learning, 119, 6316–6326.
Nayyer S., Farzin S., Karami H. & Rostami M. (2019) A numerical and experimental investigation of the effects of combination of spur dikes in series on a flow field, Journal of the Brazilian Society of Mechanical Sciences and Engineering, 41, 1–11.
Noret C., Girard J.-C., Munodawafa M. C. & Mazvidza D. Z. (2013) Kariba dam on Zambezi river: Stabilizing the natural plunge pool, La Houille Blanche, 1, 34–41.
Pandey M. (2014) 'Scour and flow pattern around single and multiple spur dikes'.
Pandey M., Ahmad Z. & Sharma P. K. (2016) Estimation of maximum scour depth near a spur dike, Canadian Journal of Civil Engineering, 43 (3), 270–278.
Pandey M., Ahmad Z. & Sharma P. K. (2018) Scour around impermeable spur dikes: A review, ISH Journal of Hydraulic Engineering, 24 (1), 25–44. https://doi.org/10.1080/09715010.2017.1342571.
Pandey M., Lam W. H., Cui Y., Khan M. A., Singh U. K. & Ahmad Z. (2019) Scour around spur dike in sand–gravel mixture bed, Water, 11 (7), 1417.
Pandey M., Zakwan M., Sharma P. K. & Ahmad Z. (2020) Multiple linear regression and genetic algorithm approaches to predict temporal scour depth near circular pier in non-cohesive sediment, ISH Journal of Hydraulic Engineering, 26 (1), 96–103. https://doi.org/10.1080/09715010.2018.1457455.
Pandey M., Jamei M., Ahmadianfar I., Karbasi M., Lodhi A. S. & Chu X. (2022) Assessment of scouring around spur dike in cohesive sediment mixtures: A comparative study on three rigorous machine learning models, Journal of Hydrology, 606, 127330.
Pannakkong W., Thiwa-Anont K., Singthong K., Parthanadee P. & Buddhakulsomsiri J. (2022) Hyperparameter tuning of machine learning algorithms using response surface methodology: A case study of ANN, SVM, and DBN, Mathematical Problems in Engineering, 2022, 1–17.
Pham L. T., Luo L. & Finley A. (2021) Evaluation of random forests for short-term daily streamflow forecasting in rainfall- and snowmelt-driven watersheds, Hydrology and Earth System Sciences, 25 (6), 2997–3015.
Pourshahbaz H., Abbasi S., Pandey M., Pu J. H., Taghvaei P. & Tofangdar N. (2022) Morphology and hydrodynamics numerical simulation around groynes, ISH Journal of Hydraulic Engineering, 28 (1), 53–61. https://doi.org/10.1080/09715010.2020.1830000.
Rehman K., Wang Y. C., Waseem M. & Hong S. H. (2022) Tree-based machine learning models for prediction of bed elevation around bridge piers, Physics of Fluids, 34 (8).
Saikumar G., Pandey M. & Dikshit P. K. S. (2023) Natural river hazards: Their impacts and mitigation techniques. In: Pandey M., Azamathulla H. & Pu J. H. (eds) River Dynamics and Flood Hazards. Disaster Resilience and Green Growth. Singapore: Springer. https://doi.org/10.1007/978-981-19-7100-6_1.
Shah K. C., Patel H. K. & Kumar B. (2023) Local scouring around rectangular-shaped spur dike with downward seepage.
Sreedhara B. M., Patil A. P., Pushparaj J., Kuntoji G. & Naganna S. R. (2021) Application of gradient tree boosting regressor for the prediction of scour depth around bridge piers, Journal of Hydroinformatics, 23 (4), 849–863. https://doi.org/10.2166/hydro.2021.011.
Tabassum R., Guguloth S., Gondu V. R. & Zakwan M. (2024) Scour depth dynamics in varied spacing spur dike configurations: A comprehensive analysis, Physics and Chemistry of the Earth, Parts A/B/C, 135, 103638.
Vogeti R. K., Mishra B. R. & Raju K. S. (2022) Machine learning algorithms for streamflow forecasting of lower Godavari basin, H2Open Journal, 5 (4), 670–685.
Yazdi J., Sarkardeh H., Azamathulla H. M. & Ghani A. A. (2010) 3D simulation of flow around a single spur dike with free-surface flow, International Journal of River Basin Management, 8 (1), 55–62. https://doi.org/10.1080/15715121003715107.
Yılmaz K. (2014) Application of collars as a scour countermeasure for spill-through abutments. M.S. thesis. Middle East Technical University.
Zhang H., Nakagawa H., Kawaike K. & Yasuyuki B. (2009) Experiment and simulation of turbulent flow in local scour around a spur dyke, International Journal of Sediment Research, 24 (1), 33–45.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).