Scouring around a bridge pier involves removing sediment from the riverbed and banks due to water flow. This paper employs eXtreme Gradient Boosting (XGBoost) and support vector machine with particle swarm optimization (SVM-PSO) machine learning (ML) approaches to model the temporal local scour depth around bridge piers under clear water scouring (CWS) conditions. CWS datasets, incorporating bridge pier geometry, flow characteristics, and sediment properties, are collected from existing literature. Five non-dimensional influencing parameters, such as ratio of pier width to flow depth (b/y), ratio of approach mean velocity to critical velocity (V/Vc), ratio of mean sediment size to pier width (d50/b), Froude number (Fr), and standard deviation of sediment (σg), are chosen as input parameters. XGBoost and SVM-PSO ML models demonstrate superior predictive capabilities, achieving coefficient of determination (R2) values exceeding 0.90 and mean absolute percentage error (MAPE) and root mean square error (RMSE) values less than 17.07% and 0.0341, respectively. Comparison with the previous four empirical models based on statistical indices reveals that the proposed XGBoost model outperforms SVM-PSO and empirical models in predicting scour depth, so it is recommended for estimating clear water scour depth under varying temporal conditions within the specified dataset range.

  • A wide range of CWS datasets under temporal conditions is collected to develop the model.

  • The best combination of input parameters affecting the clear water scour depth is selected using the gamma test.

  • A model is built to predict the CWS depth using the eXtreme Gradient Boosting (XGBoost) and support vector machine (SVM)–particle swarm optimization (PSO) machine learning (ML) approaches.

  • The XGBoost ML approach is recommended based on the statistical indices, achieving more than 90% accuracy.

Scouring is a process by which the abrasive force of water flow removes bed sediment in rivers. Minimizing the scouring phenomenon around bridge piers and foundations is crucial in the design and construction of bridges, as failure to do so has caused bridge collapses. Ensuring the safety and cost-effectiveness of bridge construction necessitates accurately estimating scour depth around the bridge pier. Determining the scour depth involves evaluating the scour phenomenon around bridge piers. The presence of bridge piers obstructs the flow route, resulting in the erosion of bed sediment and subsequent local scouring around the piers. According to Kothyari et al. (1992), this obstruction accelerates the flow and induces vortices that remove sediment from the vicinity of the bridge pier (Baranwal et al. 2023; Baranwal & Das 2023). Scour at bridge piers can take considerable time to reach equilibrium, especially when the flow velocity is close to the critical velocity required for the initial motion of sediment particles. Ettema et al. (1998) have observed that the temporal evolution of clear water scouring follows an asymptotic trend from zero to maximum scour depth, typically characterized by three phases: initial, primary, and equilibrium periods. Hydraulic engineers encounter significant challenges when excessive local scouring during floods leads to bridge collapses. The intricate scouring phenomena at bridge sites result from various factors, including debris flow, human activities, alterations in flow patterns around bridge structures, and localized scouring in conjunction with general riverbed sediment movement. Excessive scour depth can lead to undermining shallow foundations, exposing them to flow, which poses a risk to bridge safety. Conversely, overestimating scour depth could result in unnecessary project costs.

Numerous studies have been conducted to estimate scour depth around bridge piers (Sheppard et al. 2004; Lee & Sturm 2009; Pandey et al. 2018; Baranwal & Das 2023; Baranwal & Das 2024), and the influence of flow variability on bridge pier scour has been examined (Kothyari et al. 1992; Chang et al. 2004). This research has predominantly focused on scouring around piers with uniform cross-sections under temporal conditions. The significance of temporal variations in scour depth around such uniform piers has been investigated by Yanmaz & Altinbilek (1991), Kothyari et al. (1992), Melville & Chiew (1999), and Oliveto & Hager (2002). However, fewer studies exist on time-based differences in scour around rounded bridge piers. In recent years, various unconventional methods have been proposed to model bridge pier scouring. For instance, a study by Bateni et al. (2007) utilized neural networks and neuro-fuzzy evaluations to model the complex scouring near bridge piers. It was observed that pier width (b) significantly influences equilibrium scour depth compared to other input parameters. The generalized regression neural network (GRNN) model demonstrated superior performance in modelling scour depth and highlighted the effectiveness of the GRNN model in capturing the complexities of scour processes. However, it does not offer specific insights into its applicability under distinct flow regimes (Firat 2009; Firat & Gungor 2009). Support vector regression (SVR) tuned with radial basis function has shown superiority in predicting scour depth and identifying flow depth (y) and pier width (b) as crucial influencing factors for predicting scour depth around bridge piers. Artificial neural network (ANN) models exhibited the highest level of accuracy in predicting scour depth under both clear water scouring (CWS) and live bed scouring (LBS) conditions (Baranwal & Das 2023). 
However, the same level of performance was not observed when applied to field data, suggesting that ANN models may not be as effective in real-world scenarios. Selecting a diverse range of datasets is necessary to improve the accuracy of scour depth predictive models (Etemad-Shahidi et al. 2015). Additionally, a genetic function-based model outperformed both ANN and regression-based models, specifically in predicting clear water scour depth.

Particle swarm optimization (PSO) has demonstrated superior performance to other scour depth models, particularly for CWS datasets. It has identified parameters such as the ratio of pier width to flow depth (b/y) and mean sediment size to flow depth (d50/y) as the most influential factors for accurately predicting scour depth in laboratory and field data. The hybrid model combining ANN with PSO (ANN–PSO) has outperformed models utilizing the Levenberg–Marquardt (ANN–LM) and firefly (ANN–FA) algorithms in forecasting equilibrium scour depth under CWS conditions. Furthermore, the ANN–PSO hybrid model shows promising potential for predicting improved sediment removal effectiveness in discharge scenarios (Shariati et al. 2019). The adaptive neuro-fuzzy inference system model emerges as the superior predictive model for both CWS and LBS conditions compared to the gene expression programming model and previous empirical scour depth prediction models (Choudhary et al. 2023). A scour depth prediction model for CWS and LBS conditions around bridge piers has been developed using support vector machine (SVM) techniques, yielding satisfactory results (Baranwal & Das 2023). Eini et al. (2024) modelled the scour depth by combining Bayesian optimization (BO) with SVM and eXtreme Gradient Boosting (XGBoost) and reported that BO–XGBoost performed best, with lower root mean square error (RMSE) and mean absolute percentage error (MAPE) and higher R2 values.

A comprehensive literature review highlighted the effectiveness of XGBoost and SVM–PSO machine learning (ML) approaches, which have been utilized in the different domains of water resource engineering and demonstrated exceptional performance in predicting scour depth, outperforming other ML models (Tao et al. 2021; Chang et al. 2023; Kumar et al. 2024a, b). The temporal scour depth modelling literature indicates that only a relatively limited range of datasets has been used to develop prediction models under CWS conditions. Moreover, an accurate empirical temporal CWS-based scour depth prediction model is lacking. Additionally, no ML model can yield a better scour depth ratio (SDR) for the chosen datasets. Many ML models are adequate for laboratory experimental datasets, yet they often underperform in field conditions. Consequently, this research addresses a notable gap in the existing literature by employing the XGBoost and SVM–PSO ML techniques to predict scour depth in CWS conditions. Selecting a boosting model, such as XGBoost, is advantageous because it can model complex, non-linear relationships without the need for predefined interactions. The SVM–PSO model can reduce overfitting and enhance prediction accuracy. Despite the substantial implications for hydraulic engineering, there has been limited exploration of utilizing XGBoost and SVM–PSO ML techniques for predicting scour depth under CWS conditions. By employing both ML techniques, this research provides a more reliable, robust, and flexible model for predicting temporal scour depth under clear water scouring.

The present study aims to fill these gaps, providing crucial insights for estimating the temporal SDR in CWS conditions. The scarcity of research in this domain underscores the significance of the present investigation, which contributes to the understanding and prediction of scour depth in field conditions. This study outlines the crucial parameters related to geometry, roughness, and flow characteristics. It collects a wide range of previous experimental and field temporal datasets to develop a prediction model for scour depth under CWS conditions using XGBoost and SVM–PSO ML techniques. The gamma test (GT) has been performed to identify the best input parameter combination. The performance of both ML techniques has been assessed against existing empirical models. The better-performing ML model is recommended for predicting temporal scour depth under CWS conditions, since it can handle a wide range of input parameters and thereby help mitigate the risk of bridge pier failure, potentially saving lives and reducing financial burdens on the country.

Selecting significant input parameters under temporal conditions for modelling scour depth

The important parameters influencing the scour depth (ds) over time (t) under clear water scouring (CWS) are pier width (b), mean sediment size (d50), flow velocity (V), and flow depth (y) (Lanca et al. 2013; Oğuz & Bor 2021). The properties of the flowing water, the sediment composition, and the pier characteristics collectively influence the local scour depth, as presented in Equation (1). The following functional relationship describes the scour depth:
$$d_s = f\left(b, \rho, \mu, g, d_{50}, \sigma_g, y, V, V_c, t\right) \tag{1}$$
where b is the pier width, ρ is the fluid density, μ is the fluid dynamic viscosity, g is the gravitational acceleration, d50 is the mean sediment size, σg is the geometric standard deviation of sediment, y is the flow depth, V is the flow velocity, Vc is the sediment incipient velocity, Fr is the Froude number, and t is time. The variables were grouped into non-dimensional parameters to develop an improved scour depth predictive model (Oliveto & Hager 2002; Oğuz & Bor 2021; Kumar et al. 2023a, b; Baranwal & Das 2023). The SDR (ds/y) was expressed in terms of the non-dimensional input parameters as follows:
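$$\frac{d_s}{y} = f\left(\frac{b}{y}, \frac{V}{V_c}, Fr, \frac{d_{50}}{b}, \sigma_g, \frac{Vt}{b}\right) \tag{2}$$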
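The grouping of raw variables into the non-dimensional inputs named in this section can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, SI units are assumed, and the Froude number is assumed to be the flow Froude number based on flow depth, Fr = V/(g y)^0.5, which the text does not state explicitly.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed SI units)


def nondimensional_inputs(b, y, V, Vc, d50, sigma_g, t):
    """Return the non-dimensional groups for a single observation.

    b: pier width, y: flow depth, V: approach velocity, Vc: critical velocity,
    d50: mean sediment size, sigma_g: geometric standard deviation, t: time.
    """
    return {
        "b/y": b / y,
        "V/Vc": V / Vc,
        "Fr": V / math.sqrt(G * y),  # assumption: Froude number based on flow depth
        "d50/b": d50 / b,
        "sigma_g": sigma_g,
        "Vt/b": V * t / b,
    }


def scour_depth_ratio(ds, y):
    """Scour depth ratio (SDR), the model output ds/y."""
    return ds / y


# Hypothetical observation, not taken from the collected datasets.
example = nondimensional_inputs(b=0.1, y=0.2, V=0.3, Vc=0.4,
                                d50=0.001, sigma_g=1.3, t=3600.0)
```

A record with V/Vc above 1.0 would fall outside the clear water scouring condition used in this study.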

Dataset collection

Collecting field or experimental datasets is essential for developing ML models, such as those based on XGBoost and SVM–PSO, to predict local scour depth. No new experiments were performed for the present study; instead, previously available datasets were collected from various sources, as shown in Table 1. The experimental data were obtained mainly for clear water scouring (CWS) conditions, i.e., when the ratio of approach mean velocity to critical velocity (V/Vc) is less than or equal to 1.0. In the present study, a total of 501 datasets were collected from previous studies, of which 75% were utilized for training purposes and the remaining 25% for testing purposes. Six candidate non-dimensional input parameters (b/y, V/Vc, Fr, d50/b, σg, and Vt/b) were considered for the XGBoost and SVM–PSO ML models, which use five of them as inputs, and one output parameter, the SDR, represented as ds/y.

Table 1

Various clear water scouring (CWS) parameters collected from previous publications

| Authors and year | b/y | V/Vc | Fr | d50/b | σg | Vt/b | ds/y |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Yanmaz & Altinbilek (1991) | 0.280–1.480 | 0.430–0.770 | 0.250–0.290 | 0.010–0.020 | 1.130–1.280 | 3.00–6.00 | 0.136–0.231 |
| Melville & Chiew (1999) | 0.050–5.080 | 0.400–1.000 | 0.120–0.880 | 0.004–0.080 | 1.010–3.150 | 3.33–250.50 | 0.051–0.576 |
| Mia & Nago (2003) | 0.140–1.500 | 0.700–0.900 | 0.120–0.400 | 0.002–0.060 | 1.200–1.290 | 2.33–313.63 | 0.080–0.317 |
| Chang et al. (2004) | 5.000–6.666 | 0.805–0.845 | 0.592–0.632 | 0.007–0.010 | 1.200–3.000 | 19.00–56.00 | 0.202–0.950 |
| Molinas (2004) | 0.089–1.275 | 0.205–0.999 | 0.110–0.724 | 0.003–0.095 | 1.150–3.700 | 8.00–30.00 | 0.005–0.214 |
| Sheppard et al. (2004) | 0.086–5.352 | 0.609–0.970 | 0.071–0.387 | 0.001–0.007 | 1.210–1.510 | 18.83–616.00 | 0.066–0.642 |
| Grimaldi (2005) | 0.360–0.500 | 1.000–1.000 | 0.210–0.330 | 0.005–0.017 | 1.440–1.460 | 96.00–148.35 | 0.131–0.187 |
| Lanca et al. (2013) | 0.200–2.000 | 0.278–0.347 | 0.259–0.317 | 0.002–0.017 | 1.360–1.360 | 168.00–330.00 | 0.121–0.549 |
| López et al. (2014) | 0.250–0.810 | 0.500–0.860 | 0.165–0.336 | 0.017–0.018 | 1.320 | 2.00–145.20 | 0.080–0.189 |
| Fael et al. (2016) | 0.250–1.000 | 0.960 | 0.220 | 0.004–0.017 | 1.360 | 150.00–316.32 | 0.143–0.348 |
| Aksoy et al. (2017) | 0.213–0.227 | 0.450–0.560 | 0.259–0.326 | 0.017–0.086 | 1.390 | 0. 6.68 | 0.075–0.149 |
| Pandey et al. (2020) | 0.550–0.700 | 0.640–0.900 | 0.138–0.203 | 0.003–0.004 | 1.220 | 13.53–17.50 | 0.106–0.230 |
| Yang et al. (2020) | 0.135–0.477 | 0.530 | 0.099–0.099 | 0.004–0.013 | 1.300 | 0.07–12.41 | 0.057–0.105 |

Different temporal clear water scour depth predictive equations

The temporal variation of scour depth around the bridge pier was assessed through numerical simulations and experimental investigations, focusing predominantly on laboratory-based studies. Numerous scholars and researchers have developed formulas to estimate time-based scour depth around rounded bridge piers, considering a wide range of flow characteristics and bed compositions (Oliveto & Hager 2002; Lanca et al. 2013; Franzetti et al. 2022; Tang et al. 2023). All these empirical SDR predictive models are extensively utilized to estimate scour depth around bridge piers under temporal conditions. Oliveto & Hager (2002) proposed an empirical equation for time-dependent or variable flow conditions, indicating scour depth variations over time. Lanca et al. (2013) formulated a scour depth prediction model incorporating flow depth (y), pier width (b), and mean sediment size (d50), considering various ratios of y/b and d50/b. Similarly, Franzetti et al. (2022) and Tang et al. (2023) presented scour prediction models based on y, b, flow velocity (V), and d50. The feature selection of the empirical models highlights key input parameter variables, such as geometrical parameter (b), flow parameter (y, V), bed roughness parameter (d50, σg,Vc, Fr), and temporal parameter (t) essential for feature selection in ML (Kumar et al. 2024a, b). Understanding scour dynamics from these empirical models informs the conceptual framework for input–output relationships in XGBoost and SVM–PSO approaches (Shafagh Loron et al. 2023). While empirical models provide a base, their limitations necessitate integrating advanced data-driven techniques to enhance predictive accuracy and adapt to temporal changes in scour depth. Input–output conceptualization relationships are also established in the empirical equations that guided understanding how various input factors interact to influence scour depth. 
This conceptual framework helped to design and present XGBoost and SVM–PSO ML models to predict scour depth more accurately.

Oliveto & Hager (2002) model:
(3)
where $F_d = V/(g d_{50})^{0.5}$, $z_R = (y b^2)^{1/3}$, $t_R = z_R/(\sigma^{1/3} g d_{50})^{0.5}$, and $T = t/t_R$.
Lanca et al.’s (2013) model:
(4a)
(4b)
(4c)
(4d)
(4e)
(4f)
(4g)
Tang et al.’s (2023) model:
(6)
where Δ = (ρs − ρ)/ρ is the relative submerged density of the sediment, with ρs the sediment density.

All four empirical models draw on sediment transport theory, mainly the volumetric rate of sediment transport and the critical bed-shear stress acting on sediment particles, to establish empirical relationships for scour depth over time (Mohammadpour et al. 2017; Wang et al. 2024). The Oliveto & Hager (2002) model is primarily empirical, relying on laboratory data to derive relationships between variables, which results in a straightforward model. It may not perform well under extreme conditions or with highly variable sediment compositions, leading to potential inaccuracies in scour depth predictions, and it depends on empirical constants that may limit its adaptability to diverse environmental conditions. Lanca et al.’s (2013) model emphasizes the significance of the flow depth and sediment size ratios and uses different equations depending on the ratio b/d50, allowing flexible application across various sediment sizes. However, the present study indicates that it tends to under-predict scour depth for specific ranges, particularly when b/d50 exceeds 500.0, and its complexity may complicate implementation in practical scenarios without adequate calibration. Lanca et al. (2013) and Franzetti et al. (2022) adopt a semi-analytical approach that combines empirical fitting with fundamental physical principles, addressing scale effects (Cui et al. 2019; Wang et al. 2024). By integrating empirical data with theoretical frameworks, these models improve long-term predictions and address uncertainties inherent in purely empirical models (Nandi & Das 2023). The Franzetti et al. (2022) model under-predicts scour depth in the range of 0.3–2.4, suggesting limitations in its applicability to the present range of datasets. The complexity of the model may hinder its usability in real-time applications without significant computational resources.

Tang et al.’s (2023) model introduces a fresh perspective by integrating sediment density differences and flow characteristics, which may improve predictions in specific scenarios. It has the potential to capture the effects of varying sediment compositions and flow conditions. For the present range of datasets, this empirical model revealed that it often over-predicts scour depth in the range of 0.2–1.2. The dependency of the model on multiple parameters may complicate its calibration and application in diverse domains of scour depth prediction. Franzetti et al. (2022) and Tang et al. (2023) refined their models by incorporating flow velocity and sediment gradation, enhancing predictive accuracy. The overall limitation of all empirical scour depth prediction models is that they may lack the robustness needed for long-term predictions, potentially leading to underestimations of scour depth in real-world applications. Despite advancements, challenges remain in accurately predicting scour depth due to the complex interactions of hydraulic phenomena and sediment transport, necessitating ongoing research and model refinement (Tang et al. 2023). The effectiveness of empirical approaches is often compromised by the diversity of datasets gathered from various researchers, resulting in inconsistencies and potential differences. Furthermore, the restricted applicability of empirical formulas within certain parameters impedes accurate prediction and broader application.

The present research employed previously published datasets to build XGBoost and SVM–PSO models for estimating time-dependent scour depth around the bridge pier. Initially, 75% of the datasets were employed for model training, followed by evaluating model performance using statistical indices such as MAPE, RMSE, and R2 with the remaining 25% of the datasets. A total of 501 datasets were collected under the clear water scour (CWS) condition. Figure 1 illustrates a detailed overview of the current research.
Figure 1

A flow chart illustrating the present research process.

Gamma test

The best combination of input parameters for the XGBoost and SVM–PSO ML models was selected using the GT. The GT estimates the noise variance of the output directly from the collected datasets, without assuming a particular non-linear model, and its result can be normalized through the alternative term V-ratio, which returns a scale-invariant noise estimate between 0 and 1 (Agalbjorn et al. 1997; Kumar et al. 2023a). The GT is used to find the four best combinations of input parameters. The V-ratio is defined as follows:
$$V_{ratio} = \frac{\Gamma}{\sigma^2(y)} \tag{7}$$
where Γ is the gamma statistic and σ2(y) is the variance of the output y. The V-ratio proves advantageous when comparing data from diverse sources, as it remains robust despite significant differences in output variance. Specifically, when the V-ratio approaches zero, the output can be represented by a smooth model with a high degree of predictability. The GT, a vital process component, can be conducted using the Win-Gamma software developed by Durrant (2001).
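The near-neighbour regression behind the gamma statistic and the V-ratio can be sketched in a few lines. This is a minimal illustration of the general gamma test procedure, not the Win-Gamma implementation used in the study; the function name, the choice of p = 10 near neighbours, and the synthetic data are assumptions made for the example.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Estimate the gamma statistic, gradient, and V-ratio from data.

    For k = 1..p, delta(k) is the mean squared distance to the k-th nearest
    neighbour in input space and gamma(k) is half the mean squared difference
    of the corresponding outputs. The intercept of the regression line of
    gamma on delta estimates the output noise variance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :p]
    rows = np.arange(len(y))
    delta = np.empty(p)
    gamma = np.empty(p)
    for k in range(p):
        nb = neighbours[:, k]
        delta[k] = np.mean(d2[rows, nb])
        gamma[k] = 0.5 * np.mean((y[nb] - y) ** 2)
    gradient, intercept = np.polyfit(delta, gamma, 1)
    v_ratio = intercept / np.var(y)
    return intercept, gradient, v_ratio


# Synthetic check: smooth function plus noise with variance 0.01 (hypothetical data).
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(1500, 2))
y_demo = (np.sin(2 * np.pi * X_demo[:, 0]) + X_demo[:, 1] ** 2
          + rng.normal(0.0, 0.1, size=1500))
gamma_stat, gradient, v_ratio = gamma_test(X_demo, y_demo)
```

On this synthetic example the estimated gamma statistic should lie close to the injected noise variance, and the V-ratio close to zero, which is the behaviour the text associates with a predictable input combination.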

K-fold cross-validation techniques

The K-fold cross-validation technique is commonly employed to address overfitting issues in ML and evaluate the generalization capabilities of various algorithms (Saha et al. 2021; Yoon 2021). This approach segments the data into k portions or folds (Pal & Patel 2020), with one fold serving as the test set and the remaining folds used for model training. Overfitting is a prevalent issue in computational modelling (Asteris et al. 2021), where an ML model may accurately predict outcomes for a specific dataset during training and testing but produce highly inaccurate results when applied to datasets from different experimental conditions. Consequently, validating predictive models using additional datasets is essential to ensure their reliability (Bardhan et al. 2021). For this purpose, four-fold cross-validation (Kumar et al. 2024a, b) was implemented to identify the most suitable testing datasets. This process divided the main dataset into four equal segments (CV-1–CV-4) (Hadavimoghaddam et al. 2021). In each run, one subset was designated as the testing dataset, while the remaining subsets were used as the training dataset, which minimizes the risk of overfitting to a particular training set and maximizes the utilization of the entire dataset. The primary goal of the cross-validation was to optimize the model construction for enhanced prediction accuracy during the validation stage.
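The four-fold partition described above, where each fold in turn provides roughly 25% of the records for testing and the rest for training, might be generated as in the sketch below. The function name, the shuffling, and the seed are illustrative assumptions; the study does not state how records were assigned to CV-1 to CV-4.

```python
import numpy as np


def four_fold_indices(n_samples, n_folds=4, seed=0):
    """Return a list of (train_idx, test_idx) pairs, one per fold."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, n_folds)  # nearly equal segments
    splits = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        splits.append((train_idx, test_idx))
    return splits


# 501 records, as in the collected CWS datasets.
splits = four_fold_indices(501)
```

Each record appears in exactly one test fold, so every record contributes to both training and testing across the four runs.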

Normalization of datasets

In general, the datasets must be normalized once they have been divided, because the input parameters span very different ranges and rescaling them to a common range is important for building the prediction model. All sample datasets are normalized to fit them in the interval of 0 to 1 via the following linear mapping method (Hsu et al. 1995) to improve prediction accuracy and smooth the training procedure.
(8)

where anorm is the normalized value, a is the original value, amin is the minimum value, and amax is the maximum value of the parameter. This mapping has been utilized to improve the accuracy and speed of the ML approaches by rescaling the input and output data to between 0.05 and 0.95.
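A linear min–max mapping of the kind described here can be sketched as follows. Because the text mentions both the interval 0 to 1 and the interval 0.05 to 0.95, the target range is left as a parameter; the function name and defaults are assumptions, and the sketch is not a reproduction of Equation (8).

```python
import numpy as np


def minmax_scale(a, lower=0.0, upper=1.0):
    """Linearly map the values of a onto the interval [lower, upper]."""
    a = np.asarray(a, dtype=float)
    a_min, a_max = a.min(), a.max()
    if a_max == a_min:
        raise ValueError("cannot scale a constant column")
    unit = (a - a_min) / (a_max - a_min)  # values in [0, 1]
    return lower + (upper - lower) * unit


unit_scaled = minmax_scale([1.0, 2.0, 3.0])
trimmed_scaled = minmax_scale([1.0, 2.0, 3.0], lower=0.05, upper=0.95)
```

In practice amin and amax would be taken from the training subset and reused for the testing subset, so that no information from the test records enters the scaling.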

XGBoost ML approaches

The concept of boosting gained prominence through investigation into whether a ‘weak learner’ could be improved, in the same way that a ‘poor hypothesis’ might be turned into a ‘good hypothesis.’ Boosting refines the model iteratively by focusing on the observations the weak learner struggles with. XGBoost, a robust and scalable ML method proposed by Chen & Guestrin (2016), incorporates gradient boosting at its core. Additionally, XGBoost features automated handling of missing data, block structures that allow parallelized tree construction, and continued training to refine models further. XGBoost has demonstrated efficacy in various domains, such as classification, regression, and predictive modelling tasks. Notably, XGBoost has shown promising performance in tasks that are challenging in terms of speed, memory usage, scalability, and hardware constraints. Hydraulic applications include water conveyance systems, flash flood risk assessment, groundwater level estimation, and flood prediction. While XGBoost can model a range of water-related problems, optimizing its performance requires adjusting hyperparameters, which traditionally involves trial and error and can be time-consuming and cumbersome. Figure 2 outlines the step-by-step procedure for predicting local scour depth using XGBoost. XGBoost improves model generalization by adding a regularization (penalty) term to the objective function of its gradient-boosting-based system, which helps reduce model complexity and mitigate overfitting, and by optimizing the objective through a Taylor expansion. The objective function is formulated as follows:
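$$Obj = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right) \tag{9}$$
$$\Omega\left(f_k\right) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^2 \tag{10}$$
where l is the loss function, yi is the observed value, ŷi is the predicted value, Ω(fk) is the regularization term, fk is the kth decision tree, T denotes the number of leaf nodes, ω denotes the weights of the leaf nodes, γ penalizes the number of leaf nodes, and λ penalizes the magnitude of the leaf weights. The XGBoost system adds trees iteratively and, at iteration t, approximates the objective function by a second-order Taylor expansion, as given below:
$$Obj^{(t)} \simeq \sum_{i=1}^{n}\left[l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t\left(x_i\right) + \frac{1}{2} h_i f_t^2\left(x_i\right)\right] + \Omega\left(f_t\right) \tag{11}$$
where gi and hi, given in Equations (12) and (13), are the first-order and second-order derivatives of the loss function, respectively.
$$g_i = \partial_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right) \tag{12}$$
$$h_i = \partial^2_{\hat{y}^{(t-1)}}\, l\left(y_i, \hat{y}_i^{(t-1)}\right) \tag{13}$$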
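To make the roles of the derivatives and of the penalty parameters γ and λ concrete, the sketch below evaluates them for a single leaf. It assumes a squared-error loss l = 0.5 (y − ŷ)^2, which the text does not specify, and it uses the standard closed-form leaf weight −G/(H + λ) from Chen & Guestrin (2016), which follows from minimizing the second-order objective but is not written out above. All function names are hypothetical.

```python
import numpy as np


def squared_error_grad_hess(y_true, y_pred):
    """First and second derivatives of 0.5 * (y - y_hat)^2 w.r.t. y_hat."""
    g = y_pred - y_true
    h = np.ones_like(y_true)
    return g, h


def regularization(leaf_weights, gamma, lam):
    """Penalty gamma * T + 0.5 * lambda * sum(w^2) for one tree."""
    w = np.asarray(leaf_weights, dtype=float)
    return gamma * w.size + 0.5 * lam * np.sum(w ** 2)


def optimal_leaf_weight(g, h, lam):
    """Leaf weight minimizing the second-order objective for one leaf."""
    return -np.sum(g) / (np.sum(h) + lam)


# Three hypothetical records in one leaf, starting from a zero prediction.
y_obs = np.array([1.0, 2.0, 3.0])
y_hat = np.zeros(3)
g, h = squared_error_grad_hess(y_obs, y_hat)
w_star = optimal_leaf_weight(g, h, lam=1.0)
penalty = regularization([w_star], gamma=0.1, lam=1.0)
```

With λ = 0 the leaf weight would equal the mean residual (2.0 here); the regularization shrinks it towards zero, which is how the penalty term limits model complexity.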
Figure 2

A flow chart of the local scour depth prediction model based on XGBoost ML approaches.

SVM–PSO ML approaches

PSO was first proposed by Kennedy & Eberhart (1995). In PSO, each particle in the swarm uses its own memory and the knowledge gained by the swarm as a whole to find the best solution. Each particle represents a candidate solution, which is evaluated by the fitness function to be optimized, and has a velocity that directs its movement. The best position of each particle is reached through its own experience and that of neighbouring particles over the course of the search. At every iteration, each particle is updated using two ‘best’ values, termed pbest and gbest. The PSO–SVM procedure has been used for different problems (Harish et al. 2015) because of the several advantages of PSO, such as being easy to implement compared with other algorithms, for example genetic algorithms, where crossover and mutation operations must be performed. The generalization ability of PSO–SVM compares well with that of SVM, neural networks, and fuzzy inference systems. PSO is a robust and effective procedure that mimics the natural behaviour of a flock of birds moving and finding food. Each particle, or bird, has its own position and velocity, which are continuously updated based on its distance to the best position. The following formulations give the general equations for PSO.
$$v_i^{t+1} = v_i^{t} + c_1 r_1 \left(pbest_i - x_i^{t}\right) + c_2 r_2 \left(gbest - x_i^{t}\right) \tag{14}$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1} \tag{15}$$
where xi is the position of each particle, vi is the velocity of each particle, pbest is the best value found by the particle, gbest is the global best value, r1 and r2 are random values between 0 and 1, and c1 and c2 are acceleration factors.
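The velocity and position updates defined above can be turned into a compact optimizer, sketched below. The inertia weight w is an added assumption that does not appear in the definitions above (w = 1 recovers the basic update); it is included because it keeps the velocities bounded. The function name, parameter defaults, and test function are likewise illustrative.

```python
import numpy as np


def pso_minimize(objective, bounds, n_particles=30, n_iter=200,
                 c1=1.5, c2=1.5, w=0.7, seed=0):
    """Minimize objective(x) over box bounds with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = lo.size
    x = rng.uniform(lo, hi, size=(n_particles, dim))  # positions
    v = np.zeros((n_particles, dim))                  # velocities
    pbest = x.copy()
    pbest_val = np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: personal-best and global-best attraction terms.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        # Position update, kept inside the search box.
        x = np.clip(x + v, lo, hi)
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())


# Toy objective with its minimum at the origin.
best_x, best_f = pso_minimize(lambda p: float(np.sum(p ** 2)),
                              bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In an SVM–PSO setting, the objective would typically be a validation error of the SVM expressed as a function of its hyperparameters, so that each particle position encodes one candidate hyperparameter set.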
Figure 3 shows the general procedure of the proposed combination of SVM–PSO. The input parameters are first generated from finite element method (FEM) analysis with a random reduction of element stiffness of less than 30% to create the low-damage input database. These parameters are dynamic structural responses, including natural frequencies. PSO is used to search for the best input datasets, with the number of input features increasing until the maximum number of input parameters is reached. Each step creates a population, and each individual is a dataset of random input features. These trials continue until the maximum number of input parameters is reached. SVM evaluates and determines the best dataset for the proposed SVM–PSO model. By applying SVM–PSO, the robust search scheme can remove the noise and redundant features in the input parameters, so it is appropriate for low damage intensity problems.
Figure 3

A flow chart of the local scour depth prediction model based on SVM–PSO approaches.

The XGBoost ML model effectively identifies influential variables, such as time scale and sediment characteristics, and its high performance in structured data and its ability to capture complex relationships through an ensemble of decision trees are crucial for accurate predictions (Kumar et al. 2024a, b). The SVM–PSO ML approach utilizes extensive datasets, and it was chosen for its robustness in high-dimensional spaces and enhanced parameter tuning via PSO, which enhances its prediction reliability by improving accuracy in complex scenarios (Nandi et al. 2024). Hence, while XGBoost and SVM–PSO provide advanced predictive capabilities, the reliance on data-driven models may overlook the physical principles governing scour processes, potentially leading to less interpretable results compared to hybrid models that integrate physics-based insights (Yousefpour & Wang 2024).

Statistical indices

MAPE measures the average magnitude of the relative errors produced by the developed models:
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{O_i - P_i}{O_i}\right| \tag{16}$$
The RMSE is an important metric that shows the overall magnitude of the error that occurs during the modelling process:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^2} \tag{17}$$
where Pi is the predicted value, Oi is the observed value, and n is the number of data points. The RMSE is the square root of the mean squared error of the SDR (ds/y) estimated by the present developed model.
Coefficient of determination (R2): A statistical measure of how well a model predicts the outcome. It is a number between 0 and 1 that indicates the proportion of the variation in the dependent variable that is explained by the independent variables:
(18)
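The three indices can be computed as in the sketch below. MAPE and RMSE follow their usual definitions; for R2 the sketch assumes the form 1 − SSE/SST, which is one common definition and may differ from the expression used in Equation (18). The function names and sample values are hypothetical.

```python
import numpy as np


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs((o - p) / o)))


def rmse(observed, predicted):
    """Root mean square error, in the units of the output (here ds/y)."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((p - o) ** 2)))


def r2(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SSE/SST."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    ss_res = np.sum((o - p) ** 2)
    ss_tot = np.sum((o - o.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical observed and predicted values.
obs = [1.0, 2.0, 4.0]
pred = [1.1, 1.8, 4.4]
scores = (mape(obs, pred), rmse(obs, pred), r2(obs, pred))
```

Because MAPE divides by the observed value, records with an SDR close to zero can dominate it, which is worth keeping in mind when comparing models on datasets with very small scour depths.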

The performance of the two selected ML approaches, i.e., XGBoost and SVM–PSO, is compared to check their reliability and accuracy in predicting the temporal clear water scour depth.

Result obtained from the GT and k-fold cross-validation techniques

During the GT, a series of 26 experiments was conducted, and the top 10 experiments are outlined in Table 2. The combination of five distinct input parameters with the mask (110,111) yields a more effective model than the other combinations, as evidenced by its gamma and V-ratio values, which are closest to zero, as shown in Table 2. This configuration, comprising five distinct non-dimensional parameters, was employed in the XGBoost and SVM–PSO methodologies to develop a temporal model for clear water scour depth. The evaluation of various temporal clear water scour depth models using the GT is presented in Appendix Table A1.

Table 2

Various combinations of input parameters related to CWS determined from the GT

| S. No. | Combination of input parameters | Mask | Gamma value | Gradient | Standard error | V-ratio | Nearest neighbour |
|---|---|---|---|---|---|---|---|
| 1 | (b/y, V/Vc, d50/b, σg, Vt/b) | 110,111 | −0.0001 | 0.2663 | 0.0037 | −0.0004 | |
| 2 | (b/y, Fr, d50/b, σg, Vt/b) | 101,111 | −0.0115 | 0.4053 | 0.0083 | −0.0461 | 10 |
| 3 | (b/y, V/Vc, Fr, σg) | 111,010 | −0.0118 | 0.4017 | 0.0092 | −0.0469 | 10 |
| 4 | (b/y, V/Vc, d50/b, Vt/b) | 110,101 | 0.0030 | 0.2440 | 0.0056 | 0.0120 | 10 |
| 5 | (b/y, V/Vc, Fr, σg, Vt/b) | 111,011 | 0.0033 | 0.2026 | 0.0059 | 0.0132 | |
| 6 | (b/y, V/Vc, d50/b, σg, Vt/b) | 110,111 | 0.0038 | 0.1915 | 0.0034 | 0.0154 | 12 |
| 7 | (b/y, V/Vc, Fr, σg, Vt/b) | 111,011 | 0.0050 | 0.1858 | 0.0035 | 0.0199 | 10 |
| 8 | (b/y, V/Vc, Fr, σg) | 111,010 | 0.0052 | 0.2305 | 0.0056 | 0.0207 | 12 |
| 9 | (b/y, V/Vc, Fr) | 111,000 | 0.0066 | 0.3233 | 0.0053 | 0.0264 | |
| 10 | (b/y, V/Vc, d50/b, Vt/b) | 110,101 | 0.0073 | 0.1949 | 0.0034 | 0.0290 | 12 |
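
The quantities reported in Table 2 can be illustrated with a minimal Python sketch of the gamma test in its commonly described form: for each of the p nearest neighbours, the mean squared input distance (delta) and half the mean squared output difference (gamma) are computed, and a straight line is fitted through the (delta, gamma) pairs. The intercept is the gamma value, the slope is the gradient, and the V-ratio is the gamma value divided by the output variance. The function names, the mask ordering (b/y, V/Vc, Fr, d50/b, σg, Vt/b), and the noise-free demo data are assumptions made for illustration; this is not the winGamma implementation used in the study.

```python
def apply_mask(row, mask):
    """Keep the entries of `row` whose mask bit is '1' (mask such as '110,111')."""
    bits = mask.replace(",", "")
    return [value for value, bit in zip(row, bits) if bit == "1"]


def gamma_test(inputs, outputs, p=10):
    """Return (gamma value, gradient, V-ratio) using p nearest neighbours."""
    m = len(inputs)
    delta = [0.0] * p
    gamma = [0.0] * p
    for i in range(m):
        # Squared input-space distances from point i to all other points.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])), j)
            for j in range(m) if j != i
        )
        for k in range(p):
            dist_sq, j = neighbours[k]
            delta[k] += dist_sq / m
            gamma[k] += (outputs[j] - outputs[i]) ** 2 / (2 * m)
    # Least-squares line gamma = intercept + gradient * delta.
    mean_d = sum(delta) / p
    mean_g = sum(gamma) / p
    sxx = sum((d - mean_d) ** 2 for d in delta)
    gradient = sum((d - mean_d) * (g - mean_g) for d, g in zip(delta, gamma)) / sxx
    intercept = mean_g - gradient * mean_d
    mean_y = sum(outputs) / m
    var_y = sum((y - mean_y) ** 2 for y in outputs) / m
    return intercept, gradient, intercept / var_y


# Hypothetical noise-free data: y = 2x, so the gamma value should be close to zero.
rows = [[i / 20.0] for i in range(21)]
targets = [2.0 * r[0] for r in rows]
gamma_value, gradient, v_ratio = gamma_test(rows, targets, p=5)
print(gamma_value, gradient, v_ratio)

# Selecting the five inputs of the best combination from a six-column record.
print(apply_mask(["b/y", "V/Vc", "Fr", "d50/b", "sigma_g", "Vt/b"], "110,111"))
```

A gamma value and V-ratio near zero indicate that the selected inputs can explain the output with little residual noise, which is how the first row of Table 2 is interpreted.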

Figure 4 shows the R2 values obtained from the four-fold cross-validation (CV-1 to CV-4) of the XGBoost and SVM–PSO models. The dataset was divided into training (75%) and testing (25%) subsets. The XGBoost model exhibits less variation in R2 across CV-1 to CV-4, consistently maintaining values above 0.95, which indicates that it was more effective than the SVM–PSO model in identifying the optimal testing dataset.
Figure 4

A bar chart illustrating the outcomes of four-fold cross-validation (CV-1 to CV-4) (determined by R2 value).
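
The four-fold procedure can be sketched in Python as follows. Each fold holds out roughly 25% of the 501 datasets for testing and trains on the remaining 75%. A one-variable least-squares line stands in for the XGBoost and SVM–PSO models here, and the synthetic data, seed, and function names are assumptions made only to keep the example self-contained.

```python
import random


def kfold_indices(n, k=4, seed=0):
    """Shuffle 0..n-1 and deal the indices into k test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def r2_score(observed, predicted):
    """Assumed R2 form: 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cross_validate(xs, ys, k=4, seed=0):
    """Return the testing R2 of each fold (CV-1 ... CV-k)."""
    scores = []
    for test_idx in kfold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        predicted = [slope * xs[i] + intercept for i in test_idx]
        scores.append(r2_score([ys[i] for i in test_idx], predicted))
    return scores


# Hypothetical, perfectly linear data with 501 records.
xs = [i / 100.0 for i in range(501)]
ys = [0.5 * x + 0.1 for x in xs]
folds = kfold_indices(501, k=4, seed=0)
scores = cross_validate(xs, ys, k=4, seed=0)
print([len(f) for f in folds], scores)
```

A small spread of the per-fold R2 values, as reported for XGBoost in Figure 4, suggests that the model performance does not depend strongly on which quarter of the data is held out.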


Performance of XGBoost ML approaches

In XGBoost modelling, 75% of the 501 datasets were used for training and the remaining 25% for testing. Six input parameters (b/y, V/Vc, Fr, d50/b, σg, and t) were used to train and test the model of clear water scour depth under temporal conditions. To check the strength of the XGBoost ML approach, three statistical indices, MAPE, RMSE, and R2, were computed for the training and testing datasets, and an error analysis was performed to select the best model. The outcomes for scour depth prediction are presented in Table 3, which lists the MAPE, RMSE, and R2 values for the training and testing datasets under different combinations of learning rate and maximum depth. A learning rate of 0.1 with a maximum depth of 6.0 provides the best testing results among the tested combinations. The performance of various temporal clear water scour depth models using the XGBoost ML approach is shown in Appendix Table A2.

Table 3

Training and testing datasets result for different parameters using (b/y, V/Vc, Fr, d50/b, σg, Vt/b) by the XGBoost model

| S. No. | Learning rate | Max depth | Training MAPE | Training RMSE | Training R2 | Testing MAPE | Testing RMSE | Testing R2 |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.1 | 6.0 | 1.57 | 0.0051 | 0.95245 | 12.69 | 0.0294 | 0.96777 |
| 2 | 0.4 | 7.0 | 1.34 | 0.0049 | 0.96192 | 12.84 | 0.0305 | 0.96533 |
| 3 | 0.3 | 5.0 | 1.46 | 0.0050 | 0.94188 | 13.43 | 0.0305 | 0.96527 |
| 4 | 0.4 | 9.0 | 1.10 | 0.0047 | 0.94138 | 13.72 | 0.0306 | 0.96520 |
| 5 | 0.2 | 8.0 | 1.32 | 0.0049 | 0.95618 | 13.06 | 0.0308 | 0.96478 |
| 6 | 0.1 | 8.0 | 1.45 | 0.0050 | 0.96176 | 13.41 | 0.0308 | 0.96464 |
| 7 | 0.4 | 10.0 | 1.12 | 0.0047 | 0.95129 | 13.72 | 0.0310 | 0.96428 |
| 8 | 0.4 | 6.0 | 1.25 | 0.0048 | 0.95168 | 13.85 | 0.0314 | 0.96336 |
| 9 | 0.4 | 8.0 | 1.22 | 0.0047 | 0.93123 | 14.48 | 0.0324 | 0.96102 |
| 10 | 0.4 | 5.0 | 1.37 | 0.0048 | 0.95616 | 16.09 | 0.0382 | 0.94577 |
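
The selection of the hyperparameter pair can be reproduced from the testing columns of Table 3 with a few lines of Python. The ranking rule used below (lowest testing RMSE) is an assumption made for illustration; the values themselves are copied from Table 3.

```python
from collections import namedtuple

Trial = namedtuple("Trial", "learning_rate max_depth test_mape test_rmse test_r2")

# Testing-set results copied from Table 3.
table3 = [
    Trial(0.1, 6, 12.69, 0.0294, 0.96777),
    Trial(0.4, 7, 12.84, 0.0305, 0.96533),
    Trial(0.3, 5, 13.43, 0.0305, 0.96527),
    Trial(0.4, 9, 13.72, 0.0306, 0.96520),
    Trial(0.2, 8, 13.06, 0.0308, 0.96478),
    Trial(0.1, 8, 13.41, 0.0308, 0.96464),
    Trial(0.4, 10, 13.72, 0.0310, 0.96428),
    Trial(0.4, 6, 13.85, 0.0314, 0.96336),
    Trial(0.4, 8, 14.48, 0.0324, 0.96102),
    Trial(0.4, 5, 16.09, 0.0382, 0.94577),
]

# Assumed ranking rule: the lowest testing RMSE wins.
best = min(table3, key=lambda t: t.test_rmse)
print(best.learning_rate, best.max_depth)
```

For these values the same row also has the lowest testing MAPE and the highest testing R2, so the choice does not depend on which of the three indices is used for ranking.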

Performance of SVM–PSO ML approaches

In SVM–PSO modelling, 75% and 25% of the data were randomly selected for training and testing, respectively. Different values of the PSO acceleration coefficients, C1 (cognitive factor) and C2 (social factor), were tried together with the other SVM–PSO parameters, and the corresponding results are shown in Table 4. The sum of C1 and C2 is approximately 4.0. In Table 4, the best combination is observed at C1 = 0.3, C2 = 3.7, P = 25.0, C = 132.0, γ = 3.7, ε = 0.6, and d = 3.0. For this combination, the testing values of the statistical indices MAPE, RMSE, and R2 are 17.070 (lowest), 0.03414 (lowest), and 0.94844 (highest), respectively, which is the best result among the selected parameter combinations. The performance of various temporal clear water scour depth models using the SVM–PSO ML approach is given in Appendix Table A3.

Table 4

Training and testing data results for different parameters using (b/y, V/Vc, Fr, d50/b, σg,Vt/b) by the SVM–PSO model

| S. No. | C1 | C2 | P | C | γ | ε | d | Training MAPE | Training RMSE | Training R2 | Testing MAPE | Testing RMSE | Testing R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.3 | 3.7 | 25.0 | 132.0 | 3.7 | 0.6 | 3.0 | 7.231 | 0.02085 | 0.949848 | 17.070 | 0.03414 | 0.94844 |
| 2 | 2.4 | 1.6 | 25.0 | 380.0 | 35.0 | 2.3 | 3.0 | 7.207 | 0.02081 | 0.949849 | 17.075 | 0.03416 | 0.94838 |
| 3 | 0.8 | 3.6 | 25.0 | 48.0 | 39.0 | 2.6 | 3.0 | 7.110 | 0.02068 | 0.949853 | 17.149 | 0.03417 | 0.94833 |
| 4 | 0.8 | 3.3 | 25.0 | 749.0 | 76.0 | 9.9 | 3.0 | 7.241 | 0.02087 | 0.949847 | 17.107 | 0.03418 | 0.94830 |
| 5 | 3.4 | 0.6 | 25.0 | 225.0 | 47.0 | 3.4 | 3.0 | 7.072 | 0.02063 | 0.949854 | 17.153 | 0.03419 | 0.94827 |
| 6 | 1.3 | 2.8 | 25.0 | 625.0 | 97.0 | 1.8 | 3.0 | 6.806 | 0.02024 | 0.949865 | 17.537 | 0.03479 | 0.94644 |
| 7 | 1.8 | 2.3 | 25.0 | 375.0 | 24.0 | 9.7 | 3.0 | 9.300 | 0.02581 | 0.949642 | 17.176 | 0.03540 | 0.94456 |
| 8 | 2.4 | 1.7 | 25.0 | 214.0 | 83.0 | 0.9 | 3.0 | 4.522 | 0.01748 | 0.949925 | 22.398 | 0.03990 | 0.92955 |
| 9 | 1.0 | 3.0 | 25.0 | 250.0 | 3.0 | 0.3 | 3.0 | 4.525 | 0.01747 | 0.949925 | 22.550 | 0.03997 | 0.92931 |
| 10 | 0.9 | 3.1 | 25.0 | 253.0 | 4.8 | 1.7 | 3.0 | 4.496 | 0.01746 | 0.949925 | 22.660 | 0.03998 | 0.92928 |
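
The role of the cognitive factor C1 and the social factor C2 can be illustrated with a minimal particle swarm optimizer in Python. This is a generic sketch, not the implementation used in the study: the inertia weight w, the toy objective (a stand-in for the validation error of an SVM as a function of its parameters), the search bounds, and the reading of P = 25 as the swarm size are all assumptions.

```python
import random


def pso_minimize(objective, bounds, swarm_size=25, c1=2.0, c2=2.0, w=0.4, iters=50, seed=0):
    """Minimise `objective` inside box `bounds`; returns (best position, best value, history)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]
    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # cognitive pull
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # social pull
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # keep inside the box
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def toy_objective(params):
    """Hypothetical error surface with its minimum at (1.0, -2.0)."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2


bounds = [(-5.0, 5.0), (-5.0, 5.0)]
# C1 = 0.3 and C2 = 3.7 are taken from the first row of Table 4.
best_pos, best_val, history = pso_minimize(toy_objective, bounds, swarm_size=25, c1=0.3, c2=3.7)
print(best_pos, best_val)
```

In an SVM–PSO workflow the toy objective would be replaced by a function that trains the SVM with the candidate parameters and returns its error, so each particle position corresponds to one candidate parameter set.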

Note. C1, cognitive factor; C2, social factor; d, degree.

Sensitivity analysis of input parameters to model the temporal scour depth

The sensitivity analysis shows how variations in the input parameters (b/y, V/Vc, Fr, d50/b, Vt/b, σg) affect the model output, the SDR (ds/y).

Conducting sensitivity analysis is crucial in this study as it aids in making informed decisions and enhances our understanding of model robustness. The output parameter under consideration is the SDR (ds/y), and the Cosine Amplitude Method (CAM) (Jitchaijaroen et al. 2024) is employed to assess the influence of the input factors on ds/y. Equation (19) shows how this method evaluates the sensitivity of an input parameter by examining the correlation Rij between pairs of input and output datasets:
$$R_{ij} = \frac{\sum_{k=1}^{n} d_{ik}\,p_{jk}}{\sqrt{\sum_{k=1}^{n} d_{ik}^{2}\,\sum_{k=1}^{n} p_{jk}^{2}}} \tag{19}$$
where dik represents the ith input parameter corresponding to the kth dataset, pjk denotes the jth output parameter, specifically the SDR (ds/y), for the kth dataset, and n is the total number of datasets utilized. In accordance with the CAM, an Rij value nearing 1.0 signifies a strong correlation between the input and output parameters, emphasizing the substantial impact of that input on the final output across the proposed models, XGBoost and SVM–PSO. The importance of the input parameters with respect to both the actual and model-predicted values is illustrated in Figure 5.
Figure 5

Significance of input parameters in predicting SDR (ds/y).
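
A minimal Python sketch of the Cosine Amplitude Method in its standard form is given below: the strength Rij is the sum of products of an input series and the output series, divided by the square root of the product of their sums of squares. The parameter values are hypothetical and the function name is an assumption.

```python
import math


def cosine_amplitude(input_values, output_values):
    """CAM strength between one input series and the output series."""
    numerator = sum(d * p for d, p in zip(input_values, output_values))
    denominator = math.sqrt(
        sum(d * d for d in input_values) * sum(p * p for p in output_values)
    )
    return numerator / denominator


# Hypothetical records: each list holds one parameter across four datasets.
inputs = {
    "b/y": [0.2, 0.4, 0.8, 1.6],
    "Fr": [0.30, 0.25, 0.35, 0.20],
}
sdr = [0.3, 0.6, 1.1, 2.0]

strengths = {name: cosine_amplitude(values, sdr) for name, values in inputs.items()}
for name, r in sorted(strengths.items(), key=lambda item: item[1], reverse=True):
    print(name, round(r, 3))
```

For non-negative series such as these, the strength lies between 0 and 1 and equals 1 only when the input is exactly proportional to the output.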


The findings illustrated in Figure 5 indicate that, for both the XGBoost and SVM–PSO models, the SDR (ds/y) is influenced, in decreasing order, by b/y, Vt/b, Fr, d50/b, σg, and V/Vc. The Rij values are b/y = 6.194, Vt/b = 3.312, Fr = 3.181, d50/b = 2.312, σg = 2.128, and V/Vc = 1.406 for the XGBoost model, and b/y = 6.242, Vt/b = 3.371, Fr = 3.225, d50/b = 2.350, σg = 2.150, and V/Vc = 1.426 for the SVM–PSO model. The XGBoost model demonstrates superior performance compared to the SVM–PSO model.

Comparison of the present models with the previous scour depth predictive models

The scatter plots of observed and predicted SDR for the XGBoost and SVM–PSO models are shown in Figure 6. The proposed XGBoost model performed well, as its predicted SDR values lie near the best-fit line. It over-predicted a few values in the SDR range 0.3–1.5 and under-predicted a few values in the range 1.8–2.4. The SVM–PSO ML model over-predicted a few SDR values in the range 0.4–1.2 and under-predicted a few values between 0.7 and 2.4, although its predictions also lie near the best-fit line for SDR values of 0.3–2.1.
Figure 6

Scatter plot of predicted vs. observed SDR for SVM–PSO (present model) and XGBoost (present model).

Figure 7 compares the SDR values predicted by the presently developed XGBoost and SVM–PSO ML models with those predicted by four existing empirical equations of previous researchers. Across all datasets, the present XGBoost and SVM–PSO ML models predicted better SDR values than the existing empirical models. Their predicted SDR values lie near the best-fit line, meaning that the developed models are efficient in predicting the SDR. Among the previous empirical models, the model by Oliveto & Hager (2002) predicted the scour depth best, except for a few values in the range 0.2–1.5. Lanca et al.’s (2013) model under-predicted SDR values of 1.2–1.4, and the models by Franzetti et al. (2022) and Tang et al. (2023) under-predicted SDR values of 0.3–2.4. Tang et al. (2023) also over-predicted the SDR in the range 0.2–1.2 for the present range of datasets (Table 1). Hence, the empirical models by Franzetti et al. (2022), Tang et al. (2023), and Lanca et al. (2013) predict poorly for the present study, possibly because they were developed for ranges of data different from the presently selected datasets (Table 1).

The likely reasons for the poor performance of the empirical models are as follows. Oliveto & Hager (2002) developed their equation by utilizing a mix of laboratory and field data with moderate to high flow velocities up to 5.0 m/s and a range of sediment sizes. Despite its broader applicability, this model may still be constrained by its dependence on a relatively limited dataset and its inability to thoroughly capture the complexities of scour processes under extreme conditions. Lanca et al. (2013) developed a model based on a restricted range of flow velocity (0.5–2.5 m/s) and sediment size (fine to medium sand). As a result, this model may underperform when faced with higher flow velocities or larger sediment particles. Franzetti et al. (2022) formulated their equation for scour prediction in coastal areas, specifically considering wave action and tidal currents based on field observations from particular coastal sites. The limitations of this model become apparent when it is applied to river conditions or different sediment sizes. Tang et al. (2023) developed a model for different pier shapes and configurations, generally with approach flow velocities up to 3.0 m/s. It may struggle to accurately represent the intricacies of scour under diverse field conditions, especially in larger-scale infrastructure scenarios.
Figure 7

Scatter plot of predicted vs. observed SDR estimated using the existing empirical equations of different researchers and the present developed model.


Comparison of range-wise statistical error analysis of presently developed model results with previous existing scour depth models

The statistical measures of error, MAPE, RMSE, and R2 are calculated for the present XGBoost and SVM–PSO ML models. These calculations are compared to four scour depth predictive models: Oliveto & Hager (2002), Lanca et al. (2013), Franzetti et al. (2022), and Tang et al. (2023) under different b/y, V/Vc, and Fr conditions. The evaluation of range-wise errors across input non-dimensional parameters such as b/y, V/Vc, and Fr allows for assessing the efficacy of the presently developed models, i.e., XGBoost and SVM–PSO, compared to other existing empirical equations.

To perform the range-wise error analysis, the following ranges of the input parameters b/y, V/Vc, and Fr are selected:

  • (a) b/y ≤ 0.25, 0.25 < b/y ≤ 0.5, 0.5 < b/y ≤ 1.5, and b/y > 1.5

  • (b) V/Vc ≤ 0.25, 0.25 < V/Vc ≤ 0.5, 0.50 < V/Vc ≤ 0.75, and 0.75 < V/Vc ≤ 1.0

  • (c) Fr ≤ 0.2, 0.2 < Fr ≤ 0.4, 0.4 < Fr ≤ 0.6, and 0.6 < Fr ≤ 1.0
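
The range-wise grouping listed above can be sketched in Python as follows. Each record is assigned to a range by its parameter value, and an error index is then computed per range (only RMSE is shown here). The records are hypothetical and the function names are assumptions.

```python
import math
from collections import defaultdict

# Upper edges of the ranges listed above.
B_Y_EDGES = [0.25, 0.5, 1.5]          # the last b/y range is open: b/y > 1.5
V_VC_EDGES = [0.25, 0.5, 0.75, 1.0]
FR_EDGES = [0.2, 0.4, 0.6, 1.0]


def range_label(value, upper_edges, name):
    """Return the label of the range that contains `value`."""
    lower = None
    for upper in upper_edges:
        if value <= upper:
            if lower is None:
                return f"{name} <= {upper}"
            return f"{lower} < {name} <= {upper}"
        lower = upper
    return f"{name} > {lower}"


def rangewise_rmse(records, upper_edges, name):
    """records: iterable of (parameter value, observed ds/y, predicted ds/y)."""
    squared_errors = defaultdict(list)
    for value, observed, predicted in records:
        squared_errors[range_label(value, upper_edges, name)].append((predicted - observed) ** 2)
    return {label: math.sqrt(sum(sq) / len(sq)) for label, sq in squared_errors.items()}


# Hypothetical records, for illustration only.
records = [(0.2, 1.0, 1.1), (0.2, 2.0, 1.9), (0.4, 1.0, 1.0), (2.0, 1.5, 1.8)]
errors = rangewise_rmse(records, B_Y_EDGES, "b/y")
print(errors)
```

The same helper can be reused for V/Vc and Fr by passing the corresponding list of edges, and MAPE or R2 can be computed per group in the same way.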

Tables 5–7 present the MAPE, RMSE, and R2 values for the XGBoost and SVM–PSO ML models, along with the four scour depth prediction equations, across various ranges of b/y, V/Vc, and Fr. For b/y ≤ 0.25 (Table 5), Tang et al. (2023) exhibit the highest MAPE value, while XGBoost and SVM–PSO (present models) show the lowest. Additionally, Tang et al. (2023) demonstrate the highest RMSE, followed by Lanca et al. (2013) and Franzetti et al. (2022), and the lowest R2 value. In the 0.25 < b/y ≤ 1.5 ranges, Tang et al. (2023) show higher MAPE and RMSE values, while, for b/y > 1.5, Franzetti et al. (2022) have the highest MAPE, followed by Tang et al. (2023). The XGBoost model developed in this study consistently outperforms the other models across all b/y ranges, with lower MAPE and RMSE values and higher R2 values.

Table 6 displays the performance metrics for the proposed XGBoost and SVM–PSO models, as well as the four scour depth prediction equations, within specific V/Vc ranges. For V/Vc ≤ 0.25, Tang et al. (2023) exhibit high MAPE and RMSE values and a low R2 value. In the 0.25 < V/Vc ≤ 0.5 range, Lanca et al. (2013) and Franzetti et al. (2022) share the same MAPE value. The SVM–PSO model shows higher error values than the XGBoost model in all ranges except 0.50 < V/Vc ≤ 0.75. For the remaining V/Vc ranges, Tang et al. (2023) demonstrate higher error values, while the XGBoost model shows lower error values.

Table 7 reveals that for Fr ≤ 0.2, the XGBoost model (present model) has the lowest RMSE and highest R2 values. Furthermore, for the 0.4 < Fr ≤ 0.6 range, the SVM–PSO model (present model) exhibits the highest R2 value. For 0.6 < Fr ≤ 1.0, Franzetti et al. (2022) show the highest MAPE value among all models. Overall, the XGBoost model consistently demonstrates higher R2 values and lower MAPE and RMSE values, while Tang et al. (2023) show lower R2 values and higher MAPE and RMSE values.

The study concludes that the XGBoost model demonstrates superior performance in both prediction accuracy and adaptability to varying flow and sediment conditions. In contrast, the previous empirical models produce unsatisfactory outcomes with high errors and low R2 values.

Table 5

Calculation of the errors in various approaches for determining ds/y within the selected ranges of b/y

| Different approaches | b/y ≤ 0.25 | 0.25 < b/y ≤ 0.5 | 0.5 < b/y ≤ 1.5 | b/y > 1.5 |
|---|---|---|---|---|
| XGBoost (present model) | 22.23 / 0.23 / 0.88 | 13.52 / 0.11 / 0.89 | 15.18 / 0.19 / 0.94 | 14.35 / 0.26 / 0.92 |
| SVM–PSO (present model) | 47.29 / 0.34 / 0.76 | 37.17 / 0.42 / 0.69 | 32.58 / 0.40 / 0.78 | 25.46 / 0.57 / 0.81 |
| Oliveto & Hager (2002) | 25.01 / 0.56 / 0.50 | 101.25 / 0.61 / 0.34 | 112.98 / 0.70 / 0.21 | 95.25 / 0.87 / 0.32 |
| Lanca et al. (2013) | 44.64 / 0.66 / 0.47 | 119.67 / 0.81 / 0.32 | 125.76 / 0.76 / 0.20 | 98.16 / 0.89 / 0.31 |
| Franzetti et al. (2022) | 93.55 / 0.40 / 0.39 | 152.58 / 0.51 / 0.29 | 176.15 / 0.75 / 0.22 | 215.28 / 1.03 / 0.23 |
| Tang et al. (2023) | 115.89 / 1.14 / 0.19 | 168.52 / 1.58 / 0.23 | 250.59 / 2.65 / 0.16 | 205.85 / 1.98 / 0.18 |

Note. Three values presented in each cell represent MAPE, RMSE, and R2, respectively.

Table 6

Calculation of the errors in various approaches for determining ds/y within the selected ranges of V/Vc

| Different approaches | V/Vc ≤ 0.25 | 0.25 < V/Vc ≤ 0.50 | 0.50 < V/Vc ≤ 0.75 | 0.75 < V/Vc ≤ 1.0 |
|---|---|---|---|---|
| XGBoost (present model) | 26.68 / 0.18 / 0.86 | 10.82 / 0.13 / 0.91 | 12.14 / 0.23 / 0.93 | 11.48 / 0.31 / 0.82 |
| SVM–PSO (present model) | 56.75 / 0.27 / 0.81 | 29.74 / 0.50 / 0.85 | 39.10 / 0.32 / 0.91 | 30.55 / 0.68 / 0.87 |
| Oliveto & Hager (2002) | 30.01 / 0.45 / 0.60 | 81.00 / 0.73 / 0.27 | 135.58 / 0.56 / 0.25 | 114.30 / 0.70 / 0.26 |
| Lanca et al. (2013) | 53.57 / 0.79 / 0.38 | 95.74 / 0.65 / 0.38 | 150.91 / 0.61 / 0.24 | 117.79 / 1.07 / 0.25 |
| Franzetti et al. (2022) | 74.84 / 0.52 / 0.31 | 122.06 / 0.61 / 0.23 | 211.38 / 0.60 / 0.26 | 172.22 / 1.24 / 0.28 |
| Tang et al. (2023) | 92.71 / 1.37 / 0.23 | 134.82 / 1.90 / 0.18 | 300.71 / 3.18 / 0.13 | 247.02 / 2.38 / 0.22 |

Note. Three values presented in each cell represent MAPE, RMSE, and R2, respectively.

Table 7

Calculation of the errors in various approaches for determining ds/y within the selected ranges of Fr

| Different approaches | Fr ≤ 0.2 | 0.2 < Fr ≤ 0.4 | 0.4 < Fr ≤ 0.6 | 0.6 < Fr ≤ 1.0 |
|---|---|---|---|---|
| XGBoost (present model) | 18.45 / 0.29 / 0.88 | 17.17 / 0.14 / 0.94 | 19.28 / 0.16 / 0.89 | 18.22 / 0.22 / 0.89 |
| SVM–PSO (present model) | 60.06 / 0.28 / 0.87 | 30.85 / 0.53 / 0.77 | 27.04 / 0.51 / 0.91 | 32.33 / 0.47 / 0.77 |
| Oliveto & Hager (2002) | 31.76 / 0.46 / 0.42 | 84.04 / 0.77 / 0.28 | 93.77 / 0.58 / 0.27 | 120.97 / 0.72 / 0.27 |
| Lanca et al. (2013) | 37.05 / 0.55 / 0.39 | 99.33 / 1.03 / 0.27 | 159.72 / 0.63 / 0.25 | 81.47 / 1.13 / 0.39 |
| Franzetti et al. (2022) | 77.65 / 0.33 / 0.50 | 193.78 / 0.65 / 0.37 | 146.20 / 0.62 / 0.28 | 273.41 / 0.85 / 0.29 |
| Tang et al. (2023) | 96.19 / 1.45 / 0.24 | 214.02 / 1.31 / 0.19 | 207.99 / 2.20 / 0.20 | 261.43 / 2.51 / 0.15 |

Note. Three values presented in each cell represent MAPE, RMSE, and R2, respectively.

The correlation coefficients of the XGBoost and SVM–PSO ML models, in comparison with the four empirical models, are depicted in Figure 8. The analysis shows that the XGBoost model outperforms the other approaches in predictive accuracy: its correlation coefficient for the predicted SDR closely aligns with the observed SDR, with the SVM–PSO model following in performance. Oliveto & Hager (2002) demonstrate the highest correlation value among the empirical models. Conversely, Tang et al.’s (2023) empirical model exhibits the lowest correlation coefficient, indicating poor SDR prediction relative to the observed values. The empirical models by Franzetti et al. (2022), Lanca et al. (2013), and Oliveto & Hager (2002) are also shown in Figure 8. Consequently, the XGBoost model (present model) gives a satisfactory SDR prediction, while Tang et al.’s (2023) empirical model gives an inadequate SDR prediction for the dataset range of the present study.
Figure 8

A Taylor diagram illustrating the comparison of SDR predictions using XGBoost and SVM–PSO ML approaches and the previous empirical model.
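
A Taylor diagram summarizes each model with three statistics: the correlation coefficient with the observations, the standard deviation of the predictions, and the centred root mean square difference. The minimal Python sketch below computes them and checks the geometric identity that links them; the sample values are hypothetical and the function name is an assumption.

```python
import math


def taylor_stats(observed, predicted):
    """Return (correlation, std of observed, std of predicted, centred RMS difference)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    corr = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / (n * std_o * std_p)
    crmsd = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for o, p in zip(observed, predicted)) / n
    )
    return corr, std_o, std_p, crmsd


# Hypothetical SDR values, for illustration only.
observed = [0.4, 0.9, 1.3, 2.0]
predicted = [0.5, 0.8, 1.4, 1.9]
corr, std_o, std_p, crmsd = taylor_stats(observed, predicted)

# Law-of-cosines identity that makes the diagram possible.
identity_gap = crmsd ** 2 - (std_o ** 2 + std_p ** 2 - 2 * std_o * std_p * corr)
print(corr, std_o, std_p, crmsd, identity_gap)
```

A model plots close to the observation point when its correlation is near 1 and its standard deviation is near that of the observed SDR.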

In the violin diagram, the thick black box indicates the interquartile range, with the median represented by a central white dot. The whiskers show the 95% confidence interval, while the violin shape depicts the frequency distribution of the values. Figure 9 depicts the SDR predicted by the four empirical models and the two presently developed ML approaches, i.e., XGBoost and SVM–PSO, against the observed SDR. The density curves in the violin plots of the observed and predicted values convey the skewness, variability, and symmetry of each distribution. A notable distinction in distribution shape and amplitude between the observed and predicted scour depth ratios (ds/y) is evident in Figure 9. Both presently developed models, XGBoost and SVM–PSO, provide precise SDR predictions, and their predicted SDR distributions closely match the observed SDR. Furthermore, the model by Franzetti et al. (2022) exhibits the shortest interquartile range, followed by that of Oliveto & Hager (2002). The empirical model by Franzetti et al. (2022) yields a lower median SDR prediction, while the models by Oliveto & Hager (2002), Lanca et al. (2013), and Tang et al. (2023) are approximately near the median value. The distribution of values predicted by Tang et al. (2023) extends only up to 1.0, which is the lowest among all the models. This shows its poor prediction capability for the present range of datasets, as depicted in Figure 9.
Figure 9

A violin plot depicting the distribution of SDR values across various scour depth prediction models.
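
The box inside each violin can be reproduced numerically from the median and the quartiles. The short Python sketch below uses the standard library `statistics` module; the SDR values are hypothetical, and the choice of the inclusive quartile method is an assumption, since plotting packages differ in how they interpolate quartiles.

```python
import statistics


def violin_summary(values):
    """Return the median, first quartile, third quartile, and interquartile range."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


# Hypothetical predicted SDR values for one model.
summary = violin_summary([0.2, 0.5, 0.9, 1.4, 2.1])
print(summary)
```

Comparing these summaries between the observed SDR and each model's predictions gives a numerical counterpart to the visual comparison of box length and median position.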


In the present study, 501 datasets for clear water scouring (CWS) under temporal conditions were collected and divided into 75% (375 datasets) for training and 25% (126 datasets) for testing. The following conclusions have been drawn from the present study:

  • The GT shows that the most significant input parameters for predicting the SDR (ds/y) around the bridge pier are the pier width b, y, V, Vc, d50, σg, and t, so it is recommended to use these input parameters when modelling temporal clear water scour depth within the given range of datasets.

  • For CWS, the XGBoost model (present model) predicted the SDR (ds/y) better than the SVM–PSO model (present model), with an R2 value of more than 0.96, a MAPE value of less than 13.0%, and an RMSE of less than 0.030; hence, it was concluded that the XGBoost ML model better predicts the SDR under clear water scouring conditions for unsteady flow.

  • Among the existing empirical models under the CWS condition, the model by Oliveto & Hager (2002) predicted SDR values up to 1.6 better than the models by Lanca et al. (2013), Franzetti et al. (2022), and Tang et al. (2023). Hence, the empirical model by Oliveto & Hager (2002) can be utilized to predict the SDR under temporal conditions when the SDR value is up to 1.6.

  • For the 0.5 < b/y ≤ 1.5, 0.50 < V/Vc ≤ 0.75, and 0.2 < Fr ≤ 0.4 ranges, the XGBoost ML approach typically gave better results for the selected dataset ranges than the SVM–PSO model and the other existing scour depth predictive models, with a higher R2 value, i.e., 0.82, and lower MAPE and RMSE values, i.e., 26.68% and 0.31, respectively.

  • The effectiveness of empirical approaches is often compromised by the diversity of datasets gathered from various researchers, resulting in inconsistencies and potential differences. Furthermore, the restricted applicability of empirical formulas within certain parameters impedes accurate prediction and broader application.

It is important to note that the present ML approach is limited by the diversity of the datasets utilized in the modelling process. Improved results can be achieved when the input parameter values fall within the ranges specified in Table 1. To improve the reliability of the model, a wider range of field-based temporal clear water scouring datasets needs to be collected, considering the influence of different non-circular bridge pier shapes.

All relevant data are included in the paper or its Supplementary Information.

The authors declare that there are no conflicts of interest.

Agalbjorn
S.
,
Koncar
N.
&
Jones
A. J.
(
1997
)
A note on the gamma test
,
Neural Comput. Appl.
,
5
(
3
),
131
133
.
Aksoy
A. O.
,
Bombar
G.
,
Arkis
T.
&
Guney
M. S.
(
2017
)
Study of the time-dependent clear water scour around circular bridge piers
,
J. Hydrol. Hydromech.
,
65
(
1
),
26
34
.
https://doi.org/10.1515/johh-2016-0048
.
Asteris
P. G.
,
Skentou
A. D.
,
Bardhan
A.
,
Samui
P.
&
Pilakoutas
K.
(
2021
)
Predicting concrete compressive strength using hybrid ensembling of surrogate machine learning models
,
Cem. Concr. Res.
,
145
,
106449
.
Baranwal
A.
&
Das
B. S.
(
2023
)
Clear-water and live-bed scour depth modelling around bridge pier using support vector machine
,
Can. J. Civ. Eng.
,
50
(
6
),
445
463
.
https://doi.org/10.1139/cjce-2022-0237
.
Baranwal
A.
,
Shankar Das
B.
&
Setia
B.
(
2023
)
A comparative study of scour around various shaped bridge pier
,
Eng. Res. Express
,
5
,
1
.
https://doi.org/10.1088/2631-8695/acbfa1
.
Bardhan
A.
,
Samui
P.
,
Ghosh
K.
,
Gandomi
A. H.
&
Bhattacharyya
S.
(
2021
)
ELM-based adaptive neuro swarm intelligence techniques for predicting the California bearing ratio of soils in soaked conditions
,
Appl. Soft Comput.
,
110
,
107595
.
Bateni
S. M.
,
Borghei
S. M.
&
Jeng
D. S.
(
2007
)
Neural network and neuro-fuzzy assessments for scour depth around bridge piers
,
Eng. Appl. Artif. Intell.
,
20
(
3
),
401
414
.
https://doi.org/10.1016/j.engappai.2006.06.012.
Chang
W. Y.
,
Lai
J. S.
&
Yen
C. L.
(
2004
)
Evolution of scour depth at circular bridge piers
,
J. Hydraul. Eng.
,
130
(
9
),
905
913
.
https://doi.org/10.1061/(asce)0733-9429(2004)130:9(905)
.
Chen
T.
&
Guestrin
C.
(
2016
) ‘
Xgboost: A scalable tree boosting system
’,
Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining
, pp.
785
794
.
Choudhary
A.
,
Das
B. S.
,
Devi
K.
&
Khuntia
J. R.
(
2023
)
ANFIS- and GEP-based model for prediction of scour depth around bridge pier in clear-water scouring and live-bed scouring conditions
,
J. Hydroinform.
,
25
(
3
),
1004
1028
.
https://doi.org/10.2166/hydro.2023.212
.
Cui
Y.
,
Lam
W. H.
,
Zhang
T.
,
Sun
C.
,
Robinson
D.
&
Hamill
G.
(
2019
)
Temporal model for ship twin-propeller jet induced sandbed scour
,
J. Mar. Sci. Eng.
,
7
(
10
),
339
.
Durrant
P. J.
(
2001
)
winGamma: A Non-Linear Data Analysis and Modelling Tool with Applications to Flood Prediction
.
Wales, UK
:
Cardiff University
.
Etemad-Shahidi
A.
,
Bonakdar
L.
&
Jeng
D. S.
(
2015
)
Estimation of scour depth around circular piers: Applications of model tree
,
J. Hydroinform.
,
17
(
2
),
226
238
.
https://doi.org/10.2166/hydro.2014.151
.
Ettema
R.
,
Mostafa
E. A.
,
Melville
B. W.
&
Yassin
A. A.
(
1998
)
Local scour at skewed piers
,
J. Hydraul. Eng. - ASCE
,
124
(
7
),
756
759
.
Fael
C.
,
Lanca
R.
&
Cardoso
A.
(
2016
)
Effect of pier shape and pier alignment on the equilibrium scour depth at single piers
,
Int. J. Sediment Res.
,
31
(
3
),
244
250
.
https://doi.org/10.1016/j.ijsrc.2016.04.001
.
Firat
M.
(
2009
)
Scour depth prediction at bridge piers by anfis approach
,
Proc. Inst. Civ. Eng. Water Manag.
,
162
(
4
),
279
288
.
https://doi.org/10.1680/wama.2009.00061
.
Firat
M.
&
Gungor
M.
(
2009
)
Generalized regression neural networks and feed forward neural networks for prediction of scour depth around bridge piers
,
Adv. Eng. Softw.
,
40
(
8
),
731
737
.
https://doi.org/10.1016/j.advengsoft.2008.12.001
.
Franzetti
S.
,
Radice
A.
,
Rebai
D.
&
Ballio
F.
(
2022
)
Clear water scour at circular piers: A new formula fitting laboratory data with less than 25% deviation
,
J. Hydraul. Eng.
,
148
(
10
),
1
13
.
https://doi.org/10.1061/(asce)hy.1943-7900.0002009
.
Grimaldi
C.
(
2005
)
Non-conventional Countermeasures Against Local Scouring at Bridge Piers
.
Cosenza, Italy
:
D. Universita'della Calabria
.
Hadavimoghaddam
F.
,
Ostadhassan
M.
,
Sadri
M. A.
,
Bondarenko
T.
,
Chebyshev
I.
&
Semnani
A.
(
2021
)
Prediction of water saturation from well log data by machine learning algorithms: Boosting and super learner
,
J. Mar. Sci. Eng.
,
9
(
6
),
666
.
Harish
N.
,
Mandal
S.
,
Rao
S.
&
Patil
S. G.
(
2015
)
Particle swarm optimization based support vector machine for damage level prediction of non-reshaped berm breakwater
,
Appl. Soft Comput.
,
27
,
313
321
.
https://doi.org/10.1016/j.asoc.2014.10.041
.
Hsu
K. L.
,
Gupta
H. V.
&
Sorooshian
S.
(
1995
)
Artificial neural network modeling of the rainfall-runoff process
,
Water Resources Research
,
31
(
10
),
2517
2530
.
Jitchaijaroen
W.
,
Keawsawasvong
S.
,
Wipulanusat
W.
,
Kumar
D. R.
,
Jamsawang
P.
&
Sunkpho
J.
(
2024
)
Machine learning approaches for stability prediction of rectangular tunnels in natural clays based on MLP and RBF neural networks
,
Intell. Syst. Appl.
,
21
,
200329
.
Kennedy
J.
&
Eberhart
R.
(
1995
)
Particle swarm optimization (PSO)
,
Proc. IEEE Int. Conf. Neural Networks, Perth, Aust.
,
4
(
1
),
1942
1948
.
Kothyari
U. C.
,
Garde
R. J.
&
Ranga Raju
K. G.
(
1992
)
Temporal variation of scour around circular bridge piers
,
J. Hydraul. Eng.
,
16
(
8
),
35
48
.
https://doi.org/10.1080/09715010.2010.10515014
.
Kumar A., Baranwal A. & Das B. S. (2023a) Modelling of clear water scour depth around bridge piers using M5 tree and ANN-PSO, AQUA – Water Infrastructure, Ecosyst. Soc., 72(8), 1386–1403. https://doi.org/10.2166/aqua.2023.225
Kumar V., Baranwal A. & Das B. S. (2023b) Prediction of local scour depth around bridge piers: Modelling based on machine learning approaches, Eng. Res. Express, 6(1), 015009. https://doi.org/10.1088/2631-8695/ad08ff
Kumar M., Samui P., Kumar D. R. & Asteris P. G. (2024a) State-of-the-art XGBoost, RF and DNN based soft-computing models for PGPN piles, Geomech. Geoengin., 19(6), 975–990.
Kumar S., Oliveto G., Deshpande V., Agarwal M. & Rathnayake U. (2024b) Forecasting of time-dependent scour depth based on bagging and boosting machine learning approaches, J. Hydroinform., 26(8), 1906–1928.
Lanca R. M., Fael C. S., Maia R. J., Pêgo J. P. & Cardoso A. H. (2013) Clear-water scour at comparatively large cylindrical piers, J. Hydraul. Eng., 139(11), 1117–1125. https://doi.org/10.1061/(ASCE)HY.1943
Lee S. O. & Sturm T. W. (2009) Effect of sediment size scaling on physical modeling of bridge pier scour, J. Hydraul. Eng., 135(10), 793–802. https://doi.org/10.1061/(asce)hy.1943-7900.0000091
López G., Teixeira L., Ortega-Sánchez M. & Simarro G. (2014) Estimating final scour depth under clear-water flood waves, J. Hydraul. Eng., 140(3), 328–332. https://doi.org/10.1061/(ASCE)HY.1943-7900.0000804
Melville B. W. & Chiew Y. M. (1999) Time scale for local scour at bridge piers, J. Hydraul. Eng. - ASCE, 125(2010), 59–65.
Mia F. & Nago H. (2003) Design method of time-dependent local scour at circular bridge pier, J. Hydraul. Eng., 126(6), 420–427.
Mohammadpour R., Ab. Ghani A., Zakaria N. A. & Mohammed Ali T. A. (2017, February) Predicting scour at river bridge abutments over time, Proceedings of the Institution of Civil Engineers-Water Management, 170(1), 15–30. Thomas Telford Ltd.
Molinas A. (2004) Bridge Scour in Nonuniform Sediment Mixtures and in Cohesive Materials: Synthesis Report. No. FHWA-RD-03-083. McLean, VA: Department of Transportation, Federal Highway Administration.
Nandi B. & Das S. (2023) Equation for time-dependent local scour at pier-like structures with eccentric in-line arrangements, Proceedings of the Institution of Civil Engineers-Water Management, 1–14. Emerald Publishing Limited.
Oğuz K. & Bor A. (2021) Prediction of local scour around bridge piers using hierarchical clustering and adaptive genetic programming, Appl. Artif. Intell., 36(1), 2001734. https://doi.org/10.1080/08839514.2021.2001734
Oliveto G. & Hager W. H. (2002) Temporal evolution of clear-water pier and abutment scour, J. Hydraul. Eng., 128(9), 811–820. https://doi.org/10.1061/(asce)0733-9429(2002)128:9(811)
Pal K. & Patel B. V. (2020) Data classification with k-fold cross validation and holdout accuracy estimation methods with 5 different machine learning techniques, 2020 4th International Conference on Computing Methodologies and Communication (ICCMC), 83–87. IEEE.
Pandey M., Sharma P. K., Ahmad Z. & Karna N. (2018) Maximum scour depth around bridge pier in gravel bed streams, Nat. Hazards, 91(2), 819–836. https://doi.org/10.1007/s11069-017-3157-z
Pandey M., Oliveto G., Pu J. H., Sharma P. K. & Ojha C. S. P. (2020) Pier scour prediction in non-uniform gravel beds, Water (Switzerland), 12(6), 1696. https://doi.org/10.3390/W12061696
Shariati M., Mafipour M. S., Mehrabi P., Bahadori A., Zandi Y., Salih M. N. A., Nguyen H., Dou J., Song X. & Poi-Ngian S. (2019) Application of a hybrid artificial neural model in behavior prediction of channel shear connectors embedded in normal and, Appl. Sci., 9, 5534.
Sheppard D. M., Odeh M. & Glasser T. (2004) Large scale clear-water local pier scour experiments, J. Hydraul. Eng., 130(10), 957–963. https://doi.org/10.1061/(asce)0733-9429(2004)130:10(957)
Tang H., Liu Q., Zhou J., Guan D., Yuan S., Tang L. & Zhang H. (2023) Process-based design method for pier local scour depth under clear-water condition, J. Hydraul. Eng., 149(4), 1–11. https://doi.org/10.1061/JHEND8.HYENG-13371
Tao H., Habib M., Aljarah I., Faris H., Afan H. A. & Yaseen Z. M. (2021) An intelligent evolutionary extreme gradient boosting algorithm development for modeling scour depths under submerged weir, Inf. Sci., 570, 172–184.
Yang Y., Melville B. W., Macky G. H. & Shamseldin A. Y. (2020) Experimental study on local scour at complex bridge pier under combined waves and current, Coast. Eng., 160, 103730. https://doi.org/10.1016/j.coastaleng.2020.103730
Yanmaz A. M. & Altinbilek H. D. (1991) Study of time-dependent local scour around bridge piers, J. Hydraul. Eng., 117(10), 1247–1268. https://doi.org/10.1061/(asce)0733-9429(1991)117:10(1247)
Yoon H. (2021) Finding unexpected test accuracy by cross validation in machine learning, Int. J. Comput. Sci. Netw. Secur., 21(12spc), 549–555.
Yousefpour N. & Wang B. (2024) Introducing a physics-informed deep learning framework for bridge scour prediction, arXiv preprint arXiv:2407.01258. https://doi.org/10.48550/arXiv.2407.01258
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-ND 4.0), which permits copying and redistribution with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nd/4.0/).
