ABSTRACT
Scouring around a bridge pier involves removing sediment from the riverbed and banks due to water flow. This paper employs eXtreme Gradient Boosting (XGBoost) and support vector machine with particle swarm optimization (SVM-PSO) machine learning (ML) approaches to model the temporal local scour depth around bridge piers under clear water scouring (CWS) conditions. CWS datasets, incorporating bridge pier geometry, flow characteristics, and sediment properties, are collected from existing literature. Five non-dimensional influencing parameters, such as ratio of pier width to flow depth (b/y), ratio of approach mean velocity to critical velocity (V/Vc), ratio of mean sediment size to pier width (d50/b), Froude number (Fr), and standard deviation of sediment (σg), are chosen as input parameters. XGBoost and SVM-PSO ML models demonstrate superior predictive capabilities, achieving coefficient of determination (R2) values exceeding 0.90 and mean absolute percentage error (MAPE) and root mean square error (RMSE) values less than 17.07% and 0.0341, respectively. Comparison with the previous four empirical models based on statistical indices reveals that the proposed XGBoost model outperforms SVM-PSO and empirical models in predicting scour depth, so it is recommended for estimating clear water scour depth under varying temporal conditions within the specified dataset range.
HIGHLIGHTS
A wide range of CWS datasets under temporal conditions is collected to develop the model.
The best combination of input parameters affecting the clear water scour depth is selected using the gamma test.
A model is built to predict the CWS depth using the eXtreme Gradient Boosting (XGBoost) and support vector machine (SVM)–particle swarm optimization (PSO) machine learning (ML) approaches.
The XGBoost ML approach is recommended based on the statistical indices, achieving more than 90% accuracy.
INTRODUCTION
Scouring is a process by which the abrasive force of water flow removes bed sediment in rivers. Minimizing the scouring phenomenon around bridge piers and foundations is crucial in the design and construction of bridges, as failure to do so has caused bridge collapses. Ensuring the safety and cost-effectiveness of bridge construction necessitates accurately estimating scour depth around the bridge pier. Determining the scour depth involves evaluating the scour phenomenon around bridge piers. The presence of bridge piers obstructs the flow route, resulting in the erosion of bed sediment and subsequent local scouring around the piers. According to Kothyari et al. (1992), this obstruction accelerates the flow and induces vortices that remove sediment from the vicinity of the bridge pier (Baranwal et al. 2023; Baranwal & Das 2023). Scour at bridge piers can take considerable time to reach equilibrium, especially when the flow velocity is close to the critical velocity required for the initial motion of sediment particles. Ettema et al. (1998) have observed that the temporal evolution of clear water scouring follows an asymptotic trend from zero to maximum scour depth, typically characterized by three phases: initial, primary, and equilibrium periods. Hydraulic engineers encounter significant challenges when excessive local scouring during floods leads to bridge collapses. The intricate scouring phenomena at bridge sites result from various factors, including debris flow, human activities, alterations in flow patterns around bridge structures, and localized scouring in conjunction with general riverbed sediment movement. Excessive scour depth can lead to undermining shallow foundations, exposing them to flow, which poses a risk to bridge safety. Conversely, overestimating scour depth could result in unnecessary project costs.
Numerous studies have been conducted to estimate scour depth around bridge piers (Sheppard et al. 2004; Lee & Sturm 2009; Pandey et al. 2018; Baranwal & Das 2023; Baranwal & Das 2024), and the influence of flow variability on bridge pier scour has been examined (Kothyari et al. 1992; Chang et al. 2004). This research has predominantly focused on scouring around piers with uniform cross-sections under temporal conditions. The significance of temporal variations in scour depth around such uniform piers has been investigated by Yanmaz & Altinbilek (1991), Kothyari et al. (1992), Melville & Chiew (1999), and Oliveto & Hager (2002). However, fewer studies exist on time-based differences in scour around rounded bridge piers. In recent years, various unconventional methods have been proposed to model bridge pier scouring. For instance, a study by Bateni et al. (2007) utilized neural networks and neuro-fuzzy evaluations to model the complex scouring near bridge piers. It was observed that pier width (b) significantly influences equilibrium scour depth compared to other input parameters. The generalized regression neural network (GRNN) model demonstrated superior performance in modelling scour depth and highlighted the effectiveness of the GRNN model in capturing the complexities of scour processes. However, it does not offer specific insights into its applicability under distinct flow regimes (Firat 2009; Firat & Gungor 2009). Support vector regression (SVR) tuned with radial basis function has shown superiority in predicting scour depth and identifying flow depth (y) and pier width (b) as crucial influencing factors for predicting scour depth around bridge piers. Artificial neural network (ANN) models exhibited the highest level of accuracy in predicting scour depth under both clear water scouring (CWS) and live bed scouring (LBS) conditions (Baranwal & Das 2023). 
However, the same level of performance was not observed when applied to field data, suggesting that ANN models may not be as effective in real-world scenarios. Selecting a diverse range of datasets is necessary to improve the accuracy of scour depth predictive models (Etemad-Shahidi et al. 2015). Additionally, a genetic function-based model outperformed both ANN and regression-based models, specifically in predicting clear water scour depth.
Particle swarm optimization (PSO) has demonstrated superior performance to other scour depth models, particularly for CWS datasets. It has identified parameters such as the ratio of pier width to flow depth (b/y) and mean sediment size to flow depth (d50/y) as the most influential factors for accurately predicting scour depth in laboratory and field data. The hybrid model combining ANN with PSO (ANN–PSO) has outperformed models utilizing the Levenberg–Marquardt algorithm (ANN–LM) and the firefly algorithm (ANN–FA) in forecasting equilibrium scour depth under CWS conditions. Furthermore, the ANN–PSO hybrid model shows promising potential for predicting improved sediment removal effectiveness in discharge scenarios (Shariati et al. 2019). The adaptive neuro-fuzzy inference system model emerges as the superior predictive model for both CWS and LBS conditions compared to the gene expression programming model and previous empirical scour depth prediction models (Choudhary et al. 2023). A scour depth prediction model for CWS and LBS conditions around bridge piers has been developed using support vector machine (SVM) techniques, yielding satisfactory results (Baranwal & Das 2023). Eini et al. (2024) modelled the scour depth by combining Bayesian optimization (BO) with SVM and eXtreme Gradient Boosting (XGBoost) and reported that BO–XGBoost performed best, with lower root mean square error (RMSE) and mean absolute percentage error (MAPE) and higher R2 values.
A comprehensive literature review highlighted the effectiveness of XGBoost and SVM–PSO machine learning (ML) approaches, which have been utilized in the different domains of water resource engineering and demonstrated exceptional performance in predicting scour depth, outperforming other ML models (Tao et al. 2021; Chang et al. 2023; Kumar et al. 2024a, b). The temporal scour depth modelling literature indicates that only a relatively limited range of datasets has been used to develop prediction models under CWS conditions. Moreover, an accurate empirical temporal CWS-based scour depth prediction model is lacking. Additionally, no ML model can yield a better scour depth ratio (SDR) for the chosen datasets. Many ML models are adequate for laboratory experimental datasets, yet they often underperform in field conditions. Consequently, this research addresses a notable gap in the existing literature by employing the XGBoost and SVM–PSO ML techniques to predict scour depth in CWS conditions. Selecting a boosting model, such as XGBoost, is advantageous because it can model complex, non-linear relationships without the need for predefined interactions. The SVM–PSO model can reduce overfitting and enhance prediction accuracy. Despite the substantial implications for hydraulic engineering, there has been limited exploration of utilizing XGBoost and SVM–PSO ML techniques for predicting scour depth under CWS conditions. By employing both ML techniques, this research provides a more reliable, robust, and flexible model for predicting temporal scour depth under clear water scouring.
The present study aims to address these gaps and to provide insights for estimating the temporal SDR in CWS conditions. The scarcity of research in this domain underscores the significance of the present investigation, which contributes to the understanding and prediction of scour depth in field conditions. This study outlines the crucial parameters related to geometry, roughness, and flow characteristics. It collects a wide range of previously published experimental and field temporal datasets to develop a scour depth prediction model for CWS conditions using XGBoost and SVM–PSO ML techniques. The gamma test (GT) has been performed to identify the best input parameter combination. The efficacy of both ML techniques has been assessed against existing empirical models, and their performance has been compared. The better-performing ML model, capable of predicting temporal scour depth under CWS conditions across a wide range of input parameters, is then recommended to help mitigate the risk of bridge pier failure, potentially saving lives and reducing financial burdens on the country.
MATERIALS AND METHODS
Selecting significant input parameters under temporal conditions for modelling scour depth
Dataset collection
Field or experimental datasets are essential for developing ML models, such as XGBoost and SVM–PSO, that predict local scour depth. No new experiments were performed in this study; instead, previously published datasets were collected from various sources, as shown in Table 1. The experimental data were obtained mainly for clear water scouring (CWS) conditions, i.e., when the ratio of approach mean velocity to critical velocity (V/Vc) is less than or equal to 1.0. In the present study, a total of 501 datasets were collected from previous studies, of which 75% were utilized for training purposes and the remaining 25% for testing purposes. The XGBoost and SVM–PSO ML models draw on six candidate non-dimensional input parameters (b/y, V/Vc, Fr, d50/b, σg, and Vt/b) and one output parameter, the SDR, represented as ds/y.
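The clear water criterion and the 75%/25% partition described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the random seed, and the use of integer placeholders in place of real records are all assumptions. Because 75% of 501 is not a whole number, the sketch rounds to 376 training and 125 testing records; the exact counts used in the study are not stated.

```python
import random


def is_clear_water(v_over_vc):
    """Clear water scour criterion quoted in the text: V/Vc <= 1.0."""
    return v_over_vc <= 1.0


def split_train_test(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and split it into training and testing lists.

    The seed is a hypothetical choice made only to keep the sketch reproducible.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Integer placeholders standing in for the 501 collected data points.
records = list(range(501))
train, test = split_train_test(records)
print(len(train), len(test))  # 376 125
```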
Table 1. Various clear water scouring (CWS) parameters collected from previous publications
Authors and years | b/y | V/Vc | Fr | d50/b | σg | Vt/b | ds/y |
---|---|---|---|---|---|---|---|
Yanmaz & Altinbilek (1991) | 0.280–1.480 | 0.430–0.770 | 0.250–0.290 | 0.010–0.020 | 1.130–1.280 | 3.00–6.00 | 0.136–0.231 |
Melville & Chiew (1999) | 0.050–5.080 | 0.400–1.000 | 0.120–0.880 | 0.004–0.080 | 1.010–3.150 | 3.33–250.50 | 0.051–0.576 |
Mia & Nago (2003) | 0.140–1.500 | 0.700–0.900 | 0.120–0.400 | 0.002–0.060 | 1.200–1.290 | 2.33–313.63 | 0.080–0.317 |
Chang et al. (2004) | 5.000–6.666 | 0.805–0.845 | 0.592–0.632 | 0.007–0.010 | 1.200–3.000 | 19.00–56.00 | 0.202–0.950 |
Molinas (2004) | 0.089–1.275 | 0.205–0.999 | 0.110–0.724 | 0.003–0.095 | 1.150–3.700 | 8.00–30.00 | 0.005–0.214 |
Sheppard et al. (2004) | 0.086–5.352 | 0.609–0.970 | 0.071–0.387 | 0.001–0.007 | 1.210–1.510 | 18.83–616.00 | 0.066–0.642 |
Grimaldi (2005) | 0.360–0.500 | 1.000–1.000 | 0.210–0.330 | 0.005–0.017 | 1.440–1.460 | 96.00–148.35 | 0.131–0.187 |
Lanca et al. (2013) | 0.200–2.000 | 0.278–0.347 | 0.259–0.317 | 0.002–0.017 | 1.360–1.360 | 168.00–330.00 | 0.121–0.549 |
López et al. (2014) | 0.250–0.810 | 0.500–0.860 | 0.165–0.336 | 0.017–0.018 | 1.320 | 2.00–145.20 | 0.080–0.189 |
Fael et al. (2016) | 0.250–1.000 | 0.960 | 0.220 | 0.004–0.017 | 1.360 | 150.00–316.32 | 0.143–0.348 |
Aksoy et al. (2017) | 0.213–0.227 | 0.450–0.560 | 0.259–0.326 | 0.017–0.086 | 1.390 | 6.68 | 0.075–0.149 |
Pandey et al. (2020) | 0.550–0.700 | 0.640–0.900 | 0.138–0.203 | 0.003–0.004 | 1.220 | 13.53–17.50 | 0.106–0.230 |
Yang et al. (2020) | 0.135–0.477 | 0.530 | 0.099–0.099 | 0.004–0.013 | 1.300 | 0.07–12.41 | 0.057–0.105 |
Different temporal clear water scour depth predictive equations
The temporal variation of scour depth around the bridge pier was assessed through numerical simulations and experimental investigations, focusing predominantly on laboratory-based studies. Numerous researchers have developed formulas to estimate time-based scour depth around rounded bridge piers, considering a wide range of flow characteristics and bed compositions (Oliveto & Hager 2002; Lanca et al. 2013; Franzetti et al. 2022; Tang et al. 2023). All these empirical SDR predictive models are extensively utilized to estimate scour depth around bridge piers under temporal conditions. Oliveto & Hager (2002) proposed an empirical equation for time-dependent or variable flow conditions, indicating scour depth variations over time. Lanca et al. (2013) formulated a scour depth prediction model incorporating flow depth (y), pier width (b), and mean sediment size (d50), considering various ratios of y/b and d50/b. Similarly, Franzetti et al. (2022) and Tang et al. (2023) presented scour prediction models based on y, b, flow velocity (V), and d50. The variables used in the empirical models highlight the key input parameters, namely the geometrical parameter (b), flow parameters (y, V), bed roughness parameters (d50, σg, Vc, Fr), and temporal parameter (t), that are essential for feature selection in ML (Kumar et al. 2024a, b). Understanding scour dynamics from these empirical models informs the conceptual framework for input–output relationships in the XGBoost and SVM–PSO approaches (Shafagh Loron et al. 2023). While empirical models provide a base, their limitations necessitate integrating advanced data-driven techniques to enhance predictive accuracy and adapt to temporal changes in scour depth. The input–output relationships established in the empirical equations also guided the understanding of how the various input factors interact to influence scour depth. This conceptual framework helped to design and present XGBoost and SVM–PSO ML models to predict scour depth more accurately.
All four empirical models draw on sediment transport theory, mainly the volumetric rate of sediment transport and the critical bed-shear stress acting on sediment particles, to establish empirical relationships for scour depth over time (Mohammadpour et al. 2017; Wang et al. 2024). The Oliveto & Hager (2002) model is primarily empirical, relying on laboratory data to derive relationships between variables, which often results in straightforward models. The model may not perform well under extreme conditions or with highly variable sediment compositions, leading to potential inaccuracies in scour depth predictions. It depends on empirical constants that may limit its adaptability to diverse environmental conditions. The Lanca et al. (2013) model emphasizes the significance of flow depth and sediment size ratios in its predictive equations and provides accurate scour predictions by including different equations based on the ratio b/d50, allowing flexibility in application across various sediment sizes. The model tends to under-predict scour depth for specific ranges, particularly when b/d50 values exceed 500.0, as indicated by the present study, and its complexity may complicate implementation in practical scenarios without adequate calibration. Lanca et al. (2013) and Franzetti et al. (2022) adopt a semi-analytical approach that combines empirical fitting with fundamental physical principles, addressing scale effects (Cui et al. 2019; Wang et al. 2024); integrating empirical data with theoretical frameworks improves long-term predictions and addresses uncertainties inherent in purely empirical models (Nandi & Das 2023). A limitation of the Franzetti et al. (2022) model is that it under-predicts scour depth in the range of 0.3–2.4, suggesting limited applicability to the present range of datasets. The complexity of the model may also hinder its usability in real-time applications without significant computational resources.
Tang et al.’s (2023) model introduces a fresh perspective by integrating sediment density differences and flow characteristics, which may improve predictions in specific scenarios. It has the potential to capture the effects of varying sediment compositions and flow conditions. For the present range of datasets, this empirical model revealed that it often over-predicts scour depth in the range of 0.2–1.2. The dependency of the model on multiple parameters may complicate its calibration and application in diverse domains of scour depth prediction. Franzetti et al. (2022) and Tang et al. (2023) refined their models by incorporating flow velocity and sediment gradation, enhancing predictive accuracy. The overall limitation of all empirical scour depth prediction models is that they may lack the robustness needed for long-term predictions, potentially leading to underestimations of scour depth in real-world applications. Despite advancements, challenges remain in accurately predicting scour depth due to the complex interactions of hydraulic phenomena and sediment transport, necessitating ongoing research and model refinement (Tang et al. 2023). The effectiveness of empirical approaches is often compromised by the diversity of datasets gathered from various researchers, resulting in inconsistencies and potential differences. Furthermore, the restricted applicability of empirical formulas within certain parameters impedes accurate prediction and broader application.
METHODOLOGY
Gamma test
K-fold cross-validation techniques
The K-fold cross-validation technique is commonly employed to address overfitting issues in ML and to evaluate the generalization capabilities of various algorithms (Saha et al. 2021; Yoon 2021). This approach segments the data into k portions or folds (Pal & Patel 2020), with one fold serving as the test set and the remaining folds used for model training. Overfitting is a prevalent issue in computational modelling (Asteris et al. 2021), where an ML model may accurately predict outcomes for a specific dataset during training and testing but produce unreliable results when applied to datasets from different experimental conditions. Consequently, validating predictive models using additional datasets is essential to ensure their reliability (Bardhan et al. 2021). For this purpose, four-fold cross-validation (Kumar et al. 2024a, b) was implemented to identify the most suitable testing datasets. This process divided the main dataset into four equal segments (CV-1–CV-4) (Hadavimoghaddam et al. 2021). In each run, one segment was designated as the testing dataset, while the remaining segments were used as the training dataset, which minimizes the risk of overfitting to a particular training set and maximizes the utilization of the entire dataset. The primary goal of the cross-validation was to optimize the model construction for enhanced prediction accuracy during the validation stage.
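The four-fold partition described above can be illustrated with a short sketch. It only shows how indices might be divided into CV-1 to CV-4; the function name, the shuffling step, and the seed are assumptions, and the study's actual fold assignment is not given. With 501 records the folds cannot be exactly equal, so one fold holds 126 records and the others 125.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_idx, test_idx) pairs for k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical seed, for reproducibility only
    # Spread the remainder so that fold sizes differ by at most one record.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx


folds = list(k_fold_indices(501, k=4))
for name, (train_idx, test_idx) in zip(["CV-1", "CV-2", "CV-3", "CV-4"], folds):
    print(name, len(train_idx), len(test_idx))
```

Each record appears in exactly one testing fold, so every data point is used for both training and testing across the four runs.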
Normalization of datasets
Normalization has been used to improve the accuracy and speed of the ML approaches by scaling the input and output data to the range 0.05–0.95. In the normalization expression, anorm is the normalized value, a is the original value, amin is the minimum value, and amax is the maximum value of the variable.
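A common way to map a variable onto 0.05–0.95 is the affine min–max form anorm = 0.05 + 0.90 (a − amin)/(amax − amin). The sketch below assumes that form, which is consistent with the symbols defined above but is not confirmed as the exact expression used in the study; the function names are hypothetical.

```python
def normalize(values, lower=0.05, upper=0.95):
    """Min-max scale a list of numbers onto [lower, upper]."""
    a_min, a_max = min(values), max(values)
    if a_max == a_min:
        raise ValueError("cannot normalize a constant series")
    scale = (upper - lower) / (a_max - a_min)
    return [lower + scale * (a - a_min) for a in values]


def denormalize(norm_values, a_min, a_max, lower=0.05, upper=0.95):
    """Invert normalize() so that predictions can be reported in original units."""
    scale = (a_max - a_min) / (upper - lower)
    return [a_min + scale * (v - lower) for v in norm_values]


# Example with the b/y limits reported for one source in Table 1 (0.050 and 5.080).
print(normalize([0.05, 2.565, 5.08]))
```

Keeping the limits away from 0 and 1 leaves a small margin at both ends of the range, which is presumably why 0.05 and 0.95 were chosen.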
XGBoost ML approaches
A flow chart of the local scour depth prediction model based on XGBoost ML approaches.
SVM–PSO ML approaches
A flow chart of the local scour depth prediction model based on SVM–PSO approaches.
The XGBoost ML model effectively identifies influential variables, such as time scale and sediment characteristics, and its high performance in structured data and its ability to capture complex relationships through an ensemble of decision trees are crucial for accurate predictions (Kumar et al. 2024a, b). The SVM–PSO ML approach utilizes extensive datasets, and it was chosen for its robustness in high-dimensional spaces and enhanced parameter tuning via PSO, which enhances its prediction reliability by improving accuracy in complex scenarios (Nandi et al. 2024). Hence, while XGBoost and SVM–PSO provide advanced predictive capabilities, the reliance on data-driven models may overlook the physical principles governing scour processes, potentially leading to less interpretable results compared to hybrid models that integrate physics-based insights (Yousefpour & Wang 2024).
Statistical indices
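The three indices used throughout the results (MAPE, RMSE, and R2) can be sketched as follows. The formulas are the standard textbook definitions and are assumed here rather than taken from the study; in particular, R2 is computed as one minus the ratio of residual to total sum of squares, whereas some studies use the squared correlation coefficient instead.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up SDR values, for illustration only.
obs = [0.10, 0.20, 0.40]
pred = [0.11, 0.18, 0.40]
print(mape(obs, pred), rmse(obs, pred), r2(obs, pred))
```

Lower MAPE and RMSE and a higher R2 indicate a better fit, which is the ranking logic applied to the result tables.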
RESULTS AND DISCUSSIONS
The performance of the two selected ML approaches, i.e., XGBoost and SVM–PSO, is compared to check their reliability and accuracy in predicting the temporal clear water scour depth.
Result obtained from the GT and k-fold cross-validation techniques
During the GT, a series of 26 experiments were conducted, and the top 10 are outlined in Table 2. The combination of five input parameters with the mask 110111 yields a more effective model than the other combinations, as evidenced by its gamma and V-ratio values, which are the closest to zero (Table 2). This configuration, comprising five distinct non-dimensional parameters, was employed in the XGBoost and SVM–PSO methodologies to develop a temporal model for clear water scour depth. The evaluation of various temporal clear water scour depth models using the GT is presented in Appendix Table A1.
Table 2. Various combinations of input parameters related to CWS determined from the GT
S. No. | Combination of input parameters | Mask | Gamma value | Gradient | Standard error | V-ratio | Nearest neighbour |
---|---|---|---|---|---|---|---|
1 | (b/y, V/Vc, d50/b, σg, Vt/b) | 110111 | −0.0001 | 0.2663 | 0.0037 | −0.0004 | 8 |
2 | (b/y, Fr, d50/b, σg, Vt/b) | 101111 | −0.0115 | 0.4053 | 0.0083 | −0.0461 | 10 |
3 | (b/y, V/Vc, Fr, σg) | 111010 | −0.0118 | 0.4017 | 0.0092 | −0.0469 | 10 |
4 | (b/y, V/Vc, d50/b, Vt/b) | 110101 | 0.0030 | 0.2440 | 0.0056 | 0.0120 | 10 |
5 | (b/y, V/Vc, Fr, σg, Vt/b) | 111011 | 0.0033 | 0.2026 | 0.0059 | 0.0132 | 8 |
6 | (b/y, V/Vc, d50/b, σg, Vt/b) | 110111 | 0.0038 | 0.1915 | 0.0034 | 0.0154 | 12 |
7 | (b/y, V/Vc, Fr, σg, Vt/b) | 111011 | 0.0050 | 0.1858 | 0.0035 | 0.0199 | 10 |
8 | (b/y, V/Vc, Fr, σg) | 111010 | 0.0052 | 0.2305 | 0.0056 | 0.0207 | 12 |
9 | (b/y, V/Vc, Fr) | 111000 | 0.0066 | 0.3233 | 0.0053 | 0.0264 | 8 |
10 | (b/y, V/Vc, d50/b, Vt/b) | 110101 | 0.0073 | 0.1949 | 0.0034 | 0.0290 | 12 |
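The quantities reported in Table 2 can be illustrated with a small sketch of the gamma test in its usual near-neighbour form. For k = 1 to p, the mean squared input distance to the k-th nearest neighbour, δ(k), is paired with half the mean squared difference of the corresponding outputs, γ(k); the intercept of the straight line fitted to the (δ, γ) points is the gamma value, its slope is the gradient, and the V-ratio is the gamma value divided by the output variance. The code is an assumption-laden illustration and not the software used in the study: the parameter ordering behind the mask is inferred from the rows of Table 2, and the synthetic data are placeholders.

```python
import random

# Ordering inferred from Table 2; treat it as an assumption.
PARAMETERS = ["b/y", "V/Vc", "Fr", "d50/b", "sigma_g", "Vt/b"]


def decode_mask(mask):
    """Map a six-digit 0/1 mask onto the candidate inputs (digit separators ignored)."""
    bits = mask.replace(",", "")
    return [name for bit, name in zip(bits, PARAMETERS) if bit == "1"]


def gamma_test(inputs, outputs, p=10):
    """Return (gamma, gradient, v_ratio) from the near-neighbour regression."""
    n = len(inputs)
    if n <= p:
        raise ValueError("need more samples than nearest neighbours")
    deltas, gammas = [0.0] * p, [0.0] * p
    for i in range(n):
        # Squared input distances from point i to every other point, nearest first.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])), j)
            for j in range(n) if j != i
        )[:p]
        for k, (dist_sq, j) in enumerate(neighbours):
            deltas[k] += dist_sq / n
            gammas[k] += 0.5 * (outputs[j] - outputs[i]) ** 2 / n
    # Least-squares line gamma(k) = gradient * delta(k) + intercept.
    mean_d, mean_g = sum(deltas) / p, sum(gammas) / p
    gradient = (sum((d - mean_d) * (g - mean_g) for d, g in zip(deltas, gammas))
                / sum((d - mean_d) ** 2 for d in deltas))
    gamma = mean_g - gradient * mean_d
    mean_y = sum(outputs) / n
    variance = sum((y - mean_y) ** 2 for y in outputs) / n
    return gamma, gradient, gamma / variance


print(decode_mask("110111"))

# Synthetic placeholder data: a smooth function of two inputs plus noise.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(150)]
y = [x1 + x2 + rng.gauss(0.0, 0.1) for x1, x2 in X]
print(gamma_test(X, y))
```

A gamma value and V-ratio close to zero suggest that the selected inputs explain the output with little residual noise, which is the criterion used to prefer the first row of Table 2.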
A bar chart illustrating the outcomes of four-fold cross-validation (CV-1 to CV-4) (determined by R2 value).
Performance of XGBoost ML approaches
In XGBoost modelling, a total of 501 datasets are used, with 75% selected for training and 25% for testing. Six input parameters (b/y, V/Vc, Fr, d50/b, σg, and Vt/b) were used to train and test the model of clear water scour depth under temporal conditions. To check the strength of the XGBoost ML approach, three statistical indices, MAPE, RMSE, and R2, are computed for the training and testing datasets. An error analysis was performed to select the best XGBoost model. The outcomes for scour depth prediction are presented in Table 3, which lists the MAPE, RMSE, and R2 values for the training and testing datasets under different combinations of the learning rate and maximum depth hyperparameters. A learning rate of 0.1 with a maximum depth of 6.0 provides the best testing results (Table 3). The performance of various temporal clear water scour depth models using the XGBoost ML approaches is shown in Appendix Table A2.
Table 3. Training and testing dataset results for different hyperparameters using (b/y, V/Vc, Fr, d50/b, σg, Vt/b) by the XGBoost model
S. No. | Learning rate | Max depth | Training MAPE (%) | Training RMSE | Training R2 | Testing MAPE (%) | Testing RMSE | Testing R2 |
---|---|---|---|---|---|---|---|---|
1 | 0.1 | 6.0 | 1.57 | 0.0051 | 0.95245 | 12.69 | 0.0294 | 0.96777 |
2 | 0.4 | 7.0 | 1.34 | 0.0049 | 0.96192 | 12.84 | 0.0305 | 0.96533 |
3 | 0.3 | 5.0 | 1.46 | 0.0050 | 0.94188 | 13.43 | 0.0305 | 0.96527 |
4 | 0.4 | 9.0 | 1.10 | 0.0047 | 0.94138 | 13.72 | 0.0306 | 0.96520 |
5 | 0.2 | 8.0 | 1.32 | 0.0049 | 0.95618 | 13.06 | 0.0308 | 0.96478 |
6 | 0.1 | 8.0 | 1.45 | 0.0050 | 0.96176 | 13.41 | 0.0308 | 0.96464 |
7 | 0.4 | 10.0 | 1.12 | 0.0047 | 0.95129 | 13.72 | 0.0310 | 0.96428 |
8 | 0.4 | 6.0 | 1.25 | 0.0048 | 0.95168 | 13.85 | 0.0314 | 0.96336 |
9 | 0.4 | 8.0 | 1.22 | 0.0047 | 0.93123 | 14.48 | 0.0324 | 0.96102 |
10 | 0.4 | 5.0 | 1.37 | 0.0048 | 0.95616 | 16.09 | 0.0382 | 0.94577 |
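The role of the learning rate in Table 3 can be illustrated with a stripped-down gradient boosting sketch. This is not XGBoost: it uses depth-one regression trees (stumps) on squared error and omits the regularization and second-order terms that XGBoost adds, and it does not reproduce the maximum depth of 6.0 selected above. All names and the toy data are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single split (feature, threshold) that minimizes squared error."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            l_mean, r_mean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - l_mean) ** 2 for r in left)
                   + sum((r - r_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, thr, l_mean, r_mean)
    return best[1:]


def predict_stump(stump, row):
    f, thr, l_mean, r_mean = stump
    return l_mean if row[f] <= thr else r_mean


def fit_boosting(X, y, n_rounds=50, learning_rate=0.1):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        # The learning rate shrinks each tree's contribution to the ensemble.
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Toy one-feature data, for illustration only.
X = [[i / 39] for i in range(40)]
y = [row[0] ** 2 for row in X]
model = fit_boosting(X, y, n_rounds=100, learning_rate=0.1)
print(predict(model, [0.5]))
```

A smaller learning rate makes each tree contribute less, so more boosting rounds are needed, but the fit tends to be less sensitive to any single tree; deeper trees capture more interactions per round at a higher risk of overfitting.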
Performance of SVM–PSO ML approaches
In SVM–PSO modelling, 75% and 25% of the data are randomly selected for training and testing, respectively. Various combinations of the SVM–PSO parameters, including the swarm size, the cognitive factor (C1), and the social factor (C2), are examined, and the corresponding results are shown in Table 4. The values of C1 and C2 sum to approximately 4.0. In Table 4, the best combination is observed at C1 = 0.3, C2 = 3.7, P = 25.0, C = 132.0, γ = 3.7, ε = 0.6, and d = 3.0. For this combination, the testing values of the statistical indices MAPE, RMSE, and R2 are 17.070 (lowest), 0.03414 (lowest), and 0.94844 (highest), respectively, the best among the selected parameter combinations. The performance of various temporal clear water scour depth models using the SVM–PSO ML approaches is presented in Appendix Table A3.
Table 4. Training and testing data results for different parameters using (b/y, V/Vc, Fr, d50/b, σg, Vt/b) by the SVM–PSO model
S. No. | C1 | C2 | P | C | γ | ε | d | Training MAPE (%) | Training RMSE | Training R2 | Testing MAPE (%) | Testing RMSE | Testing R2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.3 | 3.7 | 25.0 | 132.0 | 3.7 | 0.6 | 3.0 | 7.231 | 0.02085 | 0.949848 | 17.070 | 0.03414 | 0.94844 |
2 | 2.4 | 1.6 | 25.0 | 380.0 | 35.0 | 2.3 | 3.0 | 7.207 | 0.02081 | 0.949849 | 17.075 | 0.03416 | 0.94838 |
3 | 0.8 | 3.6 | 25.0 | 48.0 | 39.0 | 2.6 | 3.0 | 7.110 | 0.02068 | 0.949853 | 17.149 | 0.03417 | 0.94833 |
4 | 0.8 | 3.3 | 25.0 | 749.0 | 76.0 | 9.9 | 3.0 | 7.241 | 0.02087 | 0.949847 | 17.107 | 0.03418 | 0.94830 |
5 | 3.4 | 0.6 | 25.0 | 225.0 | 47.0 | 3.4 | 3.0 | 7.072 | 0.02063 | 0.949854 | 17.153 | 0.03419 | 0.94827 |
6 | 1.3 | 2.8 | 25.0 | 625.0 | 97.0 | 1.8 | 3.0 | 6.806 | 0.02024 | 0.949865 | 17.537 | 0.03479 | 0.94644 |
7 | 1.8 | 2.3 | 25.0 | 375.0 | 24.0 | 9.7 | 3.0 | 9.300 | 0.02581 | 0.949642 | 17.176 | 0.03540 | 0.94456 |
8 | 2.4 | 1.7 | 25.0 | 214.0 | 83.0 | 0.9 | 3.0 | 4.522 | 0.01748 | 0.949925 | 22.398 | 0.03990 | 0.92955 |
9 | 1.0 | 3.0 | 25.0 | 250.0 | 3.0 | 0.3 | 3.0 | 4.525 | 0.01747 | 0.949925 | 22.550 | 0.03997 | 0.92931 |
10 | 0.9 | 3.1 | 25.0 | 253.0 | 4.8 | 1.7 | 3.0 | 4.496 | 0.01746 | 0.949925 | 22.660 | 0.03998 | 0.92928 |
Note. C1, cognitive factor; C2, social factor; d, degree.
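The role of the cognitive factor (C1) and social factor (C2) can be illustrated with a minimal global-best PSO sketch. This is not the authors' implementation: the inertia weight, swarm size, iteration count, and the toy objective are assumptions made for illustration only. In an SVM–PSO setting, the objective would instead return the validation error of an SVM trained with the candidate parameters (for example C, γ, and ε).

```python
import random


def pso_minimize(objective, bounds, swarm_size=20, c1=0.3, c2=3.7,
                 inertia=0.4, iterations=100, seed=0):
    """Minimise `objective` inside box `bounds` with a basic global-best PSO.

    c1 (cognitive) pulls a particle towards its own best position;
    c2 (social) pulls it towards the best position found by the swarm.
    inertia, swarm_size, and iterations are assumed values, not from the paper.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iterations):
        for i in range(swarm_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(p):
    """Hypothetical stand-in for a model's validation error."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best_position, best_value = pso_minimize(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

The defaults c1 = 0.3 and c2 = 3.7 mirror the first row of Table 4; with such a large social factor, particles are drawn strongly towards the swarm's best position.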
Sensitivity analysis of input parameters to model the temporal scour depth
The sensitivity analysis shows how variations in the input parameters (b/y, V/Vc, Fr, d50/b, Vt/b, σg) affect the predicted SDR (ds/y), and thereby identifies the parameters that most strongly control the model output.
The findings illustrated in Figure 5 indicate that, for both the XGBoost and SVM–PSO models, the SDR (ds/y) is influenced by the input parameters in the following decreasing order: b/y, Vt/b, Fr, d50/b, σg, and V/Vc. The correlation coefficients (Rij) are b/y = 6.194, Vt/b = 3.312, Fr = 3.181, d50/b = 2.312, σg = 2.128, and V/Vc = 1.406 for the XGBoost model, and b/y = 6.242, Vt/b = 3.371, Fr = 3.225, d50/b = 2.350, σg = 2.150, and V/Vc = 1.426 for the SVM–PSO model. The XGBoost model demonstrates superior performance compared to the SVM–PSO model.
Comparison of the present models with the previous scour depth predictive models
Scatter plot of predicted vs. observed SDR for SVM–PSO (present model) and XGBoost (present model).
Scatter plot of predicted vs. observed SDR estimated using the existing empirical equations of different researchers and the present developed model.
Comparison of range-wise statistical error analysis of presently developed model results with previous existing scour depth models
The statistical error measures MAPE, RMSE, and R2 are calculated for the present XGBoost and SVM–PSO ML models and compared with those of four scour depth predictive models, namely Oliveto & Hager (2002), Lanca et al. (2013), Franzetti et al. (2022), and Tang et al. (2023), under different b/y, V/Vc, and Fr conditions. Evaluating the range-wise errors across the non-dimensional input parameters b/y, V/Vc, and Fr allows the efficacy of the presently developed models, i.e., XGBoost and SVM–PSO, to be assessed against the existing empirical equations.
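As a minimal sketch of how these three indices can be computed from observed and predicted SDR values, the functions below use the common textbook definitions. They assume that R2 is the coefficient of determination, 1 − SS_res/SS_tot; if the paper's own equations define the indices differently, those take precedence. The sample values are hypothetical.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error in percent (observed values must be non-zero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical SDR values, for illustration only (not data from the paper).
observed_sdr = [0.5, 1.0, 2.0]
predicted_sdr = [0.4, 1.1, 2.0]
scores = (mape(observed_sdr, predicted_sdr),
          rmse(observed_sdr, predicted_sdr),
          r2(observed_sdr, predicted_sdr))
```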
To perform the range-wise error analysis, the following ranges of the input parameters b/y, V/Vc, and Fr are selected:
(a) b/y ≤ 0.25, 0.25 < b/y ≤ 0.5, 0.5 < b/y ≤ 1.5, and b/y > 1.5
(b) V/Vc ≤ 0.25, 0.25 < V/Vc ≤ 0.50, 0.50 < V/Vc ≤ 0.75, and 0.75 < V/Vc ≤ 1.0
(c) Fr ≤ 0.2, 0.2 < Fr ≤ 0.4, 0.4 < Fr ≤ 0.6, and 0.6 < Fr ≤ 1.0
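The range-wise grouping can be sketched as follows. The bins are right-closed, matching ranges such as 0.25 < b/y ≤ 0.5. The function names and the sample records are hypothetical, and only MAPE is computed to keep the example short.

```python
from bisect import bisect_left

# Upper edges of the first three b/y ranges; the last range (b/y > 1.5) is open-ended.
B_Y_UPPER_EDGES = [0.25, 0.5, 1.5]
B_Y_LABELS = ["b/y <= 0.25", "0.25 < b/y <= 0.5", "0.5 < b/y <= 1.5", "b/y > 1.5"]


def bin_index(value, upper_edges):
    """Return the index of the right-closed range that contains `value`."""
    return bisect_left(upper_edges, value)


def rangewise_mape(records, upper_edges):
    """Group (parameter, observed, predicted) records by range and return MAPE (%) per range."""
    groups = {}
    for value, observed, predicted in records:
        error = abs((observed - predicted) / observed)
        groups.setdefault(bin_index(value, upper_edges), []).append(error)
    return {k: 100.0 * sum(v) / len(v) for k, v in sorted(groups.items())}


# Hypothetical records (b/y, observed SDR, predicted SDR), for illustration only.
records = [(0.2, 1.0, 0.9), (0.25, 2.0, 2.2), (0.4, 1.0, 1.0), (2.0, 0.5, 0.75)]
result = rangewise_mape(records, B_Y_UPPER_EDGES)
labelled = {B_Y_LABELS[k]: v for k, v in result.items()}
```

Ranges that contain no records are simply absent from the result; the same helper applies to the V/Vc and Fr ranges with their own edge lists.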
Tables 5–7 present the MAPE, RMSE, and R2 values for the XGBoost and SVM–PSO ML models, along with the four scour depth prediction equations, across the selected ranges of b/y, V/Vc, and Fr. For b/y ≤ 0.25 (Table 5), Tang et al. (2023) exhibit the highest MAPE value (115.89%), while the XGBoost model (present model) shows the lowest (22.23%). Tang et al. (2023) also show the highest RMSE (1.14), followed by Lanca et al. (2013) (0.66) and Oliveto & Hager (2002) (0.56), and the lowest R2 value (0.19). In the 0.25 < b/y ≤ 0.5 and 0.5 < b/y ≤ 1.5 ranges, Tang et al. (2023) again show the highest MAPE and RMSE values, while, for b/y > 1.5, Franzetti et al. (2022) have the highest MAPE, followed by Tang et al. (2023). The XGBoost model developed in this study outperforms the other models across all b/y ranges, with lower MAPE and RMSE values and higher R2 values.
Table 6 displays the performance metrics for the proposed XGBoost and SVM–PSO models, as well as the four scour depth prediction equations, within the selected V/Vc ranges. For V/Vc ≤ 0.25, Tang et al. (2023) exhibit high MAPE and RMSE values and a low R2 value. In the 0.25 < V/Vc ≤ 0.50 range, Lanca et al. (2013) and Franzetti et al. (2022) show comparable RMSE values (0.65 and 0.61), although their MAPE values differ (95.74 and 122.06%). The SVM–PSO model shows higher MAPE and RMSE values than the XGBoost model in every V/Vc range, although it attains a higher R2 value in the 0.75 < V/Vc ≤ 1.0 range (0.87 against 0.82). In all V/Vc ranges, Tang et al. (2023) demonstrate the highest error values, while the XGBoost model shows the lowest.
Table 7 reveals that, for Fr ≤ 0.2, the XGBoost model (present model) has the lowest MAPE (18.45%) and the highest R2 (0.88), with an RMSE (0.29) close to the lowest value, obtained by the SVM–PSO model (0.28). For the 0.4 < Fr ≤ 0.6 range, the SVM–PSO model (present model) exhibits the highest R2 value (0.91). For 0.6 < Fr ≤ 1.0, Franzetti et al. (2022) show the highest MAPE value among all models. Overall, the XGBoost model generally demonstrates higher R2 values and lower MAPE and RMSE values, while Tang et al. (2023) show lower R2 values and higher MAPE and RMSE values.
The study concludes that the XGBoost model demonstrates superior performance in both prediction accuracy and adaptability to varying flow and sediment conditions. In contrast, the previous empirical models produce unsatisfactory outcomes, with high errors and low R2 values.
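The range-wise comparison can be checked directly against the tabulated values. The sketch below takes the MAPE values (%) for the four b/y ranges as reported in Table 5 and picks the approach with the lowest and the highest MAPE in each range. The variable and function names are illustrative only.

```python
# MAPE (%) per b/y range, copied from Table 5.
B_Y_RANGES = ["b/y <= 0.25", "0.25 < b/y <= 0.5", "0.5 < b/y <= 1.5", "b/y > 1.5"]
MAPE_BY_APPROACH = {
    "XGBoost (present model)": [22.23, 13.52, 15.18, 14.35],
    "SVM-PSO (present model)": [47.29, 37.17, 32.58, 25.46],
    "Oliveto & Hager (2002)": [25.01, 101.25, 112.98, 95.25],
    "Lanca et al. (2013)": [44.64, 119.67, 125.76, 98.16],
    "Franzetti et al. (2022)": [93.55, 152.58, 176.15, 215.28],
    "Tang et al. (2023)": [115.89, 168.52, 250.59, 205.85],
}


def extremes_per_range(table, n_ranges):
    """Return (lowest-error approach, highest-error approach) for each range."""
    out = []
    for j in range(n_ranges):
        best = min(table, key=lambda name: table[name][j])
        worst = max(table, key=lambda name: table[name][j])
        out.append((best, worst))
    return out


ranking = extremes_per_range(MAPE_BY_APPROACH, len(B_Y_RANGES))
for label, (best, worst) in zip(B_Y_RANGES, ranking):
    print(f"{label}: lowest MAPE = {best}; highest MAPE = {worst}")
```

The same helper can be applied to the RMSE rows (lowest is best) or, with `min` and `max` swapped, to the R2 rows (highest is best).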
Calculation of the errors in various approaches for determining ds/y within the selected ranges of b/y
Different approaches | Metric | b/y ≤ 0.25 | 0.25 < b/y ≤ 0.5 | 0.5 < b/y ≤ 1.5 | b/y > 1.5 |
---|---|---|---|---|---|
XGBoost (present model) | MAPE (%) | 22.23 | 13.52 | 15.18 | 14.35 |
XGBoost (present model) | RMSE | 0.23 | 0.11 | 0.19 | 0.26 |
XGBoost (present model) | R2 | 0.88 | 0.89 | 0.94 | 0.92 |
SVM–PSO (present model) | MAPE (%) | 47.29 | 37.17 | 32.58 | 25.46 |
SVM–PSO (present model) | RMSE | 0.34 | 0.42 | 0.40 | 0.57 |
SVM–PSO (present model) | R2 | 0.76 | 0.69 | 0.78 | 0.81 |
Oliveto & Hager (2002) | MAPE (%) | 25.01 | 101.25 | 112.98 | 95.25 |
Oliveto & Hager (2002) | RMSE | 0.56 | 0.61 | 0.70 | 0.87 |
Oliveto & Hager (2002) | R2 | 0.50 | 0.34 | 0.21 | 0.32 |
Lanca et al. (2013) | MAPE (%) | 44.64 | 119.67 | 125.76 | 98.16 |
Lanca et al. (2013) | RMSE | 0.66 | 0.81 | 0.76 | 0.89 |
Lanca et al. (2013) | R2 | 0.47 | 0.32 | 0.20 | 0.31 |
Franzetti et al. (2022) | MAPE (%) | 93.55 | 152.58 | 176.15 | 215.28 |
Franzetti et al. (2022) | RMSE | 0.40 | 0.51 | 0.75 | 1.03 |
Franzetti et al. (2022) | R2 | 0.39 | 0.29 | 0.22 | 0.23 |
Tang et al. (2023) | MAPE (%) | 115.89 | 168.52 | 250.59 | 205.85 |
Tang et al. (2023) | RMSE | 1.14 | 1.58 | 2.65 | 1.98 |
Tang et al. (2023) | R2 | 0.19 | 0.23 | 0.16 | 0.18 |
Note. For each approach, the three rows give MAPE (%), RMSE, and R2, respectively.
Calculation of the errors in various approaches for determining ds/y within the selected ranges of V/Vc
Different approaches | Metric | V/Vc ≤ 0.25 | 0.25 < V/Vc ≤ 0.50 | 0.50 < V/Vc ≤ 0.75 | 0.75 < V/Vc ≤ 1.0 |
---|---|---|---|---|---|
XGBoost (present model) | MAPE (%) | 26.68 | 10.82 | 12.14 | 11.48 |
XGBoost (present model) | RMSE | 0.18 | 0.13 | 0.23 | 0.31 |
XGBoost (present model) | R2 | 0.86 | 0.91 | 0.93 | 0.82 |
SVM–PSO (present model) | MAPE (%) | 56.75 | 29.74 | 39.10 | 30.55 |
SVM–PSO (present model) | RMSE | 0.27 | 0.50 | 0.32 | 0.68 |
SVM–PSO (present model) | R2 | 0.81 | 0.85 | 0.91 | 0.87 |
Oliveto & Hager (2002) | MAPE (%) | 30.01 | 81.00 | 135.58 | 114.30 |
Oliveto & Hager (2002) | RMSE | 0.45 | 0.73 | 0.56 | 0.70 |
Oliveto & Hager (2002) | R2 | 0.60 | 0.27 | 0.25 | 0.26 |
Lanca et al. (2013) | MAPE (%) | 53.57 | 95.74 | 150.91 | 117.79 |
Lanca et al. (2013) | RMSE | 0.79 | 0.65 | 0.61 | 1.07 |
Lanca et al. (2013) | R2 | 0.38 | 0.38 | 0.24 | 0.25 |
Franzetti et al. (2022) | MAPE (%) | 74.84 | 122.06 | 211.38 | 172.22 |
Franzetti et al. (2022) | RMSE | 0.52 | 0.61 | 0.60 | 1.24 |
Franzetti et al. (2022) | R2 | 0.31 | 0.23 | 0.26 | 0.28 |
Tang et al. (2023) | MAPE (%) | 92.71 | 134.82 | 300.71 | 247.02 |
Tang et al. (2023) | RMSE | 1.37 | 1.90 | 3.18 | 2.38 |
Tang et al. (2023) | R2 | 0.23 | 0.18 | 0.13 | 0.22 |
Note. For each approach, the three rows give MAPE (%), RMSE, and R2, respectively.
Calculation of the errors in various approaches for determining ds/y within the selected ranges of Fr
Different approaches | Metric | Fr ≤ 0.2 | 0.2 < Fr ≤ 0.4 | 0.4 < Fr ≤ 0.6 | 0.6 < Fr ≤ 1.0 |
---|---|---|---|---|---|
XGBoost (present model) | MAPE (%) | 18.45 | 17.17 | 19.28 | 18.22 |
XGBoost (present model) | RMSE | 0.29 | 0.14 | 0.16 | 0.22 |
XGBoost (present model) | R2 | 0.88 | 0.94 | 0.89 | 0.89 |
SVM–PSO (present model) | MAPE (%) | 60.06 | 30.85 | 27.04 | 32.33 |
SVM–PSO (present model) | RMSE | 0.28 | 0.53 | 0.51 | 0.47 |
SVM–PSO (present model) | R2 | 0.87 | 0.77 | 0.91 | 0.77 |
Oliveto & Hager (2002) | MAPE (%) | 31.76 | 84.04 | 93.77 | 120.97 |
Oliveto & Hager (2002) | RMSE | 0.46 | 0.77 | 0.58 | 0.72 |
Oliveto & Hager (2002) | R2 | 0.42 | 0.28 | 0.27 | 0.27 |
Lanca et al. (2013) | MAPE (%) | 37.05 | 99.33 | 159.72 | 81.47 |
Lanca et al. (2013) | RMSE | 0.55 | 1.03 | 0.63 | 1.13 |
Lanca et al. (2013) | R2 | 0.39 | 0.27 | 0.25 | 0.39 |
Franzetti et al. (2022) | MAPE (%) | 77.65 | 193.78 | 146.20 | 273.41 |
Franzetti et al. (2022) | RMSE | 0.33 | 0.65 | 0.62 | 0.85 |
Franzetti et al. (2022) | R2 | 0.50 | 0.37 | 0.28 | 0.29 |
Tang et al. (2023) | MAPE (%) | 96.19 | 214.02 | 207.99 | 261.43 |
Tang et al. (2023) | RMSE | 1.45 | 1.31 | 2.20 | 2.51 |
Tang et al. (2023) | R2 | 0.24 | 0.19 | 0.20 | 0.15 |
Note. For each approach, the three rows give MAPE (%), RMSE, and R2, respectively.
A Taylor diagram illustrating the comparison of SDR predictions using XGBoost and SVM–PSO ML approaches and the previous empirical model.
A violin plot depicting the distribution of SDR values across various scour depth prediction models.
CONCLUSION
In the present study, 501 datasets for clear water scouring (CWS) under temporal conditions were collected and divided into 75% (375 datasets) for training and 25% (126 datasets) for testing. The following conclusions have been drawn from the present study:
The gamma test (GT) shows that the most significant input parameters for predicting the SDR (ds/y) around the bridge pier are the pier width b, flow depth y, approach mean velocity V, critical velocity Vc, mean sediment size d50, standard deviation of sediment σg, and time t. It is therefore recommended to use these input parameters when modelling the temporal clear water scour depth within the given range of datasets.
For CWS, the XGBoost model (present model) predicted the SDR (ds/y) better than the SVM–PSO model (present model), with an R2 value of more than 0.96, a MAPE value of less than 13.0%, and an RMSE of less than 0.30. It is therefore concluded that the XGBoost ML model better predicts the SDR under clear water scouring conditions for unsteady flow.
Among the existing empirical models under the CWS condition, the model by Oliveto & Hager (2002) predicted SDR values up to 1.6 better than the models by Lanca et al. (2013), Franzetti et al. (2022), and Tang et al. (2023). Hence, the empirical model by Oliveto & Hager (2002) can be used to predict the SDR under temporal conditions when the SDR value is up to 1.6.
In the 0.5 < b/y ≤ 1.5, 0.50 < V/Vc ≤ 0.75, and 0.2 < Fr ≤ 0.4 ranges, the XGBoost ML approach typically gave better results than the SVM–PSO model and the other existing scour depth predictive models. Across all the selected ranges, its R2 value was at least 0.82, and its MAPE and RMSE values did not exceed 26.68% and 0.31, respectively.
The effectiveness of empirical approaches is often compromised by the diversity of datasets gathered from various researchers, which introduces inconsistencies and discrepancies. Furthermore, the restricted applicability of empirical formulas to certain parameter ranges impedes accurate prediction and broader application.
It is important to note that the present ML approach is limited by the diversity of datasets utilized in the modelling process. Improved results can be achieved when the input parameter values fall within the ranges specified in Table 1. To improve the reliability of the model, it is essential to collect a wider range of field-based temporal clear water scouring datasets, considering the influence of different non-circular bridge pier shapes.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare that there are no conflicts of interest.