Abstract
Most bridge failures occur due to the development of scour holes around the abutment and pier. Therefore, accurate prediction of abutment scour depth is critical for designing and maintaining bridges to ensure their safety and longevity. Traditional methods for predicting abutment scour depth, such as empirical formulas and physical models, have accuracy, applicability, and cost limitations. Machine learning (ML), on the other hand, has the potential to overcome these limitations by leveraging large amounts of data and identifying complex patterns and relationships that are difficult to detect using traditional methods. ML models can be trained on various data sources, including field measurements, laboratory experiments, and numerical simulations, to predict abutment scour depth accurately. Therefore, the present study aims to develop a novel-tuned Custom ensemble ML model for predicting abutment scour depth in clear-water conditions. The proposed Custom ensemble model outperforms the ML models used to predict non-dimensional scour depth at abutments with an accuracy of 95.93%.
HIGHLIGHTS
A novel machine learning model is developed that captures the physics involved in abutment scour and accurately predicts scour depth at abutments.
The proposed Custom ensemble model outperforms the machine learning models used to predict non-dimensional scour depth at abutments with an accuracy of 95.93%.
ABBREVIATIONS
- AdaBoost
Adaptive boosting
- ANFIS
Adaptive-network-based fuzzy inference system
- ANN
Artificial neural network
- BA-Kstar
Bagging-Kstar
- CPU
Central processing unit
- DA-Kstar
Dagging-Kstar
- DT
Decision tree
- GBDT
Gradient boosting decision tree
- GEP
Gene-expression programming
- GMDH
Group method of data handling
- GridSearchCV
GridSearch cross-validation
- HWRE
Hydraulic and Water Resource Engineering Laboratory
- LightGBM
Light gradient boosting machine
- MAE
Mean absolute error
- ML
Machine learning
- RC-Kstar
Random Committee-Kstar
- RS-Kstar
Random Subspace-Kstar
- RMSE
Root mean square error
- STD
Standard deviation
- SVM
Support vector machine
- WIHW-Kstar
Weighted Instance Handler Wrapper-Kstar
- XGBoost
Extreme gradient boosting
INTRODUCTION
Bridges are among the most significant structures that humankind has constructed for safe transportation. A bridge provides safe passage over a river to link two distinct destinations. These structures built on waterways are essential not only to humankind but also to the economic growth of a country. The expense involved in the construction of a bridge is enormous, and the consequences caused by its failure are irreparable (Gazi et al. 2019; Kumar & Afzal 2022). Most bridge failures occur mainly due to scour development around the abutment and pier (Afzal et al. 2020; Gautam et al. 2021). In the scour phenomenon, sediment particles are removed from the riverbed around bridge abutments or piers. The scour development weakens the foundation, resulting in the bridge's collapse. Therefore, scouring around bridge abutments has garnered increased attention from researchers during the last several decades.
Various notable studies have been performed on local scour depth around bridge abutments (Laursen & Toch 1956; Wong 1982; Froehlich 1989; Dey & Barbhuiya 2004; Fael et al. 2006; Abou-Seida et al. 2012). First, Laursen & Toch (1956) conducted a set of experimental studies to examine scour around bridge piers and abutments. However, they could not describe the flow field behavior at the bridge pier and abutment due to the lack of the necessary instrumentation. Melville (1992) published a laboratory dataset of scour around bridge abutments with varying flow depth, abutment geometry, and alignment. He also investigated the influence of sediment characteristics on scour depth at abutments. Kwan & Melville (1994) performed an experimental investigation on scour at abutments and reported that the flow structures are dominated by a large primary vortex and its associated downflow. They also identified a secondary vortex, rotating counter to the primary vortex and occurring next to it.
Dey & Barbhuiya (2004) conducted experiments for local scour on the vertical wall, 45° wing-wall, and semi-circular abutments and determined equations of maximum equilibrium scour depth under clear-water scour conditions. Fael et al. (2006) studied vertical-wall abutment scour under clear-water conditions. Using regression analysis, they gave the morphology of the scour area in terms of volume and plan dimensions. Abou-Seida et al. (2012) developed equations to predict equilibrium scour patterns around vertical bridge abutments in cohesive soil. They also presented an equation to predict the development of scour depth with time. Barbhuiya & Mazumder (2014) performed local scour experiments using four uniform cohesionless sediment diameters and five vertical-wall abutments. They proposed an equation to calculate the scour depth values. Recently, Singh et al. (2020) studied experimental results of clear-water scour on a sand bed under short contractions. They proposed two analytical equations to calculate time-dependent scour depth and maximum scour at equilibrium conditions. Based on the above-discussed literature, it can be concluded that experimental techniques are expensive and time-consuming. Also, the experimental setup may be simpler than the actual field conditions, and therefore the resulting regression equation may fail in a real-world context. Several researchers also performed numerical investigations of sediment transport and the scour phenomenon (Afzal 2013; Afzal et al. 2015, 2020, 2021; Ahmad et al. 2015; Gautam et al. 2021). However, numerical simulation is also costly and time-consuming. Thus, machine learning (ML) tools have evolved recently and can be used as an alternative, reliable, and accurate tool that requires less money and time.
The ML approaches are user-friendly, precise, and accurately interpret missing data. With the advancement of ML's predictive capabilities, researchers increasingly use ML instead of conventional experimental methods (Dutta et al. 2020; Kumar et al. 2020, 2022). Muzzammil (2008) predicted scour depth at abutments using an artificial neural network (ANN). He discovered that the ANN approach outperformed traditional empirical equations. He also concluded that predictions based on raw (dimensional) data are superior to those based on non-dimensional parameters. Further, Muzzammil (2010) extended his work and used an adaptive-network-based fuzzy inference system (ANFIS) for scour depth prediction at abutments. He observed that the ANFIS model performs better than the ANN and conventional regression models. Azamathulla et al. (2010) estimated the abutment scour depth using gene-expression programming (GEP). They found that the GEP technique outperforms ANN and other traditional models.
Najafzadeh et al. (2013a, 2013b) used the group method of data handling (GMDH) to estimate abutment scour depth in cohesive soils and clear-water and live-bed situations. They found that the GMDH technique outperformed all other traditional models, including the support vector machine (SVM). Azimi et al. (2017, 2019) introduced an advanced ANFIS model called the Pareto-evolutionary structure of the ANFIS network, which outperformed the traditional ANFIS model. Parsaie et al. (2019) compared SVM, ANN, and ANFIS models and found that SVM had the best performance. Ebtehaj et al. (2018) used the extreme learning machine (ELM) algorithm and showed that it had faster training and better predictive ability than ANN and SVM. Bonakdari et al. (2020) also used the ELM method with four input parameters to predict scour depth in clear-water situations. Pandey et al. (2020) employed genetic algorithms to predict maximum scour depth and found that they outperformed multiple linear regression. Metaheuristic optimization algorithms such as grasshopper optimization algorithms (Kaveh et al. 2021) and firefly algorithms (Kohansarbaz et al. 2021) have been integrated with ANN and ANFIS to achieve higher prediction accuracy. Khosravi et al. (2021) used the Kstar model with five innovative hybrid algorithms of bagging (BA-Kstar), dagging (DA-Kstar), random committee (RC-Kstar), random subspace (RS-Kstar), and weighted instance handler wrapper (WIHW-Kstar) to estimate scour depth for clear-water conditions. They reported that the RC-Kstar model outperformed other models for scour depth prediction around semi-circular and 45° wing-wall abutments. In contrast, the WIHW-Kstar model had the best performance in scour depth prediction around the vertical-wall abutment. Recently, Xu et al. (2023) reviewed the use of ML tools in coastal bridge hydrodynamics.
The existing literature highlights the extensive use of ML approaches for predicting abutment scour depth in clear-water conditions. Therefore, this study aims to develop a new Custom ensemble model for the same purpose. The effectiveness of the developed model is evaluated by comparing it with various established models, such as decision tree (DT), AdaBoost, XGBoost, LightGBM, and the Muzzammil (2010) empirical equation. This study presents a new and improved approach to predicting abutment scour depth in clear-water conditions, utilizing a novel-tuned Custom ensemble ML model that has not been developed before.
MATERIALS AND METHODS
Dataset collection
The dataset of abutment scour under clear-water conditions was collected from the experimental study of Dey & Barbhuiya (2004). The dataset includes 297 runs conducted at the Hydraulic and Water Resource Engineering Laboratory (HWRE) at the Indian Institute of Technology Kharagpur, India. They used a flume 20 m long, 0.9 m wide, and 0.7 m deep to investigate scour at abutments under clear-water conditions. They used three different abutment geometries of various sizes: vertical wall, 45° wing-wall, and semi-circular wall. The abutments were attached to the side wall of the flume and embedded in a 0.3 m deep sediment bed. They performed these experiments using sand of d50 ranging from 0.26 to 3.18 mm.
Dimensional analysis
Statistical analysis
Using input parameters in ML models that match laboratory conditions is essential to achieve optimal results. To develop models for predicting scour at abutments under clear-water conditions, a total of 297 data points were collected from Dey & Barbhuiya (2004). Based on dimensional analysis, the non-dimensional parameters ds/l, d50/l, d50/h, h/l, and Fe were then computed from these data. Table 1 lists the minimum, maximum, mean, standard deviation, skewness, and kurtosis of each parameter.
Parameter | Min | Max | Mean | Std. deviation | Skewness | Kurtosis |
---|---|---|---|---|---|---|
ds/l | 0.64615 | 4.35 | 1.80671 | 0.661819 | 0.846851 | 0.5696 |
d50/l | 0.002 | 0.0775 | 0.01361 | 0.013084 | 2.147396 | 5.6943 |
d50/h | 0.00104 | 0.02 | 0.00652 | 0.005211 | 1.115402 | 0.24257 |
h/l | 0.38462 | 6.25 | 2.38072 | 1.406865 | 1.02322 | 0.58423 |
Fe | 0.05588 | 0.39442 | 0.13754 | 0.057534 | 1.369378 | 2.11166 |
Table 1 presents five parameters: ds/l, d50/l, d50/h, h/l, and Fe. For each parameter, the minimum, maximum, mean, and standard deviation are reported (Ahani et al. 2020a, 2020b; Ahani et al. 2022). The skewness and kurtosis values also provide insights into the shape of the distributions. The ds/l, h/l, and Fe parameters are positively skewed, indicating more values on the lower end of their ranges. The d50/l and d50/h parameters have heavily skewed distributions to the right. The ds/l and h/l parameters have less kurtosis than the normal distribution. In comparison, d50/l and Fe have more kurtosis than the normal distribution. However, the d50/h parameter has the lowest kurtosis value, meaning fewer outliers than a normal distribution.
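The descriptive statistics of the kind listed in Table 1 can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' procedure: the sample values are hypothetical, and the kurtosis is computed as excess kurtosis (zero for a normal distribution), which is an assumption because the text does not state which convention Table 1 follows.

```python
import math

def describe(values):
    """Return min, max, mean, sample std, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments in population form (divided by n)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),     # sample standard deviation
        "skewness": m3 / m2 ** 1.5,              # > 0 means a longer right tail
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,   # 0 for a normal distribution
    }

# Hypothetical ds/l-like values, NOT taken from the dataset
sample = [0.7, 0.9, 1.2, 1.5, 1.6, 1.8, 2.0, 2.4, 3.1, 4.3]
stats = describe(sample)
```

A positive `skewness` here corresponds to the right-skewed behaviour described for the parameters in Table 1; statistical packages may apply small-sample bias corrections, so their values can differ slightly from this plain moment-based form.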
Methodology
ML models
The present study uses ensemble-based models to predict abutment scour depth. The DT, adaptive boosting regressor (AdaBoost), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and a novel Custom ensemble model are developed to predict abutment scour depth.
Decision tree
Adaptive boosting regressor
The AdaBoost technique is less prone to overfitting for noiseless datasets, which suggests that the present model would be robust to overfitting because the dataset is noiseless and contains relevant input features.
Extreme gradient boosting
However, XGBoost differs from GBDT techniques in specific ways. First, the GBDT technique only uses the first-order Taylor expansion, whereas XGBoost extends the loss function with a second-order Taylor expansion. Second, a regularization term is included in the objective function to avoid overfitting and lower the model's complexity. It is a method of selection that includes both embedded and filtered features.
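For reference, the second-order expansion mentioned above is usually written as follows. This is the standard form from the general XGBoost literature, not an equation taken from this study; the symbols (loss l, tree f_t added at iteration t, gradient g_i, Hessian h_i, number of leaves T, leaf weights w, and penalty coefficients γ and λ) are introduced here only for illustration.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}\left[\, l\!\left(y_i,\hat{y}_i^{(t-1)}\right)
  + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right] + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\!\left(y_i,\hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\!\left(y_i,\hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^{2}},
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \lVert w \rVert^{2}
```

Dropping the h_i term recovers the first-order view attributed to GBDT above, and Ω is the term that penalizes model complexity in the objective.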
Light gradient boosting machine
The LightGBM technique is based on GBDT algorithms, which combine weak learners to form a strong learner. The GBDT method uses a DT that is slightly different from a standard DT: the results and residuals of the previous trees are recorded and utilized in the following learning stage. The final output is calculated by combining the findings of multiple trees (Friedman 2001). The GBDT is extensively used worldwide due to its prediction capabilities. However, its accuracy and efficiency have recently suffered from the massive increase in data. The benefit of implementing the LightGBM regressor is that it enhances forecasting performance while also lowering memory usage without sacrificing prediction power. Unlike standard GBDT, it employs a histogram method. In traditional GBDT, the optimal segmentation point of a DT is located by sorting the feature values and enumerating all candidate split points, which uses a lot of memory and computation time. In the histogram method, the continuous feature values are instead partitioned into k intervals, and the division points are chosen among these k values. The LightGBM algorithm also employs a leaf-wise growth approach. Compared with traditional depth-wise or level-wise growth, leaf-wise growth can reduce the loss more when growing the same number of leaves. Moreover, an additional parameter is employed to restrict the depth of the DT, preventing overfitting.
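The idea of restricting split candidates to k intervals can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: it uses equal-width bins and a squared-error criterion on hypothetical data, whereas the actual LightGBM implementation accumulates gradient statistics per bin. The function names are hypothetical.

```python
def bin_edges(values, k):
    """Split the range of a continuous feature into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]  # k - 1 interior edges

def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)

def best_histogram_split(x, y, k=4):
    """Pick the bin edge that minimises the summed squared error of both sides."""
    best_edge, best_loss = None, float("inf")
    for edge in bin_edges(x, k):          # only k - 1 candidates are examined
        left = [yi for xi, yi in zip(x, y) if xi <= edge]
        right = [yi for xi, yi in zip(x, y) if xi > edge]
        if not left or not right:
            continue
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_edge, best_loss = edge, loss
    return best_edge, best_loss

# Hypothetical feature/target pairs with a jump between 0.4 and 0.6
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [1.0, 1.1, 0.9, 1.0, 2.0, 2.1, 1.9, 2.0]
edge, loss = best_histogram_split(x, y, k=4)
```

With k = 4 only three candidate edges are evaluated instead of every gap between sorted feature values, which is where the memory and time savings come from.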
Custom ensemble model
Further, the weights were estimated by running the Voting Regressor in a loop with different weights assigned to each predictor, and the corresponding R2 scores were calculated. The combination that produced the highest score was selected, resulting in a weight of 4 for Gradient Boosting and 1 for AdaBoost, ANN, and XGBoost models. The dataset was fitted to the ensemble model with these weights, and predictions were obtained.
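The weight search described above can be sketched with scikit-learn's `VotingRegressor`. The following is a minimal sketch under several assumptions, not the authors' code: the data are synthetic, the candidate weights and member settings are hypothetical, XGBoost is left out so that the example depends on scikit-learn only, and scoring on the held-out split is a choice made here because the text does not say which data the R2 scores were computed on.

```python
from itertools import product

import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              VotingRegressor)
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in data (4 inputs, 1 target); NOT the scour dataset
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(150, 4))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] * X[:, 3]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

def make_members():
    """Fresh, unfitted ensemble members (XGBoost omitted in this sketch)."""
    return [
        ("gb", GradientBoostingRegressor(n_estimators=50, random_state=0)),
        ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
        ("ann", MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                             random_state=0)),
    ]

best_weights, best_r2 = None, -np.inf
for weights in product([1, 4], repeat=3):      # hypothetical candidate weights
    model = VotingRegressor(make_members(), weights=list(weights))
    model.fit(X_train, y_train)
    score = r2_score(y_test, model.predict(X_test))
    if score > best_r2:                        # keep the best-scoring combination
        best_weights, best_r2 = weights, score
```

The prediction of a `VotingRegressor` is the weighted average of its members' predictions, so a weight of 4 on one member means it contributes four times as much as a member with weight 1.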
The Ensemble ML models are used in the present study to predict abutment scour in clear-water conditions because they offer several advantages over traditional ML models (ANN and SVM). Ensemble models combine multiple models, making them more robust and reducing the risk of overfitting. They can also handle complex data sets and provide more accurate predictions than a single model. In addition, ensemble models can incorporate different algorithms and techniques, such as DT, random forests, and gradient boosting, to leverage the strengths of each model and improve the overall prediction accuracy. Furthermore, ensemble models can identify the essential features that contribute to the prediction of abutment scour, which can be used to enhance the design and maintenance of hydraulic and coastal structures. The Ensemble ML models have been widely used in hydraulics and coastal engineering to predict several hydraulics and coastal engineering parameters. These models can provide valuable insights into the behavior of hydraulic and coastal systems, informing the safe design and operation of coastal and hydraulic structures.
Tuning of ML models
With their default hyperparameters, the ML models in the present study predicted the non-dimensional abutment scour depth with low accuracy. Thus, the hyperparameters had to be tuned to improve accuracy. All the models were tuned using the GridSearchCV function. A dictionary containing several candidate values for each hyperparameter is passed to the function. The function then tries all combinations of the hyperparameters, fits the model with each combination, and evaluates its performance by cross-validation. Finally, the set of hyperparameter values that yields the best performance is selected. The default and the tuned hyperparameters are presented in Table 2.
Models | Default hyperparameters | Tuned hyperparameters |
---|---|---|
DT | criterion = 'squared_error' | criterion = 'friedman_mse' |
AdaBoost | loss = 'linear'; n_estimators = 50 | loss = 'square'; n_estimators = 100 |
XGBoost | learning_rate = 0.3; max_depth = 6; min_child_weight = 1 | learning_rate = 0.1; max_depth = 4; min_child_weight = 2 |
Model performance assessment
RESULTS AND DISCUSSION
The tree-based ML techniques DT, AdaBoost, XGBoost, and LightGBM and the Custom ensemble model were developed to estimate abutment scour depth in clear-water conditions. The ML models' performance in predicting the abutment scour depth was examined using the dataset of Dey & Barbhuiya (2004).
Sensitivity analysis
Abutment scour depth prediction using ML techniques
The study compared the performance of several ML models in predicting abutment scour depth in clear-water conditions. The results showed that the novel-tuned Custom ensemble model outperformed the other models in terms of accuracy. The DT, AdaBoost, XGBoost, and LightGBM models offer adequate abutment scour prediction accuracy. The scatter plots indicate that the novel-tuned Custom ensemble and DT models predict the abutment scour depth with the least and the most scatter, respectively. The Custom ensemble model scatter plot also showed that the bias and slope are close to 0 and 1, respectively. The developed Custom ensemble model showed the highest coefficient of determination (R2 = 0.9593), whereas the DT has the lowest (R2 = 0.8963) for predicting abutment scour depth in clear-water conditions. The other ensemble models also provide results comparable to the developed Custom ensemble model. The XGBoost and AdaBoost models predict abutment scour depth with coefficients of determination of 0.9391 and 0.9355, respectively, followed by the LightGBM model with a coefficient of determination of 0.9291.
Abutment scour depth estimation using empirical formulations
It can be observed that the abutment scour depth predicted by the Muzzammil (2010) formulation shows good agreement with the experimental results. The Muzzammil (2010) formulation predicts abutment scour depth with an accuracy of 0.7279, lower than all the ensemble models used in the present study. The scatter plot of the Muzzammil (2010) formulation also shows the maximum scatter in comparison with the ML models.
Comparison of ML models with Muzzammil (2010) formulation
The performance metrics of ML techniques and Muzzammil's (2010) formulation are compared to each other for training and testing datasets, shown in Table 3.
Model | Training R2 | Training RMSE | Training MAE | Testing R2 | Testing RMSE | Testing MAE |
---|---|---|---|---|---|---|
DT | 0.9644 | 0.1247 | 0.0718 | 0.8963 | 0.2099 | 0.1497 |
AdaBoost | 0.9533 | 0.1429 | 0.1216 | 0.9355 | 0.1656 | 0.1434 |
XGBoost | 0.9679 | 0.1185 | 0.0888 | 0.9391 | 0.1609 | 0.1306 |
LightGBM | 0.9721 | 0.1105 | 0.0801 | 0.9291 | 0.1735 | 0.1319 |
Custom ensemble | 0.9898 | 0.0668 | 0.0536 | 0.9593 | 0.1318 | 0.1073 |
Muzzammil (2010) | 0.7605 | 0.3236 | 0.2576 | 0.7229 | 0.3432 | 0.2761 |
Note: Bold indicates the best performance for each model.
Table 3 compares the performance metrics of the different models in predicting abutment scour depth. The models' performance is evaluated based on three evaluation metrics: coefficient of determination (R2), root mean squared error (RMSE), and mean absolute error (MAE) for both training and testing datasets. The DT, AdaBoost, XGBoost, LightGBM, Custom ensemble, and Muzzammil (2010) prediction results show that the novel Custom ensemble model achieved the best performance with the highest R2 value of 0.9898, lowest RMSE value of 0.0668, and lowest MAE value of 0.0536 for the training dataset. For the testing dataset, the novel Custom ensemble model also achieved the highest R2 value of 0.9593, lowest RMSE value of 0.1318, and lowest MAE value of 0.1073. These results indicate that the Custom ensemble model has the best predictive performance for abutment scour depth compared to the other models.
The XGBoost and LightGBM models also performed well in predicting abutment scour depth and were comparable to the novel Custom ensemble model. Among the ML models, AdaBoost had the lowest training R2 (0.9533), whereas the DT had the lowest testing R2 (0.8963) and the largest drop between training and testing. The Muzzammil (2010) model performed the worst among all the models, with the lowest R2, highest RMSE, and highest MAE values for both the training and testing datasets. This indicates that the ML models outperform the traditional model in predicting abutment scour depth.
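The three metrics reported in Table 3 follow their usual definitions, which can be written out directly. The sketch below uses hypothetical observed and predicted values, not results from the study, and the function names are illustrative.

```python
import math

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical observed and predicted ds/l values
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = (r2(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

A higher R2 and lower RMSE and MAE indicate a better fit; RMSE penalizes large individual errors more strongly than MAE.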
The abutment scour dataset was divided into two parts to validate the ML models. The first set (70% of the data) was used to train the ML models. The remaining 30% of the abutment scour dataset was then used as validation data against the predictions of the developed ML models. This is a standard practice in ML (i.e., testing) and is considered equivalent to the validation of a numerical model (Afzal et al. 2023). Further, the results were compared with the Muzzammil (2010) empirical formulation, and the ML models were found to predict with higher accuracy.
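A 70/30 split of a 297-row dataset can be sketched with scikit-learn's `train_test_split`. The arrays below are random placeholders with the same number of rows as the dataset, and the shuffling and `random_state` value are assumptions, since the text does not state how the rows were assigned to each part.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with 297 rows and 4 inputs; NOT the scour dataset
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(297, 4))
y = rng.uniform(0.5, 4.5, size=297)

# 70% of the rows for training, the remaining 30% held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)   # random_state is a hypothetical choice

n_train, n_test = len(X_train), len(X_test)
```

With 297 rows and `test_size=0.3`, scikit-learn rounds the test part up, giving 90 test rows and 207 training rows.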
CONCLUSION
Bridge piers and abutments interact with the approaching flow and cause scour around them, which may lead to the failure of the bridge structure. Several experimental investigations have been performed to estimate scour at bridge abutments. However, the accuracy of the estimated abutment scour depth may be affected by the assumptions introduced in the experimental study. The empirical formulations developed using conventional regression methods have limitations and may not accurately capture the complex nature of the scour phenomenon. Therefore, the ML approach is introduced to estimate abutment scour depth accurately in clear-water conditions.
This study uses DT, AdaBoost, XGBoost, LightGBM, and a novel Custom ensemble technique to estimate abutment scour depth. The performance metrics of the ensemble models were compared with those of a single DT and the Muzzammil (2010) formulation. All the ensemble models used in the present study predict with higher accuracy than a single DT and the Muzzammil (2010) formulation. The highest value of R2 (0.9593) on the testing dataset signifies that the developed Custom ensemble model outperforms the other models used in the present study. The Custom ensemble model also has the lowest RMSE and MAE values of 0.1318 and 0.1073, respectively. Therefore, it can be concluded that the Custom ensemble model and ML algorithms can be used as reliable design tools for predicting abutment scour depth in clear-water conditions.
Using a Custom ensemble ML model to predict abutment scour depth has multiple advantages, such as improved accuracy, robustness to data variations, faster analysis, identification of potential risks, and incorporation of multiple data sources for scalability. This can lead to improved public safety, reduced infrastructure failures, and proactive measures to prevent damage.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.