## ABSTRACT

Forecasting the time-dependent scour depth (*d*_{st}) is very important for the protection of bridge structures. Since scour is the result of a complicated interaction between structure, sediment, and flow, empirical equations cannot guarantee high accuracy, although they preserve the merit of being straightforward and physically insightful. In this article, we propose three ensemble machine learning methods to forecast the time-dependent scour depth at piers: extreme gradient boosting regressor (XGBR), random forest regressor (RFR), and extra trees regressor (ETR). These models predict the scour depth at a given time, *d*_{st}, based on the following main variables: the median grain size, *d*_{50}, the sediment gradation, *σ*_{g}, the approach flow velocity, *U*, the approach flow depth, *y*, the pier diameter, *D*_{p}, and the time, *t*. A total of 555 data points from different studies have been used for this research work. The results indicate that all the proposed models precisely estimate the time-dependent scour depth. However, the XGBR method performs better than the other methods, with *R* = 0.97, NSE = 0.93, AI = 0.98, and CRMSE = 0.09 at the testing stage. Sensitivity analysis shows that the time-dependent scour depth is highly influenced by the time scale.

## HIGHLIGHTS

A precise estimate of the scour depth fosters the structural integrity of a bridge.

Novel ensemble machine learning approach to forecast the time-dependent scour depth at piers is presented.

Non-dimensional parameters were used as the independent variables.

Excellent results were obtained for forecasts based on real-world scenarios reported in the literature.

## INTRODUCTION

Many bridges have been built to facilitate the swift circulation of people and supplies, which contributes to the economic development of an area. Approximately 575,000 bridges are built over waterways in the United States (Richardson & Lagasse 1996). However, a bridge collapse is a disastrous event, leading to severe property damage and loss of life. A bridge pier acts as a substructure that supports the superstructure. One of the main causes of numerous bridge collapses worldwide is scouring around the bridge pier. The pier alters the stream flow patterns, which leads to vortex formation around it. These vortices clear sediment from the stream bed, resulting in the formation of scour holes.

Due to the extremely high construction and maintenance costs, the impact of the scouring phenomenon around bridge piers has always been a major concern for the safe and economical design of a bridge. A precise estimation of scour depth is necessary for the secure design of any bridge. Scouring is the action of flowing water that removes and carries away bed material from the bed and foundation of a hydraulic structure. Generally, scour happens during a high-flow event. Flooding increases the approach velocity of the flow, which can exceed the critical velocity of the bed sediment and foundation material. The increased approach flow velocity may result in higher sediment entrainment, and the consequent erosion/deposition of the bed material may lead to scouring around bridge piers. Scour depths can reach 4.5–6 m during extreme hydrologic events (Sturm *et al.* 2016).

The maximum scour depth and the rate of its development depend on the bed material around the base of the bridge. Different bed materials take different durations to reach the maximum scour depth: sand and gravel beds take hours, cohesive beds take days, and sandstones, shales, and glacial tills require months. It takes years for limestones and centuries for dense granite to reach the equilibrium depth of scour (Arneson *et al.* 2012). The bridge pier obstructs the flow of the stream, causing the velocity to drop to zero in certain areas. This produces a downward pressure gradient and a downward flow of water, which erodes the riverbed and creates scour holes. The primary vortex is the horseshoe vortex, which forms as a result of this downflow. A secondary vortex, known as the wake vortex, forms downstream of the pier as a result of flow separation.

The relationship between scour hole development and the time taken in the process must be understood to prevent the failure of a bridge. Figure 2 illustrates how the depth of scour develops over time. During the initial stage of scouring, the scour depth increases rapidly with time (*t*_{1}). As the process enters the intermediate stage, the scour depth increases gradually with time (*t*_{2}). Finally, the scour depth reaches an equilibrium stage at time *t*_{e}, beyond which it does not vary with time.

Experimental investigations on scour depth around bridge piers started in the early 1950s. Many researchers performed experiments and proposed empirical equations for scour depth prediction. Most of them used dimensional analysis and assumed simplified conditions such as uniform and non-cohesive bed materials, constant flow depth, and steady clear-water flow to study the scouring phenomenon. Furthermore, a number of researchers found that using field scour data could improve the accuracy of scour depth forecasts. However, due to complexities in natural river flow, these methods could not precisely predict the equilibrium scour depth. Artificial intelligence has become a popular tool in recent years to help overcome the limitations of theoretical and experimental estimations. Many machine learning (ML) methods have been deployed to solve problems such as Froude number prediction, rainfall-runoff prediction, sediment transport, and scour depth prediction (Shakya *et al.* 2022a, 2022b; Zhai *et al.* 2022). Numerous investigators have predicted the scour depth at the equilibrium stage in stable clear-water environments (Uyumaz *et al.* 2006; Yang *et al.* 2020). Predicting the local scour depth has also been the subject of numerous investigations (Melville 1975; Yanmaz & Altinbilek 1991; Nazariha 1996; Euler & Herget 2012; Zhou *et al.* 2020). To the best of our knowledge, however, very few studies have used both laboratory and field datasets to anticipate the time-dependent scour depth.

Lim & Cheng (1998) introduced a semi-empirical equation for estimating the time-averaged equilibrium scour depth at 45° wing-wall abutments under live-bed conditions. Oliveto & Hager (2002) provided an empirical equation for temporal scour development that considers the effects of viscosity. Mia & Nago (2003) designed a conditional method to forecast time-dependent local scour at a circular bridge pier using the sediment transport theory of Yalin (1977). Ballio *et al.* (2010) examined the time evolution of local scour at a vertical wall abutment through experimentation. Guo *et al.* (2010) provided a semi-empirical equation to forecast time-dependent scour depth from an experimental flume dataset under bridge-submerged flow conditions. This equation has the capability to reduce the required depth for bridge scour design, aligning with the design flow and peak flow period, which could result in substantial cost savings in the construction of bridge foundations.

The M5 model tree (MT) has been applied to several dimensional datasets to forecast scour depth, and the results have been compared with empirical equations; a closer match was found between the M5 model and the backpropagation neural network (Pal *et al.* 2012). Neerukatti *et al.* (2013) discussed a Gaussian process model with Bayesian uncertainty to forecast scour depth using four different flume datasets under multiple conditions. This Gaussian process model works well with fewer training data and equilibrium conditions. Choi & Choi (2016) proposed an empirical equation for predicting scour depth under uniform bed sediment and negligible bedform effects. Aksoy *et al.* (2017) experimentally studied time-dependent scour depth under clear-water conditions considering circular piers of four different diameters. They obtained a relationship of scour depth with the critical velocity, the geometry of the bridge pier, and hydraulic parameters. Ebtehaj *et al.* (2017) introduced scour depth prediction around four different bridge pier shapes using a self-adaptive extreme learning machine (SAELM) and compared the results with an artificial neural network (ANN) and a support vector regressor (SVR).

Sattar *et al.* (2018) introduced a method for predicting the maximum scour depth downstream of grade control structures (GCS) by leveraging a more comprehensive dataset and utilizing evolutionary gene expression programming (GEP). Pandey *et al.* (2020) studied the effect of collar diameter on time-dependent scour depth and proposed an empirical equation to forecast the scour depth around piers protected by collars. Dang *et al.* (2021) introduced particle swarm optimization (PSO) and firefly algorithms to enhance ANN models for predicting scour depths around circular piers at the equilibrium stage. Tao *et al.* (2021) combined XGBR with a genetic algorithm (GA) for predicting scour depths under submerged weirs.

Khosravi *et al.* (2021) presented a prediction of abutment scour using the standalone KStar model and five novel hybrid algorithms, bagging (BA-KStar), dagging (DA-KStar), random committee (RC-KStar), random subspace (RS-KStar), and weighted instance handler wrapper (WIHW-KStar), for clear-water conditions. Ahmadianfar *et al.* (2021) predicted the scour depth around non-uniformly spaced pile groups (PGs) using Gaussian process regression (GPR), random forest (RF), M5P, and the least-squares (LS) boosting algorithm.

Najafzadeh & Sheikhpour (2024) used data-driven and ML models to predict the local scour depth around PGs caused by waves. Models such as multivariate adaptive regression splines (MARS), evolutionary polynomial regression (EPR), and MT were compared using available datasets. MARS was found to be the most effective method for predicting the scour depth of marine structures based on their analysis.

Najafzadeh & Oliveto (2021) explored scour around PGs under steady currents using artificial intelligence models like EPR, GEP, MARS, and M5P. MARS exhibited superior performance in comparison to other models. Sensitivity analysis identified the ratio between the approach flow velocity (*U*) and the flow velocity (*Uc*) at the inception of sediment motion as the most influential parameter, while the ratio between the number of piles inline with the flow (*m*) and the number of piles normal to the flow (*n*) showed an opposite trend in scour depth.

Homaei & Najafzadeh (2020) evaluated the reliability of scouring depth around marine structure piles using Monte-Carlo sampling. They concluded that scouring phenomena should be considered for marine structures when they are subjected to waves with short intervals. This is because there is a greater likelihood that the scour depth around the piles will exceed the safe limit under these conditions.

Najafzadeh *et al.* (2018a) proposed using neuro-fuzzy group method of data handling (NF-GMDH)-based self-organized models to evaluate pier scour depth under debris effects. They enhanced the NF-GMDH networks by combining them with GA, PSO, and the gravitational search algorithm (GSA). Regarding scour depth prediction, the NF-GMDH-PSO model outperformed the NF-GMDH-GA and NF-GMDH-GSA models.

Najafzadeh *et al.* (2016) used MT-based formulations, EPR, and GEP to precisely predict the depth of pier scour in the presence of debris accumulation. Utilizing parameters such as debris characteristics, flow conditions, bed sediment properties, and bridge pier geometry, laboratory datasets from different literature sources were used for evaluation. Compared with empirical equations, the MT, EPR, and GEP models, as explicit equations for scour depth evaluation, have been demonstrated to be more practically efficient and more accurate.

From the above literature, we found that the following research gap exists in the current studies on scour depth prediction. In the prediction of time-dependent scour depth, many researchers used either experimental flume data or field data, or a mix of both. To the best of the authors' knowledge, few studies have applied ensemble ML methods to predict time-dependent scour depth using a dataset collected from various sources. In addition, many studies do not provide a sensitivity analysis, which helps to determine the parameter most affecting the time-dependent scour depth.

ML and artificial intelligence are widely used these days for prediction purposes due to their ability to map complex non-linear patterns. For example, ML was used by Su *et al.* (2021) for analog circuit fault diagnosis and by Yang *et al.* (2022) in a new intelligent bearing fault diagnosis model. ML models also help solve complex medical science problems, such as detecting pre-symptomatic COVID-19 (Cho *et al.* 2022). Bisgin *et al.* (2018) used ML to identify food-contaminating beetles. Hosseiny *et al.* (2020) proposed a framework for predicting flood depth with a hybrid of hydraulics and ML. Bui *et al.* (2018) used novel hybrid ML models for spatial prediction of floods, and Yousefi *et al.* (2020) used them to assess the susceptibility of schools to flood events in Iran.

Hassan *et al.* (2022) formulated a novel empirical equation and models for predicting the extent of scouring around bridge piers, a critical factor in bridge failure. Three methodologies, namely statistical non-linear regression (NLR), GEP, and ANN, were employed. Among these, the ANN model demonstrated superior accuracy compared with the other two. A sensitivity study revealed that the flow depth had the biggest impact on the scour depth forecasts. Mampitiya *et al.* (2024a) presented a forecasting approach tailored to Particulate Matter 10 (PM10) for predefined areas. After eight models were evaluated, an ensemble model outperformed seven recognized state-of-the-art models. This study highlights the potential of projects aimed at predicting environmental characteristics particular to a given place, underscoring the importance of ML methods in modern technology.

Mampitiya *et al.* (2024b) examined the effectiveness of eight ML models (Ensemble, LSTM, Bi-LSTM, GRU, ANN, LightGBM, CatBoost, and XGBoost) in predicting particulate matter in two urbanized areas in Sri Lanka. The findings indicate that the Ensemble model outperformed all other models in providing accurate and precise predictions for PM10, making it a recommended model for future investigation in Sri Lanka, a country facing high air pollution levels. Sewer network planners use optimization techniques to control urban wastewater systems, aiming to reduce combined sewer overflows (CSOs) and protect aquatic life. However, capacity limitations and climate variability make control difficult. Rathnayake & Anwar (2019) presented an enhanced optimal control algorithm considering runoff variations and stormwater quality. The algorithm aims to minimize pollution load and wastewater treatment costs. Successfully applied to a Liverpool combined sewer network, the model can handle dynamic control settings and be used in real-time control. These different approaches show that ML algorithms can be used to solve a wide range of prediction problems.

This article makes the following significant contributions: (i) It improves the prediction of time-dependent local scour at piers using the ensemble methods extreme gradient boosting regressor (XGBR), random forest regressor (RFR), and extra trees regressor (ETR). Although a few ensemble techniques, such as the bagging regressor (BR) and AdaBoost regressor (ABR), have been applied to the prediction of time-dependent scour depth, to the best of our knowledge XGBR, RFR, and ETR have not been used for this purpose. In addition, the proposed ensemble techniques surpass the best-performing models of Kumar *et al.* (2023). (ii) The data for this study were taken from three existing datasets (Kothyari 1989; Chang *et al.* 2004; Oliveto & Hager 2005). These datasets consider various factors, such as flow properties around a pier, riverbed properties, bridge pier geometry, and time, which are employed to forecast time-dependent scour depths. The models can be generalized by converting these variables to dimensionless forms. (iii) Different statistical criteria are used to compare the efficacy of the different ensemble ML methods. We have compared the proposed methodology with the current state-of-the-art methods used on the same dataset, as well as with empirical equations.

## FRAMEWORK

### Dimensional analysis and functional relationships

Time-dependent scour depth, *d*_{st}, at bridge piers depends on multiple variables related to bed properties, flow properties around the pier, pier geometry, and time. Specifically, these variables include: *D*_{p}, the pier diameter; *U*, the approach flow velocity; *y*, the approach flow depth; *d*_{50}, the sediment median grain size; *σ*_{g}, the sediment gradation; and *t*, the time. For forecasting time-dependent scour depths at bridge piers, several researchers have proposed empirical equations. Some of the more popular empirical equations are listed in Table 1, where *F*_{d} = *U*/(Δ*g* *d*_{50})^{1/2} is the densimetric Froude number, *y*/*D*_{p} is the relative flow depth, and *d*_{50}/*D*_{p} is the relative grain size. Moreover, Δ = (*ρ*_{s} − *ρ*)/*ρ*, with *g* being the acceleration due to gravity, *ρ*_{s} the sediment density, and *ρ* the water density.
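For illustration, assuming the standard definition of the densimetric Froude number, *F*_{d} = *U*/(Δ*g* *d*_{50})^{1/2}, the group can be computed as in this minimal sketch (the parameter values are illustrative, not taken from the study's data):

```python
import math

def densimetric_froude(U, d50, rho_s=2650.0, rho=1000.0, g=9.81):
    """Densimetric Froude number F_d = U / sqrt(Delta * g * d50),
    with Delta = (rho_s - rho) / rho (relative submerged density)."""
    delta = (rho_s - rho) / rho
    return U / math.sqrt(delta * g * d50)

# Example: U = 0.5 m/s approach velocity, d50 = 1 mm quartz sand
Fd = densimetric_froude(U=0.5, d50=0.001)
```

The default densities correspond to quartz sediment in water (Δ ≈ 1.65), a common assumption in flume studies.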

| Source | Equation |
|---|---|
| Oliveto & Hager (2002) | |
| Johnson (1992) | |
| Ettema *et al.* (2011) | |
| Richardson & Davis (2001) | |


where *n* ∈ {1, 2, 3} and the three time scale factors *T*_{1}, *T*_{2}, and *T*_{3} are defined as:

Based on the functional relationship (1), we formulate three different models to forecast the time-dependent scour depth at piers by considering the three different time scale factors *T*_{1}, *T*_{2}, and *T*_{3}. Using these three models makes the analysis more comprehensive by exploring which time scale factor is the most suitable.

### Proposed framework

### Dataset collection and preparation

For this study, we compiled experimental datasets from three sources: 46 rows of data from Kothyari (1989), 438 rows from Chang *et al.* (2004), and 71 rows from Oliveto & Hager (2005). Upon compilation, we have a dataset comprising 555 rows and six features (columns).

After dataset collection, we split the dataset into train and test sets to train our proposed methods and evaluate model efficacy on the test set. Here, we select about 52% of the total data for training, i.e., 24 rows from Kothyari (1989), 233 rows from Chang *et al.* (2004), and 33 rows from Oliveto & Hager (2005). The rest of the dataset, about 48% of the total (22, 205, and 38 rows from Kothyari (1989), Chang *et al.* (2004), and Oliveto & Hager (2005), respectively), is used for testing.
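The bookkeeping of this source-wise split can be sketched as follows; the placeholder arrays and the assumption that the leading rows of each source form the training portion are illustrative only:

```python
import numpy as np

# Placeholder arrays sized like the three source datasets (rows x 6 features)
sources = {
    "Kothyari (1989)": (np.zeros((46, 6)), 24),          # (data, n_train)
    "Chang et al. (2004)": (np.zeros((438, 6)), 233),
    "Oliveto & Hager (2005)": (np.zeros((71, 6)), 33),
}

train_parts, test_parts = [], []
for name, (data, n_train) in sources.items():
    train_parts.append(data[:n_train])   # assumption: leading rows go to train
    test_parts.append(data[n_train:])    # remaining rows go to test

train = np.vstack(train_parts)  # 290 rows in total
test = np.vstack(test_parts)    # 265 rows in total
```

Splitting each source separately, rather than the pooled dataset, keeps every source represented in both the train and test sets.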

In the end, the overall dataset is split into 290 rows for training and 265 rows for testing. Regarding sediment uniformity, the dataset includes both homogeneous and heterogeneous sediments. For homogeneous sediment, the value of *σ*_{g} is less than 1.5 (Chiew 1984; Chiew & Melville 1989). Due to the diverse range of *σ*_{g}-values employed in this study, they encompass both homogeneous and heterogeneous sediments (see Table 2). Table 2 shows the range of the values of the dimensionless parameters for each dataset source used in this work. A total of six experimental datasets were collected from Kothyari (1989), 13 from Chang *et al.* (2004), and six from Oliveto & Hager (2005). All these experiments were conducted in laboratory flumes with different pier diameters, flow velocities, and various environmental setups.

Dataset type . | Parameters . | Kothyari (1989) . | Chang et al. (2004) . | Oliveto & Hager (2005) . | |||
---|---|---|---|---|---|---|---|

Minimum . | Maximum . | Minimum . | Maximum . | Minimum . | Maximum . | ||

Train | 0.332 | 0.948 | 1.496 | 3.012 | 0.023 | 1.705 | |

158.537 | 414.634 | 140.845 | 140.845 | 35.484 | 35.484 | ||

2.48 | 3.488 | 2.117 | 4.426 | 0.059 | 2.617 | ||

1.4 | 1.4 | 1.2 | 3 | 2.15 | 2.15 | ||

2.473 | 4.599 | −0.002 | 4.532 | 2.937 | 4.138 | ||

2.638 | 4.705 | −0.145 | 4.257 | 3.274 | 4.341 | ||

2.265 | 4.473 | 0.03 | 4.432 | 2.335 | 4.017 | ||

0.076 | 0.99 | 0 | 1.41 | 0 | 0.587 | ||

No. of experiments | 3 | 7 | 3 | ||||

No. of rows data | 24 | 233 | 33 | ||||

Serial no. in dataset | 267 | 290 | 34 | 266 | 1 | 33 | |

Test | 0.332 | 1.038 | 1.496 | 3.014 | 0.006 | 1.567 | |

91.549 | 280.488 | 140.845 | 140.845 | 35.484 | 35.484 | ||

2.48 | 3.488 | 1.597 | 3.314 | 0.042 | 2.342 | ||

1.4 | 1.4 | 1.2 | 3 | 2.15 | 2.15 | ||

2.634 | 4.768 | −0.002 | 4.532 | 2.741 | 4.054 | ||

2.792 | 4.709 | −0.145 | 4.257 | 3.032 | 4.942 | ||

2.482 | 4.725 | 0.03 | 4.432 | 2.289 | 3.946 | ||

0.102 | 0.866 | 0 | 1.42 | 0 | 0.557 | ||

No. of experiments | 3 | 6 | 3 | ||||

No. of rows data | 22 | 205 | 38 | ||||

Serial no. in dataset | 244 | 265 | 39 | 243 | 1 | 38 | |

Overall | 0.332 | 1.038 | 1.496 | 3.014 | 0.006 | 1.705 | |

91.549 | 414.634 | 140.845 | 140.845 | 35.484 | 35.484 | ||

2.48 | 3.488 | 1.597 | 4.426 | 0.042 | 2.617 | ||

1.4 | 1.4 | 1.2 | 3 | 2.15 | 2.15 | ||

2.473 | 4.768 | −0.002 | 4.532 | 2.741 | 4.138 | ||

2.638 | 4.709 | −0.145 | 4.257 | 3.032 | 4.942 | ||

2.265 | 4.725 | 0.03 | 4.432 | 2.289 | 4.017 | ||

0.076 | 0.99 | 0 | 1.42 | 0 | 0.587 | ||

No. of experiments | 6 | 13 | 6 | ||||

No. of rows data | 46 | 438 | 71 |


### Input combination

The choice of input parameters has a major influence on model efficacy. Applying the Buckingham pi theorem, five separate dimensionless parameters were obtained from the effective variables. Various combinations of input parameters were investigated to better understand how particular dimensionless factors affect the effectiveness of the model. The *R*-value (Table 3) was examined to identify the impact of each input parameter on the output parameter. Table 4 shows the combinations of input parameters based on the *R*-value: Input no. 1 uses the parameter with the highest *R*-value, log(*T*_{n}); each subsequent combination adds the parameter with the next highest *R*-value, down to Input no. 5, which also includes the parameters with the lowest *R*-values. This is a popular method for identifying the most effective input parameters.

. | . | . | . | . | . | . | . | . |
---|---|---|---|---|---|---|---|---|

*d*_{st}/*D*_{p} | 0.605 | 0.464 | 0.538 | −0.469 | 0.452 | 0.429 | 0.176 | 1 |


In statistical analysis, the correlation coefficient measures the strength and direction of the linear relationship between two variables. When combining inputs in a model, it is important to consider the correlation between the input and output parameters to avoid redundancy and overfitting. If two variables are highly correlated, meaning they have a correlation coefficient value close to 1 or −1, including both in the model leads to multi-collinearity issues. On the other hand, if two variables have a low correlation coefficient value, they might provide complementary information, and including both in the model could improve its accuracy. Therefore, the correlation coefficient value is used as a guide to decide whether to include or exclude variables when combining inputs in a model.
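A minimal sketch of this correlation-based screening on synthetic columns (the feature names and data below are hypothetical, not the study's actual arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
target = rng.normal(size=n)  # stand-in for the output parameter

# Hypothetical candidate inputs: one informative, one weakly related, one noise
features = {
    "logT": target * 0.9 + rng.normal(scale=0.3, size=n),
    "sigma_g": -0.5 * target + rng.normal(scale=1.0, size=n),
    "noise": rng.normal(size=n),
}

# Rank candidates by the absolute Pearson correlation with the target
r_values = {k: np.corrcoef(v, target)[0, 1] for k, v in features.items()}
ranking = sorted(r_values, key=lambda k: abs(r_values[k]), reverse=True)
```

Nested input combinations are then built by adding one feature at a time in `ranking` order, mirroring the Input no. 1 to Input no. 5 construction described above.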

The input variables with the highest impact on the time-dependent scour depth are listed in Table 3. The values of the Pearson correlation coefficient revealed that the logarithm of the dimensionless time, *T*_{n}, had the highest impact (*R* = 0.605 for *T*_{n} = *T*_{3}), followed by the sediment gradation (*R* = −0.469), the densimetric Froude number (*R* = 0.452), the relative flow depth (*R* = 0.429), and the inverse of the relative grain size (*R* = 0.176).

### Theoretical background

#### Extreme gradient boosting regressor (XGBR)

XGBR builds an additive ensemble of *K* regression trees over a dataset with *n* features, where *K* is the number of additive functions, *T* is the number of tree leaves, *q* is the tree structure, and *w* are the leaf weights. The trees are grown by additive training: one tree is added at a time. XGBR parameters such as the number of trees, depth, and iterations have the potential to cause overfitting. XGBR therefore uses L1 and L2 regularization to penalize model complexity and avoid the overfitting problem. In every iteration, we aim to minimize the regularized objective function shown in Equation (8), in which a loss function measures the discrepancy between the forecast and the target value, while a complexity function penalizes the complexity of the trees. In particular, the coefficients *α* and *β* are used to control the fitting of the method.
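As an illustration of this additive, one-tree-at-a-time training with a shrinkage factor, the sketch below uses scikit-learn's `GradientBoostingRegressor` as a stand-in for XGBR on synthetic data (the data, hyperparameters, and the stand-in library are assumptions, not the study's actual setup):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(42)
X = rng.uniform(size=(400, 3))
# Synthetic non-linear target standing in for d_st / D_p
y = np.log1p(X[:, 0]) + 0.5 * X[:, 1] ** 2 - 0.3 * X[:, 2]

# One shallow tree is added per boosting iteration (additive training);
# learning_rate shrinks each tree's contribution to curb overfitting.
model = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                  learning_rate=0.1, random_state=0)
model.fit(X[:230], y[:230])            # train on the leading rows
score = model.score(X[230:], y[230:])  # R^2 on held-out rows
```

XGBR additionally applies the explicit L1/L2 leaf-weight penalties described above; the shrinkage and depth limits shown here play an analogous regularizing role.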

#### Random forest regressor (RFR)

Given a response *Y* and inputs *X*, one can assume the training part is used individually. Any numerical estimator *h*(*x*) has a mean squared error (MSE) given by the expected squared deviation between *Y* and *h*(*X*). Averaging *K* trees {*h*(*x*, Θ_{k})} to form the random forest predictor, one obtains the mean of the individual decision trees, where Θ_{k} is the random vector, drawn from a uniform independent distribution, that selects the learning data before the *k*th decision tree is evaluated. All the decision trees are accommodated and averaged, so that an ensemble of decision trees *h*(*x*) is constructed using Equation (11).
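The averaging in Equation (11) can be checked directly: in the sketch below (synthetic data, illustrative settings), the forest prediction is recomputed by hand as the mean of the individual tree predictions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
X = rng.uniform(size=(300, 2))
y = X[:, 0] + np.sin(3 * X[:, 1])

# K = 50 trees, each fitted on a bootstrap resample of the training data
forest = RandomForestRegressor(n_estimators=50, bootstrap=True, random_state=0)
forest.fit(X, y)

# The ensemble prediction equals the average over the individual trees
x_query = X[:5]
per_tree = np.stack([t.predict(x_query) for t in forest.estimators_])
manual_mean = per_tree.mean(axis=0)
ensemble = forest.predict(x_query)
```

Averaging many de-correlated trees reduces the variance of any single tree without increasing its bias, which is the core argument behind Equation (11).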

#### Extra trees regressor (ETR)

Geurts *et al.* (2006) proposed the ETR algorithm, which builds on the RF method. It creates many unpruned decision trees from the training dataset and follows ensemble learning methods to address regression and classification problems. For classification, an ETR method selects the most frequent prediction among the decision trees, while for regression it takes the average of the decision tree outputs.

ETR differs from RFR in two significant ways. First, whereas RFR trains each tree on a bootstrap sample of the training dataset, ETR uses the entire training dataset to build each decision tree in the ensemble. Second, ETR splits tree nodes by choosing cut-points at random for the candidate features, instead of searching for the best split among the assigned features. Two settings are required: the number of randomly selected features at each decision point and the minimum sample size required to split a decision point.
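These two differences map onto scikit-learn's defaults, which makes them convenient to illustrate (synthetic data; settings are illustrative, not the study's):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 3))
y = 2 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2]

# RFR: bootstrap resampling + best-split search (bootstrap=True by default)
rf = RandomForestRegressor(n_estimators=100, random_state=0)
# ETR: full training set + random split thresholds (bootstrap=False by default)
et = ExtraTreesRegressor(n_estimators=100, random_state=0)

rf.fit(X[:200], y[:200])
et.fit(X[:200], y[:200])
rf_score = rf.score(X[200:], y[200:])
et_score = et.score(X[200:], y[200:])
```

The extra randomization of ETR's split thresholds trades a slightly higher bias for lower variance, which can help on noisy hydraulic datasets.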

### Model evaluation criteria

Once the most suitable input combinations have been identified, each method is trained and tested with the same input combination. Some plots are employed to visually inspect and evaluate each method's efficacy. The Taylor diagram has the special advantage of incorporating the most prevalent correlation statistics (Taylor 2001): scatter points show the statistical values of the various proposed methods, and the measured data point serves as a base point for comparison. The closer the scatter point of a forecast lies to the base point, the more robust the prediction efficacy. A box plot is useful because it shows whether a model underestimates, matches, or overestimates the observations.

Statistical metrics such as the correlation coefficient (*R*), Nash–Sutcliffe efficiency (NSE), agreement index (AI), mean absolute error (MAE), MSE, centered root mean squared error (CRMSE), and scatter index (SI) are used for model evaluation. Based on these metrics, the best model was selected. The mathematical representation of the model evaluation criteria is given below:

In the above expressions, the actual and forecast values of the time-dependent scour depth appear together with their respective means.
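A minimal sketch of these criteria, assuming their conventional definitions (which may differ in detail from the paper's exact expressions):

```python
import numpy as np

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def agreement_index(obs, pred):
    """Willmott's agreement index (AI), bounded above by 1."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    num = np.sum((obs - pred) ** 2)
    den = np.sum((np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - num / den

def crmse(obs, pred):
    """Centered RMSE: RMSE after removing each series' mean (bias)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))

def scatter_index(obs, pred):
    """Scatter index: RMSE normalized by the observed mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean((obs - pred) ** 2)) / obs.mean()
```

A perfect forecast gives NSE = 1, AI = 1, CRMSE = 0, and SI = 0, while NSE drops to 0 for a model that merely predicts the observed mean.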

## RESULTS

### Best input combination

As the ranges of the parameters vary greatly between their minimum and maximum values, we have used the min–max scaling technique to normalize the values and a box plot to visualize them.
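A minimal sketch of min–max normalization, mapping each feature column to [0, 1]:

```python
import numpy as np

def min_max_scale(X):
    """Scale each column of X to the [0, 1] interval."""
    X = np.asarray(X, float)
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return (X - xmin) / (xmax - xmin)

# Illustrative rows: two features with very different ranges
scaled = min_max_scale([[158.5, 2.48], [414.6, 3.49], [280.5, 3.0]])
```

Scaling per column prevents features with large numeric ranges from dominating features with small ones.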

Min–max normalization has been widely used in previous studies (*et al.* 2019; Kumar *et al.* 2022, 2023; Pattanaik *et al.* 2022). The *R*-value for each dimensionless input has already been shown in Table 3; a visual representation of the *R*-values with respect to *d*_{st}/*D*_{p} is provided in the spider plot shown in Figure 5. We assumed that a higher *R*-value implies a greater impact on forecasting the time-dependent scour depth, so we ordered the input parameters from the highest to the lowest *R*-value. In this manner, we obtained the five input combinations tabulated in Table 4. From this input sequence, three input models were derived according to the three different time scales log(*T*_{1}), log(*T*_{2}), and log(*T*_{3}) (Table 5).

| Models # | Functional relationship |
|---|---|
| Model 1 | |
| Model 2 | |
| Model 3 | |


### Data-driven methods efficacy

The three data-driven methods XGBR, RFR, and ETR were considered to estimate the time-dependent scour depth at piers. To evaluate their efficacy, six statistical metrics were applied, namely the previously defined *R*^{2}, *R*, NSE, AI, CRMSE, and SI. Table 6 shows the statistical results at the testing stage. The effectiveness of the proposed methods, particularly with Model 3, is notably favorable during the testing stage, marking it as the most promising result. To better highlight the best data-driven method for Model 3, Table 7 shows the related efficacy indices.

| Metric (rank) | XGBR Model 1 | XGBR Model 2 | XGBR Model 3 | RFR Model 1 | RFR Model 2 | RFR Model 3 | ETR Model 1 | ETR Model 2 | ETR Model 3 |
|---|---|---|---|---|---|---|---|---|---|
| *R*^{2} | 0.907 (3) | 0.913 (2) | 0.932 (1) | 0.898 (2) | 0.893 (3) | 0.917 (1) | 0.899 (2) | 0.895 (3) | 0.906 (1) |
| *R* | 0.952 (3) | 0.955 (2) | 0.966 (1) | 0.947 (2) | 0.945 (3) | 0.958 (1) | 0.948 (2) | 0.946 (3) | 0.952 (1) |
| NSE | 0.906 (3) | 0.912 (2) | 0.932 (1) | 0.893 (2) | 0.889 (3) | 0.913 (1) | 0.897 (2) | 0.892 (3) | 0.905 (1) |
| AI | 0.974 (3) | 0.976 (2) | 0.982 (1) | 0.969 (2) | 0.968 (3) | 0.976 (1) | 0.972 (2) | 0.970 (3) | 0.974 (1) |
| CRMSE | 0.111 (3) | 0.107 (2) | 0.095 (1) | 0.118 (2) | 0.12 (3) | 0.106 (1) | 0.116 (2) | 0.118 (3) | 0.111 (1) |
| SI | 0.245 (3) | 0.237 (2) | 0.209 (1) | 0.261 (2) | 0.267 (3) | 0.236 (1) | 0.256 (2) | 0.262 (3) | 0.247 (1) |
| Average rank | (3) | (2) | (1) | (2) | (3) | (1) | (2) | (3) | (1) |


Rank (1) was assigned for the best result and (3) for the worst one.

Comparisons in terms of *R*^{2} are also provided. Additional ML approaches, including GBR, M5P, WIHW-KStar, BA-KStar (algorithms of Bagging coupled with KStar), ABR, SVR, RS-KStar, and GPR, and some empirical models from the literature, including Johnson (1992) and Oliveto & Hager (2002), are also considered. It can be noted that the *R*^{2} values of the proposed methods (for Model 3) XGBR, RFR, and ETR are 0.93, 0.92, and 0.91, respectively, which are greater than the *R*^{2} values of the existing methods GBR, M5P, WIHW-KStar, BA-KStar, ABR, SVR, RS-KStar, and GPR (0.91, 0.90, 0.84, 0.84, 0.83, 0.82, 0.75, and 0.73, respectively) and, as expected, greater than those of the empirical equations by Oliveto & Hager (2002), Johnson (1992), Ettema *et al.* (2011), and Richardson & Davis (2001) (equal to 0.62, 0.35, 0.34, and 0.15, respectively), indicating the superior efficacy of the data-driven models considered in this study. The XGBR method with Model 3 exhibited the highest *R*^{2} value (i.e., 0.93).

In summary, based on the model evaluation criteria, the XGBR method showed superior efficacy compared with the RFR and ETR methods. More specifically, the efficacy of XGBR exceeded the efficacy of RFR, which in turn was better than ETR. The ensemble ML methods that utilize weak learners as base regressors demonstrate enhanced performance and become strong learners when their results are combined. The efficiency of these ensemble ML methods exceeds that of standalone ML methods, which are used as base regressors in ensemble ML methods. Thus, for practical purposes, it is recommended that the XGBR method be used in the prediction of time-dependent scour depth at piers.
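The variance-reduction effect of combining weak learners can be illustrated with a toy numpy experiment (purely illustrative: the "weak learners" here are synthetic predictors with independent errors, not trained regressors):

```python
import numpy as np

# Toy illustration: averaging many weak predictors reduces error variance.
rng = np.random.default_rng(42)
truth = np.sin(np.linspace(0.0, 3.0, 200))

# Each "weak learner" is the truth corrupted by independent Gaussian error
weak_preds = [truth + rng.normal(0.0, 0.5, truth.size) for _ in range(25)]

mse_single = np.mean((weak_preds[0] - truth) ** 2)
mse_ensemble = np.mean((np.mean(weak_preds, axis=0) - truth) ** 2)
# Independent errors partly cancel on averaging: the ensemble MSE shrinks
# roughly by a factor of 25 relative to a single weak predictor.
```

This is the core intuition behind why ensembles such as XGBR, RFR, and ETR outperform the standalone base regressors, provided the base learners' errors are not strongly correlated.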

In terms of qualitative analysis, Model 1, Model 2, and Model 3 differ only in their time scales (*T*_{1}, *T*_{2}, and *T*_{3}, respectively). As Figure 2 depicts, the scour depth continues to increase gradually after time scale *T*_{2} until time *T*_{3}, at which the equilibrium scour depth is attained. Thus, after time scales *T*_{1} and *T*_{2}, there is still scope for further scouring, and an accurate determination of the scour depth requires the time scale *T*_{3}. The same can be seen in the prediction models: Model 3, which contains *T*_{3}, provides the best results for all ML algorithms. The time scale *T*_{3} signifies that the equilibrium scour depth is attained and no further scouring occurs; as a result, it provides accurate scour predictions.

### Sensitivity analysis

The relevancy coefficient (*r*_{c}) was calculated for the dimensionless parameters to assess their respective influences on the dimensionless time-dependent scour depth *Z* = *d*_{st}/*D*_{p}. The relevancy coefficient is given by:

$$r_c = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Z_i-\bar{Z})}{\sqrt{\sum_{i=1}^{N}(X_i-\bar{X})^2\,\sum_{i=1}^{N}(Z_i-\bar{Z})^2}}$$

Here, *X* and *Z* represent the matrices of the training and target datasets, respectively. The *r*_{c} values for all the dimensionless parameters are provided in Table 3. The analysis reveals that log(*T*_{3}) has the highest relevancy coefficient value among all input parameters.
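Under the Pearson-type definition above, the relevancy coefficient of each input column can be computed as follows (the helper name is ours):

```python
import numpy as np

def relevancy_coefficients(X, z):
    """Relevancy coefficient r_c between each input column of X and the
    target z (Pearson form): covariance normalized by both spreads."""
    X, z = np.asarray(X, float), np.asarray(z, float)
    Xc = X - X.mean(axis=0)          # center each input column
    zc = z - z.mean()                # center the target
    return (Xc * zc[:, None]).sum(axis=0) / np.sqrt(
        (Xc ** 2).sum(axis=0) * (zc ** 2).sum())
```

A value near +1 or -1 indicates a strong (positive or negative) influence of that parameter on *Z*, while values near zero indicate weak relevance.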

In this sensitivity analysis, each dimensionless parameter was removed in turn and the model was retrained; the efficacy was evaluated in terms of *R*, MAE, and MSE as shown in Table 9. On removal of log(*T*_{3}), we get the lowest *R*-value and the highest MSE (*R* = 0.751, MSE = 0.057), so log(*T*_{3}) can be considered the most sensitive dimensionless parameter in the prediction of the time-dependent scour depth (*d*_{st}/*D*_{p}). In addition, the sediment gradation *σ*_{g}, i.e., the standard deviation of the distribution of sediment size, is the second most effective parameter in the prediction of time-dependent scour depths (*R* = 0.763, MSE = 0.056). Figure 10 displays the comparison of the *R*-values for all sensitivity analysis states for the training and testing stages.
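The leave-one-parameter-out procedure can be sketched as below; a plain least-squares model stands in for the trained XGBR, and the function name is hypothetical:

```python
import numpy as np

def drop_one_sensitivity(X, z):
    """Refit a simple model with each feature removed in turn; a larger
    increase in training MSE marks a more sensitive feature.
    (A linear least-squares model is a stand-in for the paper's XGBR.)"""
    z = np.asarray(z, float)

    def fit_mse(Xs):
        A = np.column_stack([Xs, np.ones(len(Xs))])  # add intercept column
        coef, *_ = np.linalg.lstsq(A, z, rcond=None)
        return np.mean((A @ coef - z) ** 2)

    base = fit_mse(np.asarray(X, float))
    return {j: fit_mse(np.delete(X, j, axis=1)) - base
            for j in range(np.shape(X)[1])}
```

The returned dictionary maps each feature index to its MSE increase upon removal, which reproduces the logic behind iterations A–E in Table 9.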

| Iteration | Removed parameter | *R* (training) | MAE (training) | MSE (training) | *R* (testing) | MAE (testing) | MSE (testing) |
|---|---|---|---|---|---|---|---|
| A | log(*T*_{3}) | **0.808** | **0.140** | **0.048** | **0.751** | **0.173** | **0.057** |
| B | *σ*_{g} | 0.834 | 0.140 | 0.043 | 0.763 | 0.166 | 0.056 |
| C |  | 0.999 | 0.005 | 0.000 | 0.960 | 0.069 | 0.010 |
| D |  | 0.999 | 0.005 | 0.000 | 0.956 | 0.075 | 0.011 |
| E |  | 0.999 | 0.005 | 0.000 | 0.963 | 0.068 | 0.009 |
| All | – | 0.999 | 0.005 | 0.000 | 0.966 | 0.064 | 0.009 |


Bold indicates the **worst** efficacy.

In another method of sensitivity analysis, we add Gaussian noise to the independent parameters one by one and then predict the time-dependent scour depth using this perturbed set of features. Gaussian noise, a concept foundational to both ML and signal processing, here means random perturbations drawn from a Gaussian distribution. Adding such noise introduces a controlled degree of unpredictability into the data, which may significantly affect an ML method's analysis and performance. In our method, we sequentially add 10% and 50% Gaussian noise to each column (parameter). This causes a certain amount of disruption to the data, which helps us test how well the ML methods can handle new data.

When analyzing how Gaussian noise affects ML technique performance, it is common to evaluate the ML method's ability to process noisy inputs and see if it can continue to produce reliable predictions in spite of the increased unpredictability. This procedure requires an understanding of the ML method's robustness to noisy input and its generalization beyond clean training data. By purposefully adding Gaussian noise in a methodical way, we are able to track changes in the ML method's behavior, such as variances in different performance indicators.

In sensitivity analysis, Gaussian noise is crucial because it lets us examine how changes in input parameters impact a model's or system's output. Random fluctuations are introduced into these parameters via Gaussian noise, which has a normal distribution. This replicates measurement errors or uncertainty present in the actual data. By methodically changing input parameters within a certain range and seeing the ensuing changes in output, analysts can ascertain the model's sensitivity to each parameter. This study aids in identifying the components that most affect the behavior of the model. Researchers can also test the model's resilience by subjecting it to various degrees of Gaussian noise. This allows them to determine whether the model consistently produces accurate results, even in the face of input variability.

These discoveries provide important insights into how the ML approach responds to varying noise levels and can aid in devising strategies to improve the system's robustness or more accurately tune its parameters. Moreover, studying Gaussian noise in the context of ML issues improves understanding of overfitting, ML method generalization, and trade-offs between resilience and complexity. Table 10 presents the results of our performance analysis with Gaussian noise augmentation.
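The noise-injection procedure can be sketched as follows (the function name and the choice of scaling the noise by each column's standard deviation are our assumptions, since the paper does not detail how the 10% and 50% levels are defined):

```python
import numpy as np

def noise_robustness(model_predict, X, z, col, scale_frac, seed=0):
    """Perturb one input column with Gaussian noise whose standard deviation
    is a fraction (e.g. 0.10 or 0.50) of that column's own standard
    deviation, then re-evaluate the correlation of target vs. prediction."""
    rng = np.random.default_rng(seed)
    Xn = np.array(X, dtype=float)            # work on a copy
    Xn[:, col] += rng.normal(0.0, scale_frac * Xn[:, col].std(), size=len(Xn))
    return np.corrcoef(z, model_predict(Xn))[0, 1]
```

Repeating this for each column at both noise levels reproduces the iteration structure of Table 10: the more the correlation drops under noise, the more sensitive the model is to that parameter.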

| Iteration | Noise added in | *R* (10% noise) | MAE (10% noise) | MSE (10% noise) | *R* (50% noise) | MAE (50% noise) | MSE (50% noise) |
|---|---|---|---|---|---|---|---|
| A |  | **0.938** | **0.091** | **0.016** | **0.826** | **0.146** | **0.042** |
| B |  | 0.961 | 0.071 | 0.010 | 0.866 | 0.123 | 0.034 |
| C |  | 0.961 | 0.071 | 0.010 | 0.958 | 0.073 | 0.011 |
| D |  | 0.955 | 0.076 | 0.012 | 0.956 | 0.077 | 0.011 |
| E |  | 0.962 | 0.072 | 0.010 | 0.960 | 0.069 | 0.010 |
| All | – | 0.966 | 0.064 | 0.009 | 0.966 | 0.064 | 0.009 |


Bold indicates the worst efficacy.

## DISCUSSION

The scouring of bridges occurs due to the erosive action of flowing water, which erodes and carries away sediment from the stream bed, banks, and the vicinity of bridge piers and abutments. The flow mechanism around a pier structure is so intricate that devising a universally applicable empirical model for an accurate estimation of scour proves to be a challenging task. In the literature, several empirical formulae based on classical regression analysis have been proposed to calculate the scour depth and its time variation. These formulas show that the scour depth depends on pier diameter, approach flow velocity, flow depth, median particle size, sediment gradation, and time, among other factors. Empirical formulas produce good results for the specific data they were derived from, but their efficacy degrades in different environments, and they do not perform as well as ML methods (Najafzadeh *et al.* 2018b). However, few studies have examined ensemble ML methods to forecast the time-dependent scour depth. The objective of this study was to assess and compare the accuracy of novel ensemble methods against established hybrid methods, standalone methods, and empirical equations.

One of the most important aspects of creating a reliable ML method is choosing an appropriate input variable combination. The *R*-value of input and output variables can be used to find an optimal input combination (Pattanaik *et al.* 2022; Shakya *et al.* 2022a, 2022b). There are other available methods for determining the best input combination, but in this study, the sequence of *R*-value appeared suitable for identifying the most precise performing input combination. Moreover, a sensitivity analysis was performed to explore the impact of each dimensionless parameter on the time-dependent scour depth computation in order to find the most sensitive parameters. According to the findings, the log of the dimensionless time and the sediment gradation had the largest influence.

All the proposed data-driven methods were found to provide satisfactory prediction results. XGBR has the advantage of being an ensemble: its structure combines weak learners into a strong one, thereby providing higher accuracy. Due to their increased flexibility over standalone methods and their use of weak learners, the ensemble methods demonstrated higher prediction efficacy than the standalone methods. As seen in this study, the non-linearity of the relationships between the variables was better captured by the ensemble methods. The comparison with the existing ML approaches shows that standalone ML methods cannot handle this non-linearity as well, even though they perform better than the empirical equations proposed in the literature. The GPR technique had the poorest efficacy since it makes predictions using data from all samples and feature sets; in high-dimensional spaces, the GPR approach becomes less effective, especially when there are more than a few dozen features.

The efficacy disparity between the methods is due to their differing computing architectures. The XGBR method produced the most precise predictions for three reasons. First, the XGBR method tunes the number of boosting rounds by cross-validation or a percentage-split strategy. Second, it exploits the power of parallel processing, making its computation extremely fast. Third, the XGBR structure benefits from ensemble learning (many base regressors), which performs better than a single regressor (Dietterich 1997). The ensemble technique helps reduce variance and avoid the overfitting induced by the use of a bootstrap method. From the scatter plots, ensemble learning methods outperform the empirical equations as well as the standalone ML methods in terms of *R*^{2}.

Overall, this study showed the great potential of the XGBR method with Model 3 to generate a reliable prediction of the time-dependent scour depth at piers. This method has the advantage of requiring only five dimensionless variables: log(*T*_{3}), *σ*_{g}, F_{d}, *y*/*D*_{p}, and *D*_{p}/*d*_{50}. Therefore, engineers can use our proposed ML method to accurately predict the local scour depths while constructing bridges, weirs, spur dikes, and cofferdams.

Future research should examine how well these methods predict the scour depth in situations more challenging than the datasets used in this study, for example, live-bed scour, non-straight channels, unsteady flows, and water-worked beds that better resemble the surface topography of natural coarse-grained rivers.

## LIMITATIONS OF THE CURRENT STUDY

The forecast of time-dependent scour depth is a challenging phenomenon due to the large number of elements involved. In this study, we forecast the time-dependent local scour at piers using: the mean sediment grain size (*d*_{50}), sediment gradation (*σ*_{g}), average approach flow velocity (*U*), approach flow depth (*y*), pier diameter (*D*_{p}), and time (*t*). Apart from these, factors such as pier aspect ratio, soil conditions, and skew angle of the pier are likely to influence time-dependent scour depth forecasts. As our dataset lacked these characteristics, our proposed models did not account for their impact. Furthermore, the parameter range is critical in ML model training. While our dataset is compiled from various experiments, there are instances where the ranges of input parameters may surpass the values considered. During such occurrences, the performance of the proposed model might be compromised. It is worth noting that these concerns are common in many ML-based models, as the characteristics of the dataset heavily influence their training.

## CONCLUSIONS

Accurate prediction of time-dependent scour depth is critical for preventing in-channel structural collapse. Because of the scour problem's non-linear nature, ensemble ML methods have a high potential for producing precise predictions of time-dependent scour depth at piers. This study tested the potential of ensemble ML methods utilizing pre-existing time-dependent scour depth data collected in the laboratory. The proposed scheme includes three ensemble methods, XGBR, RFR, and ETR, and eight existing methods, GBR, M5P, WIHW-KStar, BA-KStar, ABR, SVR, RS-KStar, and GPR, as well as four empirical equations proposed by Oliveto & Hager (2002), Johnson (1992), Ettema *et al.* (2011), and Richardson & Davis (2001) for comparison purposes.

Here are the main outcomes:

(1) The XGBR with Model 3 is the most accurate method among all the proposed ones.

(2) The ensemble methods had the best prediction efficacy, followed by standalone ML methods and empirical equations.

(3) The XGBR method had the best prediction efficacy for forecasting the time-dependent scour depth at piers, followed by RFR, ETR and GBR (tied), M5P, WIHW-KStar, BA-KStar, ABR, SVR, RS-KStar, GPR, and the empirical equations proposed by Oliveto & Hager (2002), Johnson (1992), Ettema *et al.* (2011), and Richardson & Davis (2001).

(4) All methods underestimate the maximum time-dependent scour depth. Only RFR could predict the maximum time-dependent scour depth with some accuracy, although it still needed improvement. However, most of the methods could predict the minimum time-dependent scour depth, which is encouraging.

(5) Sensitivity analysis for the best method (the XGBR method with Model 3) found that log(*T*_{3}) is the most sensitive parameter.

The results showed that the ensemble ML methods are superior to empirical equations and traditional ML methods for forecasting time-dependent scour depth at piers. For accuracy and convenience, the XGBR method is the best of all the proposed ensemble methods. As a result, engineers building in-channel structures may find this approach helpful in estimating the maximum scour depth.

## CREDIT AUTHORSHIP CONTRIBUTION STATEMENT

S.K.: Conceptualization, Methodology, Writing – original manuscript. G.O.: Data Processing, Reviewing and Editing. V.D.: Conceptualization, Reviewing. M.A.: Resources, Supervision, Modeling, Writing – review and editing. U.R.: Reviewing and Revising.

## FUNDING

This research received no external funding.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Evaluating Scour at Bridges*, 5th edn. Vol. Hydraulic Engineering Circular No. 18. U.S. Department of Transportation, Federal Highway Administration, Washington, D.C

*Scour Around Bridge Piers*, Ph.D. thesis.

*Evaluating Scour at Bridges*, 4th edn. Vol. Hydraulic Engineering Circular No. 18. U.S. Department of Transportation, Federal Highway Administration, Washington, D.C.

**34**, 15481–15497. doi:10.1007/s00521-022-07237-x