## Abstract

Scouring is the process by which bed sediment in a river is eroded around a bridge abutment or pier. Many empirical models are available to estimate the scour depth for different flow, geometry, and bed roughness conditions. However, none of them estimates scour depth reliably over a wide range of input parameters. Thus, in this paper, the scour depth around bridge piers is modelled using the M5 tree and hybrid artificial neural network (ANN)-particle swarm optimisation (PSO) techniques, considering a wide range of datasets. The clear-water scouring (CWS) datasets were collected from the literature, and five non-dimensional influencing parameters were selected as input parameters to model the scour depth. A Gamma test (GT) was performed to choose the best input parameter combinations. Based on the lowest gamma value and *V*-ratio, 4 out of 26 distinct input combinations for CWS depth modelling were chosen in the GT. According to statistical measures, the proposed M5 tree model predicts scour depth better than empirical approaches. Additionally, the developed ANN-PSO model is suitable for determining scour depth around both rectangular and circular piers. The results of both developed models are compared with other existing models and found to be satisfactory.

## HIGHLIGHTS

A wide dataset range is considered in developing a model for scour depth around the bridge pier.

Influencing parameters affecting the scour depth are identified using the Gamma test.

Separate models have been proposed for clear-water scouring around bridge piers using the M5 tree and ANN-PSO.

## NOMENCLATURE

*b* Pier diameter

*d*_{50} Median diameter of sediment

*F*_{r} Froude number

*F*_{d50} Densiometric Froude number

*F*_{rc} Critical Froude number

*g* Acceleration due to gravity

*U* Approach mean velocity

*U*_{c} Critical velocity

*y* Flow depth

*σ*_{g} Geometric standard deviation of bed sediment size

*ν* Kinematic viscosity of water

*ρ* Density of water

*E* Nash–Sutcliffe efficiency

*I*_{d} Index of agreement

*N* Number of neurons

*R*^{2} Coefficient of determination

- ANN: Artificial neural network

- CWS: Clear-water scouring

- GT: Gamma test

- PSO: Particle swarm optimisation

- RMSE: Root-mean-square error

- SDR: Scour depth ratio

## INTRODUCTION

*in situ*. However, during the event, the river current is unpredictable, and discharge can change quickly. Because the maximum flood discharge does not last long enough for scour to develop fully, the scour depth predicted for the peak flood discharge using steady-flow relations is excessively deep. As a result, hydraulic foundation design that is based on maximum scour depth estimates for the design discharge may be inaccurate. Figure 1 depicts the scour process, which creates a scour hole and wake vortices downstream of the bridge pier.

*et al.* (1992) investigated the temporal change of scouring in steady and unsteady clear-water flows. Scour depth around bridge piers is a key theme in hydraulic engineering, and scouring causes many bridge failures each year. Bridge collapses can be caused by congestion, crashes, or inadequate maintenance, but one of the most common causes is scouring. Researchers have therefore looked for measures to protect bridges from erosion around the piers. The scour depth must be predicted for a bridge to remain stable; however, sinking the bridge pier deeply into the bed is an expensive proposition. Scour protection systems act by reducing the strength of both the down-flow and the horseshoe vortex, increasing the ability of the bed to withstand flow-induced shear stress, preventing the vortex from settling on an impenetrable hard surface, and preventing the vortex from growing in size. Bridge pier scour must be precisely estimated to ensure the safety and effectiveness of bridges across natural rivers. A variety of factors influence bridge pier scour, including channel and bridge design, watershed characteristics, flow hydraulics, bed components, channel protection devices, channel stability, debris, and others. There are three types of scour at a bridge, according to Richardson & Davis (2001): aggradation or degradation, contraction scour, and local scour. Local scour is the removal of bed material from around piers, abutments, spurs, and embankments. It is caused by the formation of horseshoe vortices at the base of piers. The horseshoe vortex forms due to the accumulation of water on the upstream face of the obstacle, which increases shear stress and, in turn, the sediment transport capacity of the flow. This increased transport capacity removes the bed material near the base of the pier.
As a result, a scour hole forms around the bridge pier as more material is removed from the base region. The flow, pier geometry, and river bed material characteristics largely control the temporal variation of scour and the maximum depth of scour at the bridge pier. Calculating the local scour equilibrium depths around bridge piers accurately is critical for bridge hydraulic design. Figure 2 depicts the collapse of a bridge pier in Taiwan in 2008 as a result of local scouring. The majority of methods for estimating local scour at bridge piers rely on dimensional analysis of small-scale laboratory tests with non-cohesive, uniform bed material under steady flow conditions (McIntosh 1989; Mueller & Wagner 2005). A considerable number of laboratory-derived equations are available in the literature (Mueller & Wagner 2005).

In recent years, many artificial intelligence (AI) techniques have been applied to river sedimentation problems. Sun *et al.* (2021) developed a new hybrid model of support vector regression (SVR) and the fruitfly optimisation algorithm for the prediction of scour geometry at a ski-jump spillway. Sharafati *et al.* (2021) evaluated the efficiency of sediment ejectors using hybrid models and found that the artificial neural network (ANN)-particle swarm optimisation (PSO) model has greater potential to predict the removal efficiency of sediment dischargers. Tao *et al.* (2021) reviewed various AI modelling tools for river sediment transport problems and discussed their merits and demerits for different flow conditions. For field conditions, equations derived from laboratory research do not usually yield reliable results (Jones 1984; Baranwal *et al.* 2023a, 2023b). Because of the scale effect, most studies are conducted with uniform, steady flow and non-cohesive bed materials, so laboratory settings oversimplify or neglect the intricacies of natural rivers. As a result, laboratory flume-based scour depth models overestimate the scour depth reported at bridge piers (Mueller & Wagner 2005). ANNs and other soft computing approaches have been utilised for estimating the scour depth at bridge piers, and several empirical equations have been used as benchmarks to evaluate the efficacy of these methods. A majority of studies have shown that the neural network technique outperforms empirical relationships. However, for various civil engineering problems, the M5 model tree (MT) approach has been found to work as well as or better than neural network approaches (Bhattacharya & Solomatine 2003; Solomatine & Siek 2004; Pal & Deswal 2009).
Najafzadeh & Giuseppe (2021) developed non-linear regression equations using evolutionary polynomial regression (EPR), gene-expression programming (GEP), multivariate adaptive regression spline (MARS), and M5 MT approaches and found that the M5 tree provided better results.

In the present study, a CWS depth prediction model is developed using M5 tree and ANN-PSO techniques. While developing the model, a wide range of datasets, both experimental and field data, were considered. A Gamma test (GT) was used to determine the optimum input combinations.

## RESEARCH BACKGROUND

Scour around bridge piers was first identified by Kothyari *et al.* (1992), and subsequent research by Dey *et al.* (1995) confirmed these findings. Mia & Nago (2003) installed a cylindrical pier in regular beds under clear-water flows and used a sediment transport equation to predict the local scour depth with time; the equilibrium local scour depth was obtained when the bed-shear stress in the scour hole approached the critical bed-shear stress. Lee & Sturm (2009) conducted experiments to determine the effect of relative sediment size on pier scour depth using three uniform sediment sizes and three bridge pier designs at different geometric model scales. The effect of relative sediment size on dimensionless pier scour depth was analysed by filtering the data with a Froude number criterion. Das *et al.* (2013) used an ultrasonic Doppler velocimeter in an experimental investigation of the equilibrium scour hole at a circular bridge pier. The horseshoe vortex is an important phenomenon for researchers studying scour around circular piers. The circulation around a pier is affected by several factors such as the densiometric Froude number, flow shallowness, and pier Reynolds number. An increase in densiometric Froude number leads to an increase in circulation, whereas an increase in flow shallowness leads to a decrease in circulation. Similarly, at a constant densiometric Froude number or constant flow shallowness, circulation increases with the pier Reynolds number. To learn more about how scour depths change over time near circular piers, researchers have undertaken experimental studies. Aksoy *et al.* (2017) compared experimental data with those obtained from empirical equations in the literature and proposed a new empirical relationship including flow intensity, relative water depth, and dimensionless time.
Using formulations based on the MT, EPR, and GEP, Najafzadeh *et al.* (2018) described the current conceptual relationships for determining the local scour depth in rectangular channels. The MT method was found to be superior to GEP, EPR, and conventional equations for predicting scour depth. In experiments conducted by Chavan *et al.* (2018), the scour depth was larger at circular piers than at rectangular piers, and a larger wake vortex was seen downstream of circular piers. According to Pandey *et al.* (2020), the creation of an armour layer in a non-uniform gravel bed condition is what causes the scour hole. Overall, laboratory and field investigations by many researchers have shown that the shape of the pier, flow depth, approach flow velocity, and bed sediment size have a major impact on the scour depth. Hamidifar *et al.* (2021) investigated and verified different equations to estimate the critical flow velocity ratio, which was further used in estimating scour depth. They proposed better combinations of scour depth predictive models for clear-water scouring (CWS) conditions. The ability of various data-driven models (DDMs) to predict pier scour depth was evaluated by Qaderi *et al.* (2021). These DDMs included the support vector machine (SVM), the adaptive neuro-fuzzy inference system (ANFIS), the ANN, the GEP, the improved group method of data handling (GMDH1 and GMDH2), and two hybrid forms of the group method of data handling (GMDH) combined with harmony search (HS). Other work has combined the prediction abilities of three well-known machine learning (ML) methods, namely Gaussian process regression (GPR), random forest (RF), and M5 model tree (M5Tree), with a novel least-squares (LS) boosting ensemble committee-based data-intelligent technique. Garg *et al.* (2022) performed experiments around bridge piers in a tandem arrangement for clear-water conditions and developed a scour depth model using multivariable regression analysis.
Nil & Das (2022, 2023) developed a scour predictive model using the SVM technique, considering the geometry, flow, and bed sediment parameters for clear-water and live-bed scouring conditions.

### Existing scour depth predictive equations for CWS condition

*et al.* (1969) used the geometrical parameter (*b*/*y*), velocity, and kinematic viscosity (*ν*) in their scour depth predictive equation (Equation (3)), whereas Coleman (1971) used the geometrical parameter (*b*/*y*) and Froude number (*F*_{r}) as input parameters in the clear-water scour depth equation (Equation (4)). Hancu (1971) used the critical Froude number (*F*_{rc}) in the proposed Equation (5), where *U*_{c} is the critical velocity, *b* is the pier width, and *g* is the acceleration due to gravity. Colorado State University (1975) developed Equation (6) for scour depth, which is based on a laboratory dataset. Kim *et al.* (2015) and Pandey *et al.* (2018) used the median bed sediment size (*d*_{50}) as one of the input parameters in predicting scour depth in Equations (7) and (8), respectively.

## METHODOLOGY

### Identification of influencing parameters for scour depth modelling

*et al.* (2015); Pandey *et al.* (2018) describe a functional relationship that characterises scour depth, where *ρ* is the fluid density, *μ* is the fluid dynamic viscosity, *U* is the approach mean velocity, and *U*_{c} is the critical velocity. Data were collected from the previous literature (Chabert & Engeldinger 1956; Shen *et al.* 1969; Hancu 1971; Jain & Fischer 1979; Melville & Chiew 1999; Mia & Nago 2003; Sheppard *et al.* 2004; Mueller & Wagner 2005; Raikar & Dey 2005; Ettema *et al.* 2006; Lee & Sturm 2009; Lança *et al.* 2013; Aksoy *et al.* 2017; Khan *et al.* 2017; Ebrahimi *et al.* 2018; Garg *et al.* 2022) to develop a better scour depth model. The most influential non-dimensional parameters that significantly affect the scour depth ratio (*d*_{s}/*y*) are *b*/*y*, *U*/*U*_{c}, *F*_{r}, *d*_{50}/*b*, and *σ*_{g}, the five inputs tabulated in Table 1.
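As a minimal sketch, the five non-dimensional inputs can be computed from raw measurements as shown here. The function name `scour_inputs`, the SI units, and the example values are hypothetical, and the Froude number is assumed to follow the usual open-channel definition *U*/√(*gy*), which may differ in detail from the form used in the source datasets.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2 (SI units assumed)


def scour_inputs(b, y, U, U_c, d50, sigma_g):
    """Return the five non-dimensional inputs as a dictionary.

    b: pier width (m), y: flow depth (m), U: approach mean velocity (m/s),
    U_c: critical velocity (m/s), d50: median sediment size (m),
    sigma_g: geometric standard deviation of the bed sediment (-).
    """
    return {
        "b/y": b / y,
        "U/Uc": U / U_c,
        "Fr": U / math.sqrt(G * y),  # assumed definition of the flow Froude number
        "d50/b": d50 / b,
        "sigma_g": sigma_g,
    }


# Illustrative values only, not taken from any of the cited datasets.
example = scour_inputs(b=0.1, y=0.2, U=0.3, U_c=0.4, d50=0.001, sigma_g=1.3)
```

Each record in the compiled database would yield one such dictionary, with the scour depth ratio *d*_{s}/*y* as the corresponding target.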

### Collection of datasets from previous literature

The maximum scour depth to flow depth ratio was chosen as an output parameter during the development of this model, and five non-dimensional characteristics were chosen as input parameters. In the current investigation, 942 datasets for clear-water bridge pier scouring have been collected from the previously published literature. 75% (708 data) and 25% (234 data) of the datasets are used for training and testing, respectively. The details of datasets with non-dimensional parameter ranges collected from previous work of CWS have been presented in Table 1.

S. No. | Author | b/y | U/U_{c} | Fr | d_{50}/b (10^{2}) | σ |
---|---|---|---|---|---|---|
1 | Chabert & Engeldinger (1956) | 0.142–1.5 | 0.98–1.22 | 0.24–0.54 | 0.3–6 | 1.24–2.57 |
2 | Shen et al. (1969) | 0.57–1.49 | 1.03–1.43 | 0.21–0.32 | 0.5–3.2 | 1.08–1.16 |
3 | Jain & Fischer (1979) | 0.21 | 1.19 | 0.526 | 4.9 | 1.51 |
4 | Melville & Chiew (1999) | 0.05–5.08 | 0.69–1.00 | 0.12–0.88 | 1.3–7.6 | 1.01–3.54 |
5 | Mia & Nago (2003) | 0.14–1.50 | 0.70–0.90 | 0.12–0.41 | 0.2–6.0 | 1.12–1.58 |
6 | Sheppard et al. (2004) | 0.08–5.35 | 0.75–1.21 | 0.07–0.38 | 0.2–7.2 | 1.21–1.51 |
7 | Mueller & Wagner (2005) | 0.04–10.0 | 0.14–8.11 | 0.03–0.82 | 0.2–0.8 | 1.19–19.29 |
8 | Raikar & Dey (2005) | 0.13–0.31 | 0.95 | 0.48–0.71 | 5.3–44.5 | 1.08–1.16 |
9 | Ettema et al. (2006) | 0.064–0.41 | 0.80 | 0.15 | 0.2–1.0 | 1.30 |
10 | Lee & Sturm (2009) | 0.13–0.32 | 0.57–1.07 | 0.14–0.61 | 0.9–1.2 | 1.01–3.64 |
11 | Das et al. (2013) | 0.63–1.67 | 0.58–0.82 | 0.22–0.41 | 0.7–1.6 | 1.80 |
12 | Lança et al. (2013) | 0.5–5.00 | 0.28–0.35 | 0.26–0.32 | 0.2–1.7 | 1.36 |
13 | Aksoy et al. (2017) | 0.21–1.07 | 0.45–0.56 | 0.26–0.33 | 0.1–2.6 | 1.39 |
14 | Khan et al. (2017) | 0.28–3.33 | 0.16–2.08 | 0.05–1.38 | 0.7–1.8 | 1.12–2.84 |
15 | Ebrahimi et al. (2018) | 0.38–0.62 | 0.48–0.85 | 0.21–0.44 | 2.7 | 1.01–3.4 |
16 | Farooq & Ghumman (2020) | 0.22–0.41 | 0.66–0.74 | 0.21–0.22 | 0.7–1.9 | 1.17–1.2 |
17 | Pandey et al. (2020) | 0.47–1.35 | 0.03–0.08 | 0.71–0.98 | 1.51–1.88 | 0.43–0.71 |
18 | Garg et al. (2022) | 0.36–0.39 | 0.9 | 0.16–0.17 | 0.4 | 2.51 |
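The 708/234 division of the 942 records into training and testing sets could be reproduced along the following lines. This is only a sketch: the source does not state how records were assigned to each subset, so the random shuffle, the seed, and the function name `split_indices` are assumptions.

```python
import numpy as np


def split_indices(n_samples, n_train, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    return order[:n_train], order[n_train:]


# 942 records in total: 708 for training, the remaining 234 for testing.
train_idx, test_idx = split_indices(942, 708)
```

Fixing the seed keeps the split reproducible, so that different models are compared on the same testing records.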


### Model development

To choose the combination of best input parameters for modelling the scour depth in the M5 tree and ANN-PSO techniques, the GT was performed.

### Gamma test

*V*-ratio, which returns a scale-invariant noise estimate between 0 and 1 (Agalbjorn *et al.* 1997; Das *et al.* 2021; Choudhary *et al.* 2023). During the GT, a total of 26 experiments were performed, and the top 10 experiments are shown in Table 2. The GT is used to find the four best combinations of input parameters. The *V*-ratio is defined as follows:

$$V_{\text{ratio}} = \frac{\Gamma}{\sigma^{2}(y)}$$

where *Γ* is the gamma statistic and *σ*^{2}(*y*) is the variance of the output (*y*). Because it is normalised by the output variance, the *V*-ratio is a useful figure for comparing outputs from different sources, as it is not sensitive to large differences in their magnitude. When the *V*-ratio approaches zero, the output is highly predictable by a smooth model. When the *V*-ratio is close to one, the output behaves like random noise as far as a smooth model is concerned. The GT can be carried out using the Win-Gamma programme (Durrant 2001). According to the authors, this method is highly effective and may be applied to other non-linear modelling problems involving hydraulics. The *V*-ratio thus gauges how well the available input data can predict a given output. The input dataset with the lowest *Γ* and *V*-ratio values is considered best for modelling. For CWS, the four combinations are: (a) all inputs, (b) all inputs without *σ*, (c) all inputs without Fr, and (d) all inputs without Fr and *σ*.

S. No. | Combination of input parameters | Γ | Std. error | V-ratio | Mask |
---|---|---|---|---|---|
1 | b/y, F_{r}, d_{50}/b | 0.0391 | 0.0036 | 0.1563 | 10110 |
2 | b/y, U/U_{c}, F_{r} | 0.0368 | 0.0044 | 0.1473 | 11100 |
3 | U/U_{c}, F_{r}, d_{50}/b | 0.0364 | 0.0074 | 0.1457 | 01111 |
4 | b/y, F_{r}, d_{50}/b, σ | 0.0324 | 0.0047 | 0.1296 | 10111 |
5 | b/y, U/U_{c}, σ | 0.0313 | 0.0040 | 0.1253 | 11001 |
6 | b/y, U/U_{c}, F_{r}, σ | 0.0299 | 0.0037 | 0.1198 | 11101 |
**7** | **b/y, U/U_{c}, F_{r}, d_{50}/b** | **0.0257** | **0.0047** | **0.1031** | **11110** |
**8** | **b/y, U/U_{c}, F_{r}, d_{50}/b, σ** | **0.0226** | **0.0046** | **0.0906** | **11111** |
**9** | **b/y, U/U_{c}, d_{50}/b, σ** | **0.0186** | **0.0034** | **0.0747** | **11011** |
**10** | **b/y, U/U_{c}, d_{50}/b** | **0.0167** | **0.8555** | **0.0045** | **11010** |


Bold values indicate the best input-parameter combinations, selected on the basis of low gamma value and *V*-ratio; these combinations are used for modelling.
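A rough sketch of how a gamma statistic and *V*-ratio might be computed for one input mask is given here. It assumes the commonly described near-neighbour formulation of the GT, in which half the mean squared output difference is regressed on the mean squared input distance over the first *p* neighbours and the intercept is taken as *Γ*. The function names, the choice `p=10`, and the synthetic data are illustrative assumptions; this is not the Win-Gamma implementation used in the study.

```python
import numpy as np


def gamma_test(X, y, p=10):
    """Return (Gamma, slope, V-ratio) from a near-neighbour Gamma test."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Pairwise squared Euclidean distances between input vectors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)               # a point is not its own neighbour
    order = np.argsort(d2, axis=1)[:, :p]      # indices of the p nearest neighbours
    rows = np.arange(n)[:, None]
    delta = d2[rows, order].mean(axis=0)       # mean squared input distance per k
    gamma = 0.5 * ((y[order] - y[:, None]) ** 2).mean(axis=0)
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept, slope, intercept / y.var()


def apply_mask(X, mask):
    """Keep the columns of X flagged '1' in a mask string such as '11010'."""
    keep = [i for i, bit in enumerate(mask) if bit == "1"]
    return X[:, keep]


# Synthetic check: a smooth function of two inputs plus noise of variance 0.04.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0.0, 1.0, (600, 2))
y_demo = np.sin(2 * np.pi * X_demo[:, 0]) + X_demo[:, 1] + rng.normal(0.0, 0.2, 600)
gamma_hat, slope_hat, v_ratio = gamma_test(X_demo, y_demo)
```

On such synthetic data the intercept should land near the injected noise variance, which is the property that makes the lowest *Γ* and *V*-ratio a reasonable criterion for ranking the masks in Table 2.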

#### Statistical indices

*P*_{i} is the predicted value, *O*_{i} is the observed value, and *n* is the number of samples. The RMSE is the square root of the mean squared error of the scour depth ratio (*d*_{s}/*y*) estimated by the M5 tree model. The *R*^{2} statistic measures the extent to which the variation in one variable is explained by the variation in another when making predictions. The equation for *R*^{2} is given as:
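For illustration, the four indices can be sketched in code as below, assuming their commonly used definitions: *R*^{2} as the squared Pearson correlation, *E* as the Nash–Sutcliffe efficiency, and *I*_{d} as Willmott's index of agreement. These definitions and the function names are assumptions and may differ in detail from the forms used in the study.

```python
import numpy as np


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def r_squared(obs, pred):
    """Coefficient of determination, taken here as the squared Pearson correlation."""
    return float(np.corrcoef(obs, pred)[0, 1] ** 2)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe efficiency E: 1 minus error variance over observed variance."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))


def index_of_agreement(obs, pred):
    """Willmott's index of agreement I_d."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    denom = np.sum((np.abs(pred - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return float(1.0 - np.sum((obs - pred) ** 2) / denom)


# Illustrative observed and predicted scour depth ratios (not study data).
obs_demo = [1.0, 2.0, 3.0, 4.0]
pred_demo = [1.1, 1.9, 3.2, 3.8]
```

A perfect model gives RMSE = 0 and *R*^{2} = *E* = *I*_{d} = 1, so lower RMSE and higher values of the other three indicate a better fit.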

### M5 tree

*T* represents the set of examples that reach the node, and *T*_{i} represents the subset of examples that have the *i*th outcome of the potential test. M5 considers all possible splits and picks the one with the highest expected error reduction. This partitioning often produces a very large tree-like structure that is prone to overfitting. To prevent overfitting, the tree should be pruned back, for example by replacing a subtree with a leaf. Therefore, in the second stage of MT design, the tree is pruned to remove excess branches and linear regression functions are substituted for the subtrees. Because the parameter space is divided into subspaces, a linear regression model is built for each subspace.

The main improvements of the M5 tree come in after the initial tree has been grown. The M5 tree frequently needs to estimate the predictive accuracy of a model on unseen data. First, the residual of a model on a case is defined as the absolute difference between the target value of the case and the model prediction. The M5 tree estimates the error of a model by calculating its average residual over the training cases. Since this tends to underestimate the error on unseen cases, the M5 tree multiplies the value by (*n* + *v*)/(*n* − *v*), where *n* is the number of training examples and *v* is the number of model parameters. The effect is to increase the estimated error of models with many parameters constructed from small numbers of cases.

Using standard regression techniques, a multivariate linear model is built for the cases at each node of the MT. However, this model does not use all attributes; it uses only those that appear in tests or linear models somewhere in the subtree at this node. Since M5 compares the accuracy of a subtree with that of a linear model, this restriction ensures a fair comparison in which the two types of models use the same information. Each linear model obtained is then simplified by eliminating parameters to lower its estimated error. Although the average residual rises when parameters are eliminated, the multiplicative factor falls, so the estimated error can decrease. M5 uses a greedy search to remove variables that contribute little to the model, and it can sometimes remove all variables and leave only a constant. Each non-leaf node of the MT is then examined, starting near the bottom and working upwards, and M5 selects either the simplified linear model or the model subtree, whichever has the lower estimated error. If the linear model is chosen, the subtree at this node is pruned to a leaf.

Smoothing has been found to improve the prediction accuracy of tree-based models. To predict the value of a case, an MT takes the value given by the model at the appropriate leaf and adjusts it using the predicted values at each node along the path from the root to that leaf. Results for M5 and its competitors on a variety of learning tasks are available for comparison. To compare the efficiency of various approaches, one can calculate the relative error, which is the variance of the residuals divided by the variance of the target values. Another helpful statistic is the correlation between observed and predicted values (although a correlation coefficient only indicates a linear link between the two).
Learning model trees from large datasets is as efficient as learning regression trees. The capacity of model trees to take advantage of local linearity in the data allows them to outperform regression trees in terms of both compactness and prediction accuracy. Another important difference is that model trees can extrapolate, while regression trees will never predict a value outside the range observed in the training cases.
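The split selection and the error inflation described above can be sketched as follows. The sketch assumes the usual M5 criterion, in which the expected error reduction of a split is the standard deviation of the target values at the node minus the size-weighted standard deviations of the subsets, and the inflation factor (*n* + *v*)/(*n* − *v*). The function names and the toy data are hypothetical, and pruning and smoothing are not shown.

```python
import numpy as np


def sd_reduction(y, left_mask):
    """Standard deviation of T minus the size-weighted deviations of its subsets."""
    y = np.asarray(y, dtype=float)
    subsets = (y[left_mask], y[~left_mask])
    return y.std() - sum(len(s) / len(y) * s.std() for s in subsets)


def best_split(x, y):
    """Try every midpoint between sorted unique x values; return (threshold, gain)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    values = np.unique(x)
    best_t, best_gain = None, -np.inf
    for lo, hi in zip(values[:-1], values[1:]):
        t = 0.5 * (lo + hi)
        gain = sd_reduction(y, x <= t)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


def inflated_error(residuals, n_params):
    """Average absolute residual multiplied by (n + v) / (n - v)."""
    residuals = np.asarray(residuals, dtype=float)
    n = len(residuals)
    return float(np.mean(np.abs(residuals)) * (n + n_params) / (n - n_params))


# Toy data with a step between x = 3 and x = 4.
threshold, gain = best_split([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
```

On the toy data the split falls at the step, where both subsets become constant and the whole standard deviation of the node is removed. For ten residuals of magnitude 1 and a two-parameter model, the inflated error is 1 × 12/8 = 1.5, which shows how small samples penalise complex leaf models.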

### ANN-PSO

ANNs, or networks of artificial neurons, are computational models used in data processing. They are used to model non-linear relationships between inputs and outputs in statistical data, and they are often applied where conventional methods have failed. Without requiring an explicit physical description of the problem, an ANN can be regarded theoretically as a universal approximator that learns from examples. ANNs are built from a collection of simple, parallel components modelled on the neural network of the human brain, and they can readily process data in parallel while representing non-linear functions with simple structures. The basic building blocks of an ANN are simple processing units (neurons) that exchange information via analogue signals transmitted along weighted connections. An ANN has three main layers: input, hidden, and output. Data enter the network as signals through the input layer, and neurons in the hidden layer process the data received from the input layer. Connectivity between neurons in neighbouring layers is mediated by weights. The activation function of each neuron is applied to its input signal to generate an output signal, which is then multiplied by the weight of the connection to the next neuron. For a network to learn, the weights must be adjusted as the training patterns are presented.

PSO is a population-based optimisation technique developed by Kennedy & Eberhart (1995), Kennedy *et al.* (2001), and Kennedy (2010), who were motivated by social behaviours of animals such as the flocking of birds and the schooling of fish. The method has many similarities with genetic algorithms and other forms of evolutionary computing, but it differs from a genetic algorithm in that it does not use evolutionary operators such as inversion and crossover. PSO performs well in many applications, such as neural network training, function optimisation, and fuzzy control systems. Each 'particle' in PSO stands for a possible answer to the optimisation problem, and the fitness function of the problem quantifies the merit of each candidate solution. The algorithm begins with a random population of particles, each of which has a position, a velocity, and a fitness value. Each particle draws on its own memory and on the knowledge of the swarm: during movement, it moves towards the position that benefits it most, based on its own experience and that of its neighbours. The fitness value therefore determines the direction and speed with which a particle moves towards the best solution. The system iteratively shifts the solutions through the search space until the optimum is reached.
The best position each particle has achieved so far is called its 'local best', whereas the global best is the best position found by any particle in the swarm. In the velocity update, rand1 and rand2 are random numbers between 0 and 1; C1 and C2 are the individual and social learning rates, which are typically equal to 2; and *w* is the inertia weight, which is used to strike a balance between global and local exploration by the particles. Like other evolutionary algorithms such as the genetic algorithm (GA), PSO seeks an optimal solution within a population-based framework.
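A compact sketch of how PSO might be used to train a small ANN is shown here. It assumes the standard velocity update built from the inertia weight `w`, the learning rates `c1` and `c2`, and two uniform random numbers, together with a single hidden layer of tanh neurons. The network layout, default parameter values, function names, and synthetic data are illustrative assumptions and not the configuration used in the study.

```python
import numpy as np


def pso_minimise(fitness, dim, n_particles=20, n_iter=200,
                 w=0.6, c1=1.5, c2=1.5, bounds=(-1.0, 1.0), seed=0):
    """Minimise `fitness` with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))    # positions (initial range only)
    v = np.zeros((n_particles, dim))               # velocities
    pbest = x.copy()                               # best position of each particle
    pbest_val = np.array([fitness(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()         # best position of the swarm
    g_val = float(pbest_val.min())
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = x + v
        val = np.array([fitness(p) for p in x])
        improved = val < pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = val[improved]
        if pbest_val.min() < g_val:
            g_val = float(pbest_val.min())
            g = pbest[np.argmin(pbest_val)].copy()
    return g, g_val


def make_ann_loss(X, y, n_hidden):
    """Return (loss, dim): the MSE of a one-hidden-layer tanh network whose
    weights and biases are packed into a single flat particle vector."""
    n_in = X.shape[1]
    dim = n_in * n_hidden + 2 * n_hidden + 1

    def loss(theta):
        i = n_in * n_hidden
        W1 = theta[:i].reshape(n_in, n_hidden)     # input-to-hidden weights
        b1 = theta[i:i + n_hidden]                 # hidden biases
        W2 = theta[i + n_hidden:i + 2 * n_hidden]  # hidden-to-output weights
        b2 = theta[-1]                             # output bias
        pred = np.tanh(X @ W1 + b1) @ W2 + b2
        return float(np.mean((pred - y) ** 2))

    return loss, dim


# Sanity check on a simple quadratic bowl, then a toy ANN fit on synthetic data.
sphere_best, sphere_val = pso_minimise(lambda p: float(np.sum(p ** 2)), dim=2,
                                       bounds=(-5.0, 5.0))
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.05, 0.95, (50, 4))
y_demo = 0.3 * X_demo[:, 0] + 0.5 * X_demo[:, 1] ** 2
ann_loss, ann_dim = make_ann_loss(X_demo, y_demo, n_hidden=5)
ann_weights, ann_val = pso_minimise(ann_loss, ann_dim)
```

Other learning rates, such as the typical value of 2 mentioned above, can be passed through `c1` and `c2`. Larger values combined with a high inertia weight may need velocity clamping to keep the swarm from diverging.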

## RESULTS AND DISCUSSIONS

### Clear-water scour depth modelling using M5 tree

*et al.* 1995), Equation (17) has been utilised to bring the input and output data to the domain of (0.05, 0.95):

$$a_{\text{norm}} = 0.05 + 0.9\,\frac{a - a_{\min}}{a_{\max} - a_{\min}} \qquad (17)$$

where *a*_{norm} is the normalised input, *a* is the original input, *a*_{min} is the minimum of the input range, and *a*_{max} is the maximum of the input range.

*R*^{2} value is 0.834 for the overall dataset, showing a better predictive model as compared to the other M5 tree models. Additionally, a scatter plot of the observed vs. predicted SDR of the newly developed M5 tree model is presented in Figure 3(a) to check the variation of the scour depth ratio around the best-fit line. Figure 3(a) shows that the M5 model overpredicts the SDR (*d*_{s}/*y*) for some data in the range of 0.25–1.5 and underpredicts the SDR for some data in the range of 1.5–3.7.
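The rescaling of inputs and outputs to the domain (0.05, 0.95) mentioned above can be sketched as a linear min-max mapping. The function names are hypothetical, and the sketch assumes that the minimum and maximum are taken from the data being normalised.

```python
import numpy as np


def normalise(a, lo=0.05, hi=0.95):
    """Map values linearly so that min(a) -> lo and max(a) -> hi."""
    a = np.asarray(a, dtype=float)
    a_min, a_max = a.min(), a.max()
    return lo + (hi - lo) * (a - a_min) / (a_max - a_min)


def denormalise(a_norm, a_min, a_max, lo=0.05, hi=0.95):
    """Invert the mapping to recover values in the original units."""
    a_norm = np.asarray(a_norm, dtype=float)
    return a_min + (a_norm - lo) * (a_max - a_min) / (hi - lo)


# Illustrative values only.
raw = np.array([0.0, 5.0, 10.0])
scaled = normalise(raw)
restored = denormalise(scaled, raw.min(), raw.max())
```

Predictions made on the normalised scale would be passed through `denormalise` before being compared with observed scour depth ratios.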

Minimum instances | Number of rules | Training R^{2} | Training E | Training I_{d} | Training RMSE | Testing R^{2} | Testing E | Testing I_{d} | Testing RMSE | All data R^{2} | All data E | All data I_{d} | All data RMSE |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
**100** | **8** | **0.847** | **0.872** | **0.963** | **0.289** | **0.806** | **0.851** | **0.953** | **0.388** | **0.834** | **0.865** | **0.961** | **0.316** |
150 | 8 | 0.831 | 0.864 | 0.961 | 0.312 | 0.821 | 0.853 | 0.953 | 0.394 | 0.837 | 0.854 | 0.963 | 0.334 |
200 | 1 | 0.721 | 0.743 | 0.921 | 0.364 | 0.634 | 0.733 | 0.912 | 0.523 | 0.667 | 0.743 | 0.921 | 0.434 |
250 | 1 | 0.654 | 0.712 | 0.913 | 0.432 | 0.635 | 0.724 | 0.897 | 0.534 | 0.645 | 0.716 | 0.913 | 0.467 |
300 | 1 | 0.657 | 0.716 | 0.914 | 0.435 | 0.636 | 0.723 | 0.897 | 0.535 | 0.647 | 0.715 | 0.914 | 0.463 |
350 | 1 | 0.654 | 0.712 | 0.913 | 0.435 | 0.637 | 0.729 | 0.897 | 0.535 | 0.645 | 0.714 | 0.915 | 0.468 |
400 | 2 | 0.674 | 0.735 | 0.917 | 0.425 | 0.656 | 0.742 | 0.913 | 0.525 | 0.675 | 0.737 | 0.917 | 0.448 |


Several models were tried with different settings. Bold marks the best M5 tree model among all the M5 tree predictive models.

### Clear-water scour depth modelling using ANN-PSO

In ANN-PSO modelling, the coefficients *C*_{1} and *C*_{2} were set to 1.5 and 2.5, respectively, after several trials. To assess the strength of the different models and evaluate the performance of existing models, two statistical indices, RMSE and the coefficient of determination (*R*^{2}), as well as two relative indices, *E* and *I*_{d}, were used. Table 4 shows the results of the error analysis for the training data, testing data, and the entire dataset for various swarm sizes and numbers of neurons (*N*).

| *N* | Swarm size | Training *R*^{2} | Training *E* | Training *I*_{d} | Training RMSE | Testing *R*^{2} | Testing *E* | Testing *I*_{d} | Testing RMSE | All data *R*^{2} | All data *E* | All data *I*_{d} | All data RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 10 | 0.824 | 0.845 | 0.963 | 0.325 | 0.778 | 0.835 | 0.947 | 0.415 | 0.816 | 0.825 | 0.957 | 0.328 |
| **5** | **20** | **0.844** | **0.871** | **0.963** | **0.291** | **0.789** | **0.846** | **0.954** | **0.394** | **0.829** | **0.863** | **0.961** | **0.318** |
| 5 | 30 | 0.841 | 0.867 | 0.965 | 0.294 | 0.787 | 0.845 | 0.956 | 0.407 | 0.821 | 0.861 | 0.961 | 0.335 |
| 5 | 40 | 0.835 | 0.862 | 0.961 | 0.329 | 0.783 | 0.843 | 0.952 | 0.404 | 0.817 | 0.853 | 0.956 | 0.337 |
| 5 | 50 | 0.832 | 0.865 | 0.945 | 0.291 | 0.775 | 0.832 | 0.941 | 0.393 | 0.813 | 0.861 | 0.954 | 0.327 |
| 8 | 10 | 0.815 | 0.844 | 0.953 | 0.323 | 0.765 | 0.823 | 0.932 | 0.423 | 0.798 | 0.845 | 0.956 | 0.355 |
| 8 | 20 | 0.823 | 0.855 | 0.924 | 0.323 | 0.771 | 0.834 | 0.945 | 0.414 | 0.743 | 0.792 | 0.943 | 0.398 |
| 8 | 30 | 0.835 | 0.867 | 0.965 | 0.385 | 0.766 | 0.835 | 0.957 | 0.413 | 0.816 | 0.842 | 0.953 | 0.348 |
| 8 | 40 | 0.816 | 0.837 | 0.954 | 0.335 | 0.745 | 0.813 | 0.947 | 0.445 | 0.783 | 0.837 | 0.952 | 0.367 |
| 8 | 50 | 0.815 | 0.844 | 0.952 | 0.322 | 0.767 | 0.823 | 0.951 | 0.421 | 0.795 | 0.835 | 0.951 | 0.356 |
| 10 | 10 | 0.833 | 0.862 | 0.961 | 0.297 | 0.751 | 0.815 | 0.945 | 0.397 | 0.813 | 0.842 | 0.934 | 0.335 |
| 10 | 20 | 0.813 | 0.832 | 0.952 | 0.323 | 0.772 | 0.826 | 0.943 | 0.424 | 0.796 | 0.845 | 0.952 | 0.356 |

Several models were tried with different settings. Bold marks the best ANN-PSO model among all the ANN-PSO predictive models.

In Table 4, for *N* = 5 and a swarm size of 20, the training values of *R*^{2}, *E*, *I*_{d}, and RMSE were found to be 0.84, 0.87, 0.96, and 0.29, respectively, indicating the best predictive model for scour depth among the ANN-PSO configurations tried. It was observed that, as the swarm size increased with the same values of *C*_{1} and *C*_{2} and a constant number of neurons, the values of *R*^{2}, *E*, and *I*_{d} decreased, while the value of RMSE increased. Figure 3(b) shows that the predicted scour depth values from the newly developed ANN-PSO model closely match the observed values within the dataset ranges, as indicated by the proximity of the data points to the best-fit line. Figure 3(b) also shows that the ANN-PSO model overpredicts the scour depth ratio (SDR, *d*_{s}/*y*) for some data in the range of 0.5–1.5 and underpredicts it for some data in the range of 1.3–2.5.
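The roles of *C*_{1}, *C*_{2}, the swarm size, and the number of neurons can be illustrated with a minimal sketch of PSO training a small network. It assumes that *C*_{1} and *C*_{2} are the usual cognitive and social acceleration coefficients. The inertia weight, velocity clamp, tanh hidden units, iteration count, and synthetic data are illustrative assumptions, not the settings or data of the present study.

```python
import math
import random


def predict(w, x, n_hidden):
    """Single-hidden-layer network (tanh units, linear output); w is a flat list."""
    n_in = len(x)
    out = w[-1]  # output bias
    for h in range(n_hidden):
        base = h * (n_in + 2)  # per unit: n_in input weights, bias, output weight
        z = w[base + n_in]
        for j in range(n_in):
            z += w[base + j] * x[j]
        out += w[base + n_in + 1] * math.tanh(z)
    return out


def rmse(w, X, y, n_hidden):
    sq = sum((predict(w, xi, n_hidden) - yi) ** 2 for xi, yi in zip(X, y))
    return math.sqrt(sq / len(y))


def pso_train(X, y, n_hidden=5, swarm_size=20, c1=1.5, c2=2.5,
              inertia=0.4, v_max=0.5, n_iter=100, seed=0):
    """Minimise the network RMSE with PSO; inertia and v_max are assumed values."""
    rng = random.Random(seed)
    dim = n_hidden * (len(X[0]) + 2) + 1
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_cost = [rmse(p, X, y, n_hidden) for p in pos]
    g = min(range(swarm_size), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    history = [gbest_cost]
    for _ in range(n_iter):
        for i in range(swarm_size):
            for d in range(dim):
                v = (inertia * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])  # cognitive pull
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))    # social pull
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] += vel[i][d]
            cost = rmse(pos[i], X, y, n_hidden)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:  # global best updated as soon as it improves
                    gbest, gbest_cost = pos[i][:], cost
        history.append(gbest_cost)
    return gbest, history


# Synthetic two-input data, used only to exercise the sketch.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(30)]
y = [0.5 * a + 1.5 * b for a, b in X]
weights, history = pso_train(X, y)
print(f"best RMSE: start {history[0]:.3f}, end {history[-1]:.3f}")
```

Rerunning `pso_train` with other `swarm_size` or `n_hidden` values gives a comparison in the spirit of Table 4, although the numbers from this toy problem are not comparable with those in the table.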

### Sensitivity analysis for CWS using M5 tree

To determine the effectiveness of the current model with a minimum of 100 instances, a sensitivity analysis was performed using the M5 model tree. Table 5 presents the results in terms of four statistical parameters: *R*^{2}, *E*, *I*_{d}, and RMSE. Model 5, which uses the four parameters other than *σ* as inputs, shows high accuracy, with *R*^{2}, *E*, *I*_{d}, and RMSE values of 0.84, 0.86, 0.95, and 0.42, respectively. However, the current model with all five non-dimensional parameters as inputs shows decreasing *R*^{2} and increasing RMSE as the number of instances increases, resulting in increased error. From Table 5, it can be observed that Model 6 has the lowest *R*^{2} value of 0.64, while the other models have *R*^{2} values close to or greater than 0.8. This indicates that *b*/*y* and Fr have a significant influence on the estimation of the scour depth ratio.

| Sensitivity analysis | *R*^{2} | *E* | *I*_{d} | RMSE |
|---|---|---|---|---|
| Model 1: d_{s}/y = f(Fr, d_{50}/y, b/y, U/U_{c}, σ) | 0.79 | 0.84 | 0.95 | 0.46 |
| Model 2: d_{s}/y = f(Fr, b/y, U/U_{c}, σ) | 0.79 | 0.86 | 0.96 | 0.47 |
| Model 3: d_{s}/y = f(d_{50}/y, b/y, U/U_{c}, σ) | 0.78 | 0.86 | 0.95 | 0.47 |
| Model 4: d_{s}/y = f(Fr, d_{50}/y, b/y, σ) | 0.79 | 0.85 | 0.95 | 0.46 |
| Model 5: d_{s}/y = f(Fr, d_{50}/y, b/y, U/U_{c}) | 0.84 | 0.86 | 0.95 | 0.42 |
| Model 6: d_{s}/y = f(Fr, d_{50}/y, U/U_{c}, σ) | 0.64 | 0.74 | 0.82 | 0.59 |

Based on the sensitivity analysis, Model 5 and Model 6 are the most sensitive models among all six M5 tree models.
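The leave-one-input-out reading of Table 5 can be sketched as a short ranking. The *R*^{2} values below are copied from Table 5, with Model 1 taken as the baseline. The function and variable names are hypothetical, and ranking by the change in *R*^{2} is only one simple way to summarise the table, not necessarily the procedure used in the study.

```python
# R^2 values copied from Table 5 (M5 tree); each key names the input left out.
R2_FULL = 0.79  # Model 1: all five inputs
R2_WITHOUT = {
    "d50/y": 0.79,  # Model 2
    "Fr": 0.78,     # Model 3
    "U/Uc": 0.79,   # Model 4
    "sigma": 0.84,  # Model 5
    "b/y": 0.64,    # Model 6
}


def rank_inputs(r2_full, r2_without):
    """Sort inputs by the change in R^2 when each is removed (largest loss first)."""
    changes = {name: r2 - r2_full for name, r2 in r2_without.items()}
    return sorted(changes.items(), key=lambda item: item[1])


ranking = rank_inputs(R2_FULL, R2_WITHOUT)
for name, change in ranking:
    print(f"without {name:6s}: R^2 change = {change:+.2f}")
```

With the Table 5 values, removing *b*/*y* gives the largest loss in *R*^{2}, while removing *σ* raises *R*^{2}, which matches the behaviour of Models 6 and 5 in the table.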

### Sensitivity analysis for CWS using ANN-PSO

In the ANN-PSO model, the input parameters *b*/*y* and Fr are important for modelling, as the former represents the geometric features and the latter represents the flow behaviour around the bridge pier. As the pier diameter increases, scouring also increases because the width of the obstruction increases. A Froude number (Fr) less than 1.0 represents subcritical flow, while a Froude number greater than 1.0 represents supercritical flow; these flow regimes significantly affect the scouring process. Table 6 illustrates this relationship in terms of four statistical parameters, *R*^{2}, RMSE, *E*, and *I*_{d}, for the ANN-PSO model. Model 5, with the four parameters other than *σ* as inputs, shows high accuracy for a swarm size of 20 and *N* = 5, with *R*^{2}, RMSE, *E*, and *I*_{d} values of 0.80, 0.33, 0.84, and 0.95, respectively. In the absence of *b*/*y*, Model 6 produces poor results, and Model 4 produces a lower value of *R*^{2} in the absence of Fr.

| Sensitivity analysis | *R*^{2} | *E* | *I*_{d} | RMSE |
|---|---|---|---|---|
| Model 1: d_{s}/y = f(Fr, d_{50}/y, b/y, U/U_{c}, σ) | 0.77 | 0.81 | 0.94 | 0.37 |
| Model 2: d_{s}/y = f(Fr, b/y, U/U_{c}, σ) | 0.80 | 0.84 | 0.95 | 0.34 |
| Model 3: d_{s}/y = f(d_{50}/y, b/y, U/U_{c}, σ) | 0.79 | 0.83 | 0.95 | 0.34 |
| Model 4: d_{s}/y = f(Fr, d_{50}/y, b/y, σ) | 0.73 | 0.85 | 0.95 | 0.39 |
| Model 5: d_{s}/y = f(Fr, d_{50}/y, b/y, U/U_{c}) | 0.80 | 0.84 | 0.95 | 0.33 |
| Model 6: d_{s}/y = f(Fr, d_{50}/y, U/U_{c}, σ) | 0.58 | 0.66 | 0.88 | 0.50 |

Based on the sensitivity analysis, Model 5 and Model 6 are the most sensitive models among all six ANN-PSO models.
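The subcritical and supercritical distinction mentioned above can be made concrete with a short sketch. It assumes the standard open-channel definition Fr = *U*/√(*g* *y*), built from the symbols in the nomenclature; the numerical values of *U* and *y* are hypothetical and are not taken from the collected dataset.

```python
import math

G = 9.81  # acceleration due to gravity, m/s^2


def froude_number(velocity, depth, g=G):
    """Fr = U / sqrt(g * y), assuming the standard open-channel definition."""
    return velocity / math.sqrt(g * depth)


def flow_regime(fr):
    """Classify the flow by comparing the Froude number with 1.0."""
    if fr < 1.0:
        return "subcritical"
    if fr > 1.0:
        return "supercritical"
    return "critical"


# Hypothetical approach flow: U = 0.5 m/s, y = 0.2 m.
fr = froude_number(0.5, 0.2)
print(f"Fr = {fr:.3f} ({flow_regime(fr)})")
```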

### Comparison of the presently developed model with the existing scour depth model

[…] Shen *et al.* (1969), Hancu (1971), Coleman (1971), CSU (1975), Kim *et al.* (2015), and Pandey *et al.* (2018) have been utilised to calculate scour depth for the collected dataset. The observed versus predicted scour depth ratios from these scour depth predictive equations are shown in Figure 4(a)–4(h).

[…] Shen *et al.*'s (1969) model predicted the SDR very poorly over the range of 0.05 to 3.8. Hancu's (1971) empirical model provides a better prediction of SDR in the range of 0.5–2.5, as shown in Figure 4(d). From Figure 4(e) and 4(f), it can be observed that the Coleman (1971) and CSU (1975) empirical models overpredicted the SDR for some datasets in the ranges of 0.4–3.7 and 0.3–3.5, respectively. Kim *et al.* (2015) used the standard deviation (*σ*), Froude number (*F*_{r}), and median particle size (*d*_{50}) to estimate the scour depth, and their model predicted the SDR better in the range of 0.05–1.0, as shown in Figure 4(g). Pandey *et al.*'s (2018) empirical model provides a better SDR value for the range of 0.10–3.0 owing to its consideration of the median bed sediment size (*d*_{50}) and the densiometric Froude number (*F*_{d50}). For a wide range of datasets, as considered in the present study, no existing model provides a good prediction of the scour depth ratio. It is observed from Figure 4(a)–4(h) that the *R*^{2} value of every existing scour depth model is less than 0.65 for the collected dataset, indicating the poor prediction capability of these models for the wide range of datasets used in the present research. A comparison plot of the SDR values obtained from the present models (i.e., M5 tree and ANN-PSO) and the eight scour predictive models is presented in Figure 5. Figure 5 shows that the SDR values predicted by the developed models lie closer to the best-fit line than those of the other scour depth predictive models, indicating the suitability of the present models for this wide range of datasets under CWS conditions. The M5 tree model also performs better than the developed ANN-PSO model, as most of the SDR values obtained from the M5 tree model lie close to the best-fit line. For practical application, the present M5 tree model can be used for data within the ranges given in Table 1.

[…] *R*^{2}, and *I*_{d}) was performed. For this, four different researchers' datasets were taken into consideration: Melville & Chiew (1999), Molinas (2004), Mueller & Wagner (2005), Lança *et al.* (2013), and Khan *et al.* (2017). The present models and the eight scour depth predictive equations, Equations (3)–(12), were used to calculate the error for the aforementioned researchers' datasets, and the error bar diagrams are depicted in Figure 6(a)–6(c). Figure 6(a) shows the RMSE plot, in which the M5 tree model shows the least error compared to ANN-PSO and the other existing models. Shen *et al.*'s (1969) model shows a high RMSE for the collected dataset, indicating the lower predictive ability of the model. This may be because the equation uses only the Reynolds number, expressed in terms of velocity, flow depth, and kinematic viscosity of water. For Melville & Chiew's (1999) dataset, all the models provide lower RMSE values except Shen *et al.* (1969), Hancu (1971), and Kim *et al.* (2015). In Figure 6(b), the M5 tree model shows a high *R*^{2} value of more than 0.80 for all four researchers' datasets. For Mueller & Wagner's (2005) dataset, the present M5 tree and ANN-PSO models provided high *R*^{2} values close to 0.90 and 0.70, respectively, while the other models provide an *R*^{2} value of less than 0.55. Kim *et al.*'s (2015) model provides poor results for the Lança *et al.* (2013) dataset but provides an *R*^{2} value of more than 0.80 for Khan *et al.*'s (2017) dataset. Shen *et al.*'s (1969) equation provides the lowest *R*^{2} value for Mueller & Wagner's (2005) dataset. Figure 6(c) depicts the Index of Agreement values for the different researchers' datasets. It is clear from Figure 6(c) that all the scour depth predictive models, including the two present models, show high *I*_{d} values for Melville & Chiew's (1999), Lança *et al.*'s (2013), and Khan *et al.*'s (2017) datasets. For Mueller & Wagner's (2005) dataset, the present M5 tree model shows a better *I*_{d} value of more than 0.90, while the other models provide an *I*_{d} value of less than 0.7. Overall, the present M5 tree model shows a lower error and a high *R*^{2} value for all the datasets taken into consideration.
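The three indices compared in Figure 6 can be sketched as follows. This is a minimal illustration under stated assumptions: *R*^{2} is taken as the squared Pearson correlation and *I*_{d} as Willmott's index of agreement, which are common definitions but may differ from the exact equations used in the study. The observed and predicted values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, assumed here to be the squared Pearson r."""
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


def index_of_agreement(obs, pred):
    """Index of agreement, assumed here to follow Willmott's definition."""
    mo = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mo) + abs(o - mo)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


# Hypothetical observed and predicted scour depth ratios.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(f"RMSE = {rmse(obs, pred):.3f}")
print(f"R^2  = {r_squared(obs, pred):.3f}")
print(f"I_d  = {index_of_agreement(obs, pred):.3f}")
```

Applying the same three functions to each researcher's dataset separately would give per-dataset values of the kind plotted in Figure 6(a)–6(c).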

## CONCLUSIONS

The current research utilised the M5 tree and ANN-PSO, two soft computing models, to create a scour depth model for CWS conditions near bridge piers. The main findings are as follows:

The GT revealed that the diameter of the pier, the depth of the flow, the Froude number, the approach flow velocity, the critical flow velocity, the geometric standard deviation of the bed sediment, and the median sediment size were the most important parameters for predicting scour depth at the pier.

Using a minimum of 100 instances in M5 tree modelling yielded better results than the other instance settings for the given datasets for predicting clear-water scour depth. The RMSE for predicting scour depth was 0.318, and the coefficient of determination was 0.82, using the ANN-PSO model with five neurons and a swarm size of 20. Compared with the ANN-PSO model, the M5 tree model was found to have superior overall performance. M5 tree modelling reduced the number of instances used during modelling and provided a higher value of *R*^{2}, producing better results.

Sensitivity analysis revealed that *b*/*y* and Fr were critical in determining scour depth, as they represent the geometric shape of the pier and the flow pattern around it, respectively.

Compared to the existing scour depth predictive models of different researchers, the present M5 model tree for CWS conditions shows good agreement with the observed scour depth, with an *R*^{2} value of 0.836 for the present dataset.

The limitation of the present study is the variety of datasets employed in the modelling of CWS. If the input parameter values fall within the ranges listed in Table 1 for CWS, the model will produce better results. To incorporate the effect of bridge pier shape and different flow conditions, and to increase model accuracy, more CWS datasets need to be considered. Furthermore, modelling of scour depth for live-bed scouring and temporal scouring conditions will help researchers and river engineers to design bridge piers efficiently and avoid their failure.

## DISCLOSURE STATEMENT

The authors disclosed no potential conflicts of interest.

## DATA AVAILABILITY STATEMENT

Data are available in the supplementary file.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Société hydrotechnique de France*

*Encyclopaedia of Machine Learning*(C. Sammut & G. I. Webb, eds.).

*In Proceedings of the Bridge Scour Symposium*, McLean, VA, Federal Highway Administration Research Report FHWA-RD-90-035, 78-100

*Report FHWA–RD-03-083*.

*Field Observations and Evaluations of Streambed Scour at Bridges*(No. FHWA-RD-03-052). United States Federal Highway Administration Office of Research, Development, and Technology

*River-Bed Scour.*Canadian Good Roads Assoc Tech Bull

*Live Bed Scour Depth Modelling Around Bridge Pier Using Support Vector Machine*. River Flow2022, University of Ottawa 23–25 November 2022

**50**(6), 445–463

*Hydraulic Engineering Circular No. 18*, NHI 01-001 HEC-18. Federal Highway Administration, US Department of Transportation,

Rep. Prepared for the Laboratoire National d'Hydraulique