Abstract
Scouring around piers, especially in cohesive bed materials, is a fully stochastic phenomenon, and reliable prediction of scour depth remains a challenge for bridge designers. This study introduces a new stochastic model that integrates the Group Method of Data Handling (GMDH) with Generalized Likelihood Uncertainty Estimation (GLUE) to predict scour depth around piers in cohesive soils. The GLUE approach is used to estimate the model coefficients, whereas the GMDH model provides the prediction. To assess the adequacy of the GMDH-GLUE model, the conventional GMDH and genetic programming (GP) models are also developed as benchmarks. Several statistical performance indicators are computed over both the training and testing phases to validate prediction accuracy. Based on these indicators, the proposed GMDH-GLUE model predicted pier scour depth more accurately than the benchmark models and several models reported in the literature. To provide an informative comparison among the techniques (i.e. GMDH-GLUE, GMDH, and GP models), an improvement index (IM) is employed. The GMDH-GLUE model achieved IM_Train = 6% and IM_Test = 3%, a satisfactory performance improvement over the previously proposed GMDH model.
INTRODUCTION
Since the 1950s, several field surveys have shown that scour development around piers is the main cause of bridge collapses across the globe and, owing to the importance of this issue, a large amount of research has been devoted to the pier scour phenomenon (Breusers et al. 1977; Chiew & Melville 1987; Ferraro et al. 2013). Briefly, construction of a bridge on a river alters the flow pattern, resulting in the development of two main types of vortices (i.e. horseshoe and wake vortices). Horseshoe vortices extend like a necklace at the base of the pier front, while the wake vortices develop and shed downstream of the pier. Both experimental and numerical studies of the turbulent flow field around piers have confirmed that the shape, dimensions and location of these vortices may vary significantly over time (Dargahi 1990; Kirkil et al. 2008). The scouring process initiates when such stochastic phenomena interact with the bed sediment. It is worth noting that sediment transport is also stochastic and can be estimated through statistical approaches (Dodaro et al. 2014, 2016). Therefore, it is reasonable to treat the scouring process as a whole as a complex stochastic phenomenon.
Complicated scour development is usually experienced when the bridge piers are found in cohesive soils. For granular non-cohesive soil particles, the submerged density of the soil particles and the gravity forces are the major resistance agents to the sediment motion, while in cohesive sediments the physico-chemical characteristics play a critical role. In fact, scouring occurs when the fluid shear on the erodible bed exceeds the physico-chemical forces between the cohesive bed particles and the submerged unit weight of the sediment particles (Rambabu et al. 2003). Therefore, scour depth around a pier in cohesive bed materials is associated with higher uncertainty than in non-cohesive ones (Chang et al. 2004; Firat & Gungor 2009).
Semi-empirical regression-based equations are the most common method of scour estimation at bridge piers. In these equations, the ultimate pier scour depth is defined as a function of flow properties and sediment characteristics (Breusers et al. 1977; Melville 1997; Arneson et al. 2012). In spite of extensive laboratory studies, the regression-based equations have not always provided promising predictions, owing to the complex flow field around the pier foundation and to unknown parameters influencing scour development (Gaudio et al. 2010, 2013; Tafarojnoruz 2012). This is a consequence of the highly stochastic nature of the scouring problem. Recently, artificial intelligence (AI) methodologies have been considered a reliable alternative to the conventional equations for complicated problems in which a simple equation may not represent the whole complexity of a phenomenon (Najafzadeh et al. 2017, 2018; Sharafati et al. 2019a, 2019c). For pier scour depth, researchers have already reported several AI models, such as artificial neural networks (ANNs) (Toth & Brandimarte 2011), genetic programming (GP) (Azamathulla et al. 2010), linear genetic programming (LGP) (Guven et al. 2009), group method of data handling (GMDH) (Najafzadeh & Barani 2011) and model tree (MT) (Etemad-Shahidi & Ghaemi 2011; Etemad-Shahidi et al. 2015), which performed better than the empirical equations. However, several limitations remain, in particular the intrinsically stochastic nature of the problem. The resulting prediction inaccuracies still need to be tackled by hydraulic scientists to improve predictions.
It is noteworthy that none of the above approaches takes into account the stochastic behavior of the scouring process. In effect, researchers have assumed that scour development is a deterministic phenomenon. On the contrary, owing to the random nature and complexity of the scouring process, the associated uncertainties may lead to an unavoidable risk in pier scour estimation during foundation design (Yanmaz 2002). Therefore, a probabilistic framework that evaluates the likelihood of reaching different scour depths is essential for estimating failure probabilities due to excessive scour depth (Johnson & Dock 1998). Uncertainties in the variables, which may stem from the difficulty of accurate measurement, should also be quantified in order to assess the risk of pier failure (Yanmaz 2001).
Quantifying the uncertainty of model parameters can be accomplished through several approaches. Among them, generalized likelihood uncertainty estimation (GLUE), developed in the 1990s (Beven & Binley 1992), has been adopted by several researchers for uncertainty analysis in hydraulics and water resource engineering. For example, it was recently used successfully to assess sediment yield in a watershed (Ayele et al. 2017), to analyze groundwater transport time (Zell et al. 2018) and to increase the accuracy of pipeline scour depth prediction (Sharafati et al. 2018). In this method, it is not necessary to maximize or minimize an objective function; instead, the information concerning the various parameter sets is derived from likelihood measures. In the following sections, this technique is described in detail and evaluated for the prediction of scour depth around a pier in cohesive bed materials.
This investigation aims to reanalyze and revise the coefficients and exponents of recently developed equations for scour depth at a cylinder founded in cohesive soils, using the stochastic GLUE methodology. The selected equations, as described in the following section, were proposed by Najafzadeh et al. (2013) based on the GMDH network. Moreover, a new equation is derived using the GP approach. The performance of the GP equation and the accuracy of the novel stochastic equation in estimating the scour depth are evaluated using common statistical indices over the training and testing phases.
METHODS
Influencing parameters and scour depth prediction formulations
In this study, an attempt is made to improve the traditional GMDH method for estimating the ultimate scour depth at a pier founded in cohesive bed materials by using a stochastic approach. To this end, a new hybrid model named GMDH-GLUE is introduced. Further, the accuracy of the newly developed model is checked by comparing it with the GP model. In this section, the GMDH and GP methods and their formulas are described first; then the GLUE approach and the GMDH-GLUE model are presented.
Description of GMDH based scouring model
The GMDH technique is a system identification methodology able to model and predict the behavior of a complicated system from given input–output data pairs (Ivakhnenko & Ivakhnenko 1995). Its principle is based on heuristic self-organization, using operations such as seeding, rearing, crossbreeding, and selection and rejection of seeds, which correspond to the input model parameters and to the model definition (Ivakhnenko & Ivakhnenko 2000; Amanifard et al. 2008; Najafzadeh & Tafarojnoruz 2016). The capability of this technique has been extensively assessed. For instance, the GMDH network was successfully developed to model basin sediment yield (Garg 2015), scour depth downstream of grade control structures (Najafzadeh 2015), stable channel design (Shaghaghi et al. 2017), the discharge coefficient of cylindrical weirs (Parsaie et al. 2018) and the scouring rate under pipelines due to waves (Najafzadeh & Saberi-Movahed 2018). These investigations and other similar studies support the merits of the GMDH technique in the estimation of scour depth around river and marine structures.
From Equations (9)–(14), it can be seen that Najafzadeh et al. (2013) employed four non-dimensional parameters (listed in Table 4) to predict the non-dimensional scour depth.
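As an illustration of the building block behind a GMDH network, the sketch below fits a single neuron with the quadratic two-input polynomial commonly used in GMDH, y = w0 + w1·xi + w2·xj + w3·xi·xj + w4·xi² + w5·xj², by ordinary least squares. The function names and the synthetic data are assumptions made for this example; they are not the coefficients or the data of Najafzadeh et al. (2013).

```python
import numpy as np


def design_matrix(xi, xj):
    """Columns of the quadratic two-input polynomial: 1, xi, xj, xi*xj, xi^2, xj^2."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])


def fit_neuron(xi, xj, y):
    """Least-squares estimate of the six weights of one neuron."""
    weights, *_ = np.linalg.lstsq(
        design_matrix(xi, xj), np.asarray(y, dtype=float), rcond=None
    )
    return weights


def predict_neuron(weights, xi, xj):
    """Evaluate the fitted polynomial for new inputs."""
    return design_matrix(xi, xj) @ weights


# Synthetic, noise-free illustration only (not the scour data set).
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, 50)
x2 = rng.uniform(0.0, 1.0, 50)
true_w = np.array([0.5, 1.0, -2.0, 0.3, 0.1, 0.7])  # hypothetical weights
y = design_matrix(x1, x2) @ true_w
w_hat = fit_neuron(x1, x2, y)
```

In a full network, such neurons are fitted for pairs of inputs, the best ones are kept according to an external criterion, and their outputs feed the next layer; that selection step is omitted here.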
Description of GMDH-GLUE based scouring model
To simplify the proposed model, the number of stochastic coefficients is reduced using sensitivity analysis, in which the impact of each coefficient on the model performance is measured. The sensitivity analysis showed that, among all the stochastic coefficients considered, nine exert a significant effect on the maximum scour depth. Therefore, these nine coefficients were estimated using the GLUE approach, and the remaining ones were kept at the values given by the Najafzadeh et al. (2013) formulas. In total, the proposed GMDH-GLUE model has nine stochastic coefficients. To estimate the prior probability distribution function (uniform distribution) of each retained coefficient, the cross-validation method was applied to the employed dataset. The parameters of the resulting prior probability distribution functions are presented in Table 1. The posterior probability distribution functions of the stochastic coefficients were then extracted by the GLUE approach using Equation (15) and the methodology presented in Figure 1.
| Scouring formula coefficient | PDF | Lower limit | Upper limit |
|---|---|---|---|
| | Uniform | 0.113826 | 0.150535 |
| | Uniform | –55.2 × 10⁻⁵ | –41.7 × 10⁻⁵ |
| | Uniform | 2.50 × 10⁻³ | 3.31 × 10⁻³ |
| | Uniform | 4.643478 | 6.141000 |
| | Uniform | –0.026680 | –0.020170 |
| | Uniform | –8.76 × 10⁻⁷ | –6.63 × 10⁻⁷ |
| | Uniform | –0.065900 | –0.049830 |
| | Uniform | –0.077630 | –0.058700 |
| | Uniform | 5.10 × 10⁻⁴ | 6.74 × 10⁻⁴ |
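The GLUE procedure described above (uniform priors, repeated simulation, likelihood weighting) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: the likelihood is the inverse error variance (one common informal choice), the best 10% of the parameter sets are retained as behavioural, and `toy_model` is a hypothetical one-coefficient stand-in for the GMDH equations. Only the prior bounds of the first coefficient come from Table 1.

```python
import numpy as np


def glue(model, x, y_obs, lower, upper, n_sim=10_000, keep_frac=0.1, seed=0):
    """Return behavioural parameter sets and their normalised likelihood weights."""
    rng = np.random.default_rng(seed)
    lower = np.atleast_1d(np.asarray(lower, dtype=float))
    upper = np.atleast_1d(np.asarray(upper, dtype=float))
    # 1) Draw parameter sets from independent uniform priors.
    theta = rng.uniform(lower, upper, size=(n_sim, lower.size))
    # 2) Run the model for every set and score it (assumed likelihood measure).
    sims = np.array([model(x, t) for t in theta])
    err_var = np.mean((sims - y_obs) ** 2, axis=1)
    likelihood = 1.0 / (err_var + 1e-12)
    # 3) Keep the behavioural sets (assumed threshold: best keep_frac of the runs).
    n_keep = max(1, int(keep_frac * n_sim))
    best = np.argsort(likelihood)[-n_keep:]
    weights = likelihood[best] / likelihood[best].sum()
    return theta[best], weights


def toy_model(x, theta):
    """Hypothetical one-coefficient model used only for this illustration."""
    return theta[0] * x


# Synthetic observations generated with a known coefficient of 0.13.
x = np.linspace(0.0, 10.0, 30)
noise = np.random.default_rng(1).normal(0.0, 0.01, x.size)
y_obs = 0.13 * x + noise

theta_post, w = glue(toy_model, x, y_obs, lower=[0.113826], upper=[0.150535])
posterior_mean = w @ theta_post  # likelihood-weighted mean of each coefficient
```

The weighted sample `(theta_post, w)` plays the role of the posterior distribution; its histogram is the kind of plot from which skewness and prediction bounds can later be read.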
Description of GP based scouring model
| Parameter | Description of parameter | Value of parameter |
|---|---|---|
| P1 | Maximum symbolic expression tree depth | 5 |
| P2 | Maximum symbolic expression tree length | 150 |
| P3 | Population size | 1,000 |
| P4 | Maximum generations | 50 |
| P5 | Mutation probability | 15% |
| P6 | Internal crossover point probability | 90% |
DATASETS DESCRIPTION
In this study, a total of 95 experimental data sets were collected from three published studies (Rambabu et al. 2003; Debnath & Chaudhuri 2010; Najafzadeh & Barani 2011). For the analysis and the derivation of new equations, 70% of the collected data (i.e. 67 data sets) were randomly selected for training the proposed predictive model and the remaining 30% (i.e. 28 data sets) were used for the testing phase. A frequency diagram of the employed dataset and its statistics in both the training and testing phases are shown in Figure 3 and Table 3, respectively. Furthermore, Table 4 presents the range of the dimensionless groups of the datasets.
| Statistic | Training stage | | | | | Testing stage | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | Cp (%) | (–) | IWC (%) | (–) | | Cp (%) | (–) | IWC (%) | (–) |
| Max | 1.91 | 100 | 1790.65 | 45.84 | 0.44 | 1.71 | 100 | 633.04 | 45.92 | 0.43 |
| Min | 0.15 | 20 | 9.23 | 10.70 | 0.07 | 0.17 | 34 | 11.61 | 20.20 | 0.08 |
| Ave | 0.86 | 59.21 | 131.89 | 30.38 | 0.30 | 0.79 | 59.79 | 76.68 | 32.88 | 0.28 |
| S.D.* | 0.48 | 22.86 | 333.57 | 8.15 | 0.12 | 0.40 | 22.55 | 142.15 | 5.81 | 0.11 |
| C.V.+ | 0.56 | 0.39 | 2.53 | 0.27 | 0.40 | 0.51 | 0.38 | 1.85 | 0.18 | 0.40 |
Note: *S.D.: Standard deviation. + C.V.: Coefficient of variation.
| Parameter | Range |
|---|---|
| Cp (%) | 20–100 |
| (–) | 9–1791 |
| IWC (%) | 11–46 |
| (–) | 0.07–0.44 |
DESCRIPTION OF EMPLOYED PERFORMANCE INDICES
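The indices reported in the results (RMSE, MAE and the correlation coefficient, CC) can be sketched with their standard textbook definitions, as below. Whether the authors used exactly these forms is an assumption, and the observed and predicted values are made-up numbers for illustration only.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))


def mae(observed, predicted):
    """Mean absolute error."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - observed)))


def cc(observed, predicted):
    """Pearson correlation coefficient between observations and predictions."""
    return float(np.corrcoef(observed, predicted)[0, 1])


# Made-up values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "CC": cc(observed, predicted),
}
```

Lower RMSE and MAE and a CC closer to 1 indicate better agreement; squaring CC gives the R² value used in scatterplot comparisons.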
APPLICATION RESULTS AND DISCUSSION
The main objective of this study is to improve on the research published by Najafzadeh et al. (2013), in which the scour depth around a pier in cohesive soil was predicted by the GMDH technique. Owing to the high stochasticity of the target variable in response to various random variables, a stochastic modeling strategy (i.e. GLUE) was integrated with GMDH as a tuning procedure. In addition to the proposed GMDH-GLUE model, an evolutionary computing GP model was developed for validation purposes.
Development of GMDH-GLUE model for estimating scour depth
To develop a GMDH-GLUE model for estimating the scour depth around a cylinder founded in cohesive sediments, the significant stochastic parameters, as shown in Equations (16)–(18), were employed as the basic stochastic variables. Uniform probability distributions were used as the prior distributions of these parameters (Table 1). The GLUE method was then employed to extract the posterior distributions of the variables by performing 10,000 simulations (Figure 4).
Figure 4 shows that the posterior distributions of two of the coefficients are positively skewed. Therefore, the performance of scour depth prediction increases when lower values of these coefficients are used. Conversely, the negative skewness of two other coefficients implies that the accuracy of scour depth prediction increases when larger values of those coefficients are used.
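The sign of the skewness referred to above can be checked directly on a posterior sample. The sketch below uses the moment-based sample skewness; the two small samples are invented for illustration and are not posterior samples from the study.

```python
import numpy as np


def sample_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    values = np.asarray(values, dtype=float)
    centred = values - values.mean()
    return float(np.mean(centred**3) / np.mean(centred**2) ** 1.5)


# Invented samples: most mass at low values with a long right tail, and its mirror image.
right_tailed = [1.0, 1.0, 1.0, 2.0, 5.0]
left_tailed = [-v for v in right_tailed]

skew_right = sample_skewness(right_tailed)  # positive: mass concentrated at lower values
skew_left = sample_skewness(left_tailed)    # negative: mass concentrated at higher values
```

A positive value means that the well-performing parameter sets cluster near the lower end of the prior range, which is the reasoning used to prefer lower values of a positively skewed coefficient.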
Assessment of GMDH-GLUE performance
Before the prediction stage, the uncertainty of the proposed GMDH-GLUE model was assessed over the training phase. The 95PPU (95% prediction uncertainty) and P-factor criteria were examined, as reported in Figure 4. Following Abbaspour et al. (2007), the 95PPU is considered acceptable when at least 50% of the observations fall within the 95PPU bound. Over the training phase (Figure 5), nine of the 67 observations fell outside the 95PPU bound, giving a P-factor of 87%. In general, GMDH-GLUE therefore provided a reliable modeling strategy during the training phase.
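The P-factor quoted above is the percentage of observations that fall inside the 95PPU band. A minimal sketch follows, assuming that the band is taken between the 2.5th and 97.5th percentiles of the simulated ensemble without likelihood weighting (GLUE implementations may weight the quantiles). The arrays are hypothetical and only reproduce the arithmetic of 9 out of 67 observations lying outside the band.

```python
import numpy as np


def ppu_band(simulations, level=95.0):
    """Lower and upper band limits for each observation.

    `simulations` has shape (n_parameter_sets, n_observations).
    Unweighted percentiles are an assumption of this sketch.
    """
    tail = (100.0 - level) / 2.0
    lower, upper = np.percentile(
        np.asarray(simulations, dtype=float), [tail, 100.0 - tail], axis=0
    )
    return lower, upper


def p_factor(observed, lower, upper):
    """Percentage of observations bracketed by the band."""
    observed = np.asarray(observed, dtype=float)
    inside = (observed >= lower) & (observed <= upper)
    return 100.0 * inside.mean()


# Hypothetical band and observations: 9 of 67 values placed outside the band.
lower = np.zeros(67)
upper = np.ones(67)
observed = np.full(67, 0.5)
observed[:9] = 1.5
p_train = p_factor(observed, lower, upper)  # 100 * 58 / 67
```

With 58 of 67 observations inside the band, the result rounds to the 87% reported for the training phase.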
A scatterplot is one of the most common graphical checks of prediction quality (see Figure 6). It shows the agreement between the observed scour data and the predicted values. Figure 6(a) and 6(b) present the scatterplots for the training and testing phases, respectively, together with the squared correlation coefficient (R²). Over the training phase, the GMDH-GLUE model gave the highest R², followed by the GP and GMDH-BP models, with values of approximately 0.84, 0.80 and 0.79, respectively. Over the testing phase, the GMDH-GLUE, GP and GMDH-BP models attained R² values of 0.82, 0.62 and 0.81, respectively. The GMDH-GLUE predictions lay closest to the ideal 45° line.
A Taylor diagram, which combines the correlation coefficient, the standard deviation and the root mean square error in a single plot, was generated for a more comprehensive validation of the developed stochastic predictive model. Figure 7(a) and 7(b) show the Taylor diagrams for the training and testing phases. In both figures, the GMDH-GLUE model lies closest to the point representing the observed scour depth, while in the testing phase the GP model was less accurate in predicting the scour depth. This is consistent with the lower correlation coefficient of the GP model reported in the scatterplots.
A boxplot was generated to evaluate the spread of the predicted scour depths in terms of the quartiles (25, 50 and 75%) and the interquartile range (IQR) (see Figure 8). Based on the lower (Q25%), median (Q50%) and upper (Q75%) quartiles, the GMDH-GLUE model offered the best overall performance in comparison with the GMDH-BP and GP models, although the GP model provided more reliable predictions of Q25% in some cases. Such differences are common, since statistical models can behave differently from one case to another when the nature of the modeled problem changes. In general, the IQR, as an indicator of data variability, supports the capability of the GMDH-GLUE and GP models over the GMDH-BP model.
In numerical terms (see Table 5), the absolute error metrics (RMSE and MAE) show that the GMDH-GLUE model gave better predictions than the classical GMDH model trained with a back propagation learning algorithm. In contrast, the GP model performed worse than the GMDH model, particularly over the testing phase. The RMSE and MAE of the GMDH-GLUE model improved by 12.11 and 3.23%, respectively, in the training stage, and by 4.76 and 4.02% over the testing phase. The RMSE and MAE metrics alone are not adequate to establish the superiority of the model's accuracy. Hence, a more comprehensive metric, the improvement index (IM), was computed to validate the proposed method. A satisfactory improvement was achieved by the GMDH-GLUE model over the training (IM_Train = 6%) and testing (IM_Test = 3%) phases.
| Equation | Train RMSE | Train MAE | Train CC | IM_Train | Test RMSE | Test MAE | Test CC | IM_Test |
|---|---|---|---|---|---|---|---|---|
| Najafzadeh et al. (2013) | 0.221 | 0.170 | 0.888 | – | 0.182 | 0.150 | 0.905 | – |
| GMDH-GLUE | 0.195 | 0.164 | 0.916 | – | 0.173 | 0.144 | 0.906 | – |
| GP | 0.214 | 0.176 | 0.894 | – | 0.266 | 0.216 | 0.789 | – |
| Improvement (%): GMDH-GLUE | 12.11 | 3.23 | 3.16 | 6 | 4.76 | 4.02 | 0.11 | 3 |
| Improvement (%): GP | 3.31 | –3.55a | 0.71 | 0.2 | –46.27 | –44.11 | –12.80 | –34 |
aNote: the negative value means the GP model has lower performance compared to the GMDH model.
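The improvement figures can be checked with a short calculation. The sketch below assumes that each percentage is the relative change with respect to the Najafzadeh et al. (2013) model (a reduction for RMSE and MAE, an increase for CC) and that IM is the arithmetic mean of the three percentages. This definition of IM is inferred from the tabulated values and is not stated explicitly in the text. The inputs are the rounded table entries, so individual percentages can differ slightly from the tabulated ones.

```python
def pct_improvement(base, new, higher_is_better=False):
    """Relative improvement of `new` over `base`, in percent."""
    change = (new - base) if higher_is_better else (base - new)
    return 100.0 * change / base


def improvement_index(base, new):
    """Assumed IM: mean of the RMSE, MAE and CC percentage improvements."""
    parts = [
        pct_improvement(base["RMSE"], new["RMSE"]),
        pct_improvement(base["MAE"], new["MAE"]),
        pct_improvement(base["CC"], new["CC"], higher_is_better=True),
    ]
    return sum(parts) / len(parts)


# Rounded values copied from the performance table.
gmdh_train = {"RMSE": 0.221, "MAE": 0.170, "CC": 0.888}
glue_train = {"RMSE": 0.195, "MAE": 0.164, "CC": 0.916}
gmdh_test = {"RMSE": 0.182, "MAE": 0.150, "CC": 0.905}
glue_test = {"RMSE": 0.173, "MAE": 0.144, "CC": 0.906}
gp_test = {"RMSE": 0.266, "MAE": 0.216, "CC": 0.789}

im_train = improvement_index(gmdh_train, glue_train)
im_test = improvement_index(gmdh_test, glue_test)
im_gp_test = improvement_index(gmdh_test, gp_test)
```

Under this assumption the rounded results agree with the tabulated IM values for GMDH-GLUE in both phases and for GP in the testing phase.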
To further assess the new GMDH-GLUE model, it was compared with other studies in this context, using the RMSE over the testing phase as the evaluation criterion. Several studies were surveyed and their results are tabulated in Table 6. Based on the reported RMSE values, the proposed GMDH-GLUE model improved on the literature studies: by 70.6% against the ANN model (Kaya 2010), 30.8% against GMDH-BP (Najafzadeh & Barani 2011), 50.5% against the support vector regression (SVR) model (Pal et al. 2011), 4.76% against GMDH (Najafzadeh et al. 2013), 24.7% against the gene expression programming (GEP) model (Muzzammil et al. 2015), and 24.6% against the evolutionary radial basis function neural network (ERBFNN) model (Cheng et al. 2015). The proposed GMDH-GLUE model thus showed an acceptable improvement in prediction accuracy over the literature studies, which demonstrates its potential for capturing the actual relationship between the influencing parameters and the maximum scour depth.
| Literature research | Current research (GMDH-GLUE) | Kaya (2010) (ANN) | Najafzadeh & Barani (2011) (GMDH-BP) | Pal et al. (2011) (SVR) | Najafzadeh et al. (2013) (GMDH) | Muzzammil et al. (2015) (GEP) | Cheng et al. (2015) (ERBFNN) |
|---|---|---|---|---|---|---|---|
| RMSE (testing phase) | 0.173 | 0.59 | 0.25 | 0.35 | 0.182 | 0.23 | 0.23 |
CONCLUSIONS AND FUTURE RESEARCH DIRECTION
Scour is one of the most critical problems for riverine and marine structures, and hydraulic engineers need to account for the most effective parameters in the design of those structures to ensure adequate and accurate predictions. The current study provided a new integrated stochastic model based on GMDH-GLUE for predicting scour depth around piers using effective variables. The standard GMDH technique provides deterministic predictions based on a complex relationship between the influencing parameters and the output variable. In the present study, the prediction was improved by taking into account the stochastic behavior of the data, which is generally neglected when only the GMDH approach is considered. This investigation supports the adequacy of the selected stochastic technique, GLUE, for extracting the stochastic characteristics of the experimental data. The new concept, GMDH-GLUE, denotes a combined model whose predictions are based on both the stochastic and deterministic properties of the input and output variables.
To validate the selected methodology, the proposed stochastic model was compared with the previously published GMDH model (Najafzadeh et al. 2013) and with an evolutionary computing model, genetic programming. The modeling results were assessed using several statistical metrics and graphical indicators. The findings of this study support the feasibility and accuracy of the adopted GMDH-GLUE model in comparison with the benchmark models. The improvement index showed a satisfactory accuracy improvement of 6 and 3% over the training and testing phases, respectively, against the original GMDH model. Overall, the explored stochastic predictive model captured the nonlinear relationship between the influencing parameters and the output variable. Although satisfactory prediction accuracy was achieved with the selected methodology, the modeling strategy could be enhanced by incorporating a non-linear input selection approach that extracts the most strongly correlated input attributes using recently explored nature-inspired optimization algorithms.
ACKNOWLEDGEMENTS
The authors would like to extend their gratitude and appreciation to Prof. Amir Etemad Shahidi for providing the employed data set and for his constructive comments on this research. We would also like to express our profound gratitude to the editor, Prof. Dimitri Solomatine, and two anonymous reviewers for sharing their ideas and valuable comments.