Abstract
The accurate prediction of maximum scour depth in riverbeds is crucial for the early protection of bank slopes. In this study, K-means clustering analysis was used for outlier identification, and feature selection yielded Plan 1 with six influential features; Plan 2 comprised the features selected by existing methods. Regression models were built using Support Vector Regression, Random Forest Regression (RF Regression), and eXtreme Gradient Boosting on the sample data of Plan 1 and Plan 2. To enhance accuracy, a Stacking method with a feed-forward neural network as the meta-learner was introduced. Model performance was evaluated using the root mean squared error, mean absolute error, mean absolute percentage error, and the coefficient of determination (R2). The results demonstrate that all three models performed better under Plan 1 than under Plan 2, with R2 improvements of 0.0025, 0.0423, and 0.0205, respectively. Among the three regression models in Plan 1, RF Regression performs best with an R2 of 0.9149, still below the 0.9389 achieved by the Stacking fusion model. Compared with the existing formulas, the Stacking model exhibits superior predictive performance. This study verifies the effectiveness of combining clustering analysis, feature selection, and the Stacking method in predicting the maximum scour depth at bends, providing a novel approach for bank protection design.
HIGHLIGHTS
Feature selection identified a set of influential features different from those used in existing studies.
All three regression models show that the features selected in this study outperform those selected in existing studies.
A Stacking model was developed and compared with the existing methods; the results show that the Stacking model is more accurate.
INTRODUCTION
Natural rivers, particularly in mountainous regions, generally exhibit a sinuous course. Even in sections that appear partially straight, the local topography of the riverbed can induce curvature in the main channel. Consequently, investigating the hydrodynamics of these bends and the associated scouring mechanisms has become a pivotal focus within the realm of fluvial dynamics. For instance, numerous roads in mountainous areas are constructed parallel to river banks, where the roadbed acts as an embankment for the meandering mountain river. However, the soil supporting these embankments is susceptible to erosion by the flowing water, ultimately culminating in embankment collapse and subsequent damage to the road infrastructure.
For concave banks, the dominant cause of deep scour is bend circulation (Odgaard 1984). As a river flows through a bend, the water is subject to centrifugal force in addition to gravity, so the water surface at the concave bank rises above that at the convex bank. Near the banks a vertical flow pattern develops: water plunges down the concave side and rises along the convex side, producing a circulating secondary current. This circulation carries relatively sediment-poor surface water toward the concave bank, where it descends, while sediment-laden bottom water moves toward the convex bank and rises. The resulting asymmetry in sediment transport, combined with the direct erosive action of the flow on the concave bank, causes the slope to fail, and the collapsed material is carried by the near-bed current toward the convex bank. Consequently, the scour depth at the concave bank is markedly greater than elsewhere on the riverbed.
Several studies have addressed the prediction of maximum scour depth. Thorne (1989) derived an equation from data on the Red River in Louisiana; Thorne & Abt (1993) found that empirical predictions agreed better with measured values than estimates based on theoretical analyses of flow dynamics and sediment equilibrium at bends. USACE (1994) provided a graphical correlation for determining the design scour depth of a bend, and Maynord (1996) presented an equation together with a safe design curve after excluding laboratory data. With advances in computing, several techniques have emerged for forecasting the maximum scour depth of bends through numerical simulation. Ling (2006) employed a BP neural network to predict the maximum scour depth at river bends; although the limited amount of data and the model's modest accuracy were constraints, the results still outperformed empirical formulas. Rousseau et al. (2016) compared six widely used numerical simulation methods and observed that, despite similar computational speeds, none yielded accurate water depths or guaranteed result precision. More recently, Froehlich (2020) used 202 sets of measured data from Maynord, Jackson, Thorne, and Abt to develop an artificial neural network (ANN) model for predicting the maximum scour depth at bends in sandy riverbeds and introduced a novel approach to establishing an upper bound on the maximum scour depth.
These methods are applicable to natural rivers to a certain degree, but they have limitations. First, they consider too few variables and do not quantify how strongly each variable influences the maximum scour depth. Second, they rely on dimensionless analysis, which, while analytically convenient, sacrifices the independence of individual variables. These constraints limit their predictive capacity.
Over the past few years, the utilization of artificial intelligence (AI) in various engineering domains has been prominent for the development of predictive models encompassing diverse natural variables. Ehteram et al. (2020) integrated the multilayer perceptron (MLP) model with colliding bodies' optimization (CBO). In their study, sediment size, wave characteristics, and pipeline geometry were employed as inputs for the proposed models. The MLP-CBO model outperformed regression models and empirical models in predicting pipeline scour rates. Parsaie et al. (2021) established multiple models including support vector machine (SVM) and multivariate adaptive regression splines (MARS) to predict the piezometric head and seepage discharge in an earth dam. The results showcased excellent performance of these models in prediction, particularly, the MARS model exhibiting the highest accuracy. Tofiq et al. (2022) utilized various AI techniques to develop several highly accurate prediction models for river streamflow in the Aswan High Dam.
In this study, several machine learning methods are applied to the prediction of maximum scour depth in river bends. K-means clustering analysis was adopted to identify and remove outliers, and feature selection was conducted to identify the six features most influential on maximum scour depth, referred to as Plan 1. To highlight the importance of feature selection, the features selected by existing methods were also included in the subsequent study as Plan 2 for comparison. Three traditional regression models were developed: Support Vector Regression (SVR), Random Forest Regression (RF Regression), and eXtreme Gradient Boosting (XGBoost). In addition, the Stacking method was introduced and a Stacking model was built to improve prediction accuracy. Finally, to preserve the independence of each variable, the data were normalized.
METHODS
Existing formulas and methods
Some of the existing methods for predicting maximum scour depth are listed here.
Database
Sample data from 230 river measurement sets can be obtained from Thorne & Abt (1993). Each set consists of nine features, namely Rc, Mw, W, Dmnc, v, I, f, Q, and S, with one output variable – maximum scour depth Dmxb. Table 1 displays the statistics of these variables.
| | Rc | Mw | W | Dmnc | v | I | f | Q | S | Dmxb |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean | 1,474.15 | 4,312.24 | 588.56 | 3.57 | 1.81 | 1.89 | 0.14 | 3,222.27 | 1.35 | 7.90 |
| Std | 3,188.75 | 8,094.28 | 1,580.27 | 1.70 | 0.54 | 2.76 | 0.21 | 6,320.54 | 0.39 | 4.47 |
| Minimum | 3.48 | 24.36 | 4.40 | 0.42 | 0.50 | 0.05 | 0.00 | 7.10 | 1.01 | 0.81 |
| Maximum | 21,250.00 | 47,600.00 | 8,490.00 | 6.94 | 4.55 | 21.47 | 2.69 | 29,500.00 | 5.32 | 21.25 |
Rc, centerline radius of bend; Mw, meander wavelength; W, water surface width at the upstream end of bend; Dmnc, mean channel depth at upstream crossing; v, cross-section average velocity at the upstream crossing point for bankfull flow conditions; I, slope; f, friction factor; Q, quantity of flow; S, sinuosity; Dmxb, maximum water depth in bend.
Data outlier processing
To determine the optimal number of clusters, the elbow method is employed by plotting the relationship between the k-value and the sum of squared errors (SSE), as shown in Figure 2. The inflection point of the curve occurs at k = 2, indicating that increasing k beyond this point has little effect on reducing the SSE; the optimal number of clusters is therefore 2. After cluster analysis, 218 of the original 230 datasets fall into one class, while the remaining 12 form another. These 12 datasets, all from the lower Ganges, either possess characteristics that genuinely distinguish them from the rest of the data or are outliers caused by measurement errors or other factors. Therefore, only the 218 datasets belonging to the main class are analyzed further.
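The clustering step above can be sketched as follows. This is a minimal illustration on synthetic 2-D data (not the paper's 230-row dataset): a plain Lloyd's-algorithm K-means, with the SSE-versus-k curve standing in for Figure 2 and a large "main" group plus a small distant group mimicking the 218 main rows versus the 12 lower-Ganges rows.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, restarts=3, seed=0):
    """Plain Lloyd's K-means; returns (labels, SSE) of the best restart."""
    best_labels, best_sse = None, float("inf")
    for r in range(restarts):
        rng = random.Random(seed + r)
        centroids = [list(p) for p in rng.sample(points, k)]
        labels = [0] * len(points)
        for _ in range(iters):
            # Assign each point to its nearest centroid.
            labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                      for p in points]
            # Move each centroid to the mean of its members.
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                if members:
                    centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        sse = sum(dist2(p, centroids[l]) for p, l in zip(points, labels))
        if sse < best_sse:
            best_labels, best_sse = labels, sse
    return best_labels, best_sse

# Synthetic stand-in data: one large cluster and a small far-off group.
random.seed(1)
main = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
faroff = [(random.gauss(10, 1), random.gauss(10, 1)) for _ in range(6)]
data = main + faroff

# Elbow curve: SSE falls sharply up to k = 2, then flattens.
sse_by_k = {k: kmeans(data, k)[1] for k in range(1, 6)}
```

With this layout, the drop in SSE from k = 1 to k = 2 dwarfs any later drop, which is exactly the inflection the elbow method looks for.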
To mitigate the redundant information and poor generalization caused by an excessive number of features, feature selection is imperative (Li et al. 2017). This study employs two techniques, Pearson's correlation analysis and feature importance assessment, to identify the features with the most significant impact on maximum scour depth. Because the two methods rest on distinct principles, combining them avoids the limitations of a single linear correlation evaluation. The most influential features are selected from the combined results for further analysis.
Pearson's correlation analysis of features
Further analysis revealed that among the relationships between Dmxb and all the features, Rc, Mw, W, Dmnc, and Q were positively correlated with Dmxb, while v, I, f, and S were negatively correlated. Features with an absolute correlation coefficient between 0 and 0.4 were considered weakly or not correlated, indicating that their influence on Dmxb is minimal. Therefore, based on the results of the Pearson correlation analysis, it may be appropriate to simplify the model by discarding v, f, and S features.
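The correlation screen described above reduces to the standard Pearson formula. A minimal sketch, using made-up illustrative numbers rather than the paper's data:

```python
def pearson(x, y):
    """Pearson correlation coefficient r between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical values only: a discharge-like feature that tracks scour depth,
# and a noise feature that does not.
q = [7.1, 120.0, 800.0, 3200.0, 29500.0]
d = [0.8, 2.0, 5.5, 9.0, 21.0]
noise = [1.2, 0.7, 1.4, 0.9, 1.1]
```

Here `pearson(q, d)` is strongly positive while `abs(pearson(noise, d))` falls below the 0.4 threshold, so under the rule above the noise-like feature would be a candidate for removal.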
Assessment of the importance of features
Features were selected by combining two methods: Pearson's correlation analysis and Extra-Trees feature importance assessment. The six features with the greatest influence on the dependent variable Dmxb were selected in order of significance: Q, W, Dmnc, Rc, Mw, and I. The standardized values of these six features were recorded as the input values for Plan 1, while the features considered in existing studies, including dimensionless ratios such as Rc/B and W/Dmnc, were recorded as the input values for Plan 2. The output value for both Plan 1 and Plan 2 is Dmxb/Dmnc.
Regression modeling theory
In this study, three regression algorithms, namely, SVR, RF Regression, and XGBoost, are utilized as training models. The computational principles of these models are presented below.
SVR theory
Equation (7) is solved using the Lagrange multiplier method, leading to its dual form.
RF Regression theory
RF Regression (Breiman 2001) is a parallel ensemble algorithm that uses a decision tree as its base learner. By applying bagging and the random subspace method (RSM) to randomize both sample selection and feature selection, it achieves better generalization than a single decision tree.
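The bagging-plus-RSM idea can be sketched minimally: bootstrap the rows, restrict each base learner to a random feature subset, and average the predictions. The one-split regression stump below is a deliberately tiny, hypothetical stand-in for a full decision tree.

```python
import random

def fit_stump(X, y, feats):
    """One-split regression tree restricted to the given feature indices."""
    best = None
    for j in feats:
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((yi - ml) ** 2 for yi in left)
                   + sum((yi - mr) ** 2 for yi in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    if best is None:                       # constant features: predict the mean
        m = sum(y) / len(y)
        return lambda row: m
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr

def bagged_forest(X, y, n_trees=25, seed=0):
    """Bagging + random subspace method over stump base learners."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]        # bootstrap sample
        feats = rng.sample(range(d), max(1, d // 2))      # random feature subset
        trees.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return lambda row: sum(t(row) for t in trees) / len(trees)

# Toy data: the first feature carries the signal, the second is noise.
rng = random.Random(42)
X = [[float(i), rng.random()] for i in range(10)]
y = [0.0 if row[0] < 5 else 1.0 for row in X]
forest = bagged_forest(X, y)
```

Averaging many randomized weak learners is what smooths out the variance of any single tree, which is the generalization benefit the paragraph above refers to.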
XGBoost theory
Grid search finds the optimal model parameters by traversing every combination in a given parameter grid. The optimal parameters for the three regression models were determined through a comprehensive grid search, with the selected parameters listed in Table 2.
| Model | Plan 1 | Plan 2 |
|---|---|---|
| SVR | C = 50 | C = 135.5 |
| | gamma = 0.05 | gamma = 0.05 |
| | epsilon = 0.35 | epsilon = 0.3 |
| | kernel = "rbf" | kernel = "rbf" |
| RF Regression | max_depth = 11 | max_depth = 5 |
| | n_estimators = 20 | n_estimators = 10 |
| XGBoost | max_depth = 3 | max_depth = 11 |
| | n_estimators = 10 | n_estimators = 40 |
| | learning_rate = 0.5 | learning_rate = 0.5 |
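The exhaustive traversal behind grid search can be sketched as follows; the one-parameter ridge-style estimator here is a hypothetical stand-in for the SVR/RF/XGBoost estimators and their Table 2 grids.

```python
from itertools import product

def grid_search(train, valid, fit, score, grid):
    """Try every combination in `grid`; keep the best on the validation set."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        model = fit(train, **params)
        s = score(model, valid)
        if s < best_score:                 # lower score = better fit
            best_params, best_score = params, s
    return best_params, best_score

# Hypothetical estimator: 1-D ridge regression through the origin.
def fit_ridge(data, alpha):
    sxx = sum(x * x for x, _ in data)
    sxy = sum(x * y for x, y in data)
    return sxy / (sxx + alpha)             # the fitted slope w

def mse(w, data):
    return sum((y - w * x) ** 2 for x, y in data) / len(data)

train = [(float(x), 2.0 * x) for x in range(1, 9)]    # noiseless y = 2x
valid = [(2.5, 5.0), (5.5, 11.0)]
best, _ = grid_search(train, valid, fit_ridge, mse,
                      {"alpha": [0.0, 0.1, 1.0, 10.0]})
```

With noiseless data, the unregularized fit (`alpha = 0.0`) wins on the validation set, which is what an exhaustive search should recover.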
Each regression model's test results were evaluated using a set of commonly used indicators, defined below, where $y_i$ is the observed value, $\hat{y}_i$ is the model's predicted value, and $n$ is the sample size. Smaller values of the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) indicate higher predictive accuracy (Handelman et al. 2019; Abed et al. 2023).
- $\mathrm{RMSE}=\sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}$ (1)
- $\mathrm{MAE}=\dfrac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|$ (2)
- $\mathrm{MAPE}=\dfrac{1}{n}\sum_{i=1}^{n}\left|\dfrac{y_i-\hat{y}_i}{y_i}\right|$ (3)
- $R^2=1-\dfrac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}$ (4)
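Equations (1)-(4) translate directly into code; a small self-contained sketch with hypothetical observed and predicted values (not the paper's data):

```python
def rmse(y, p):
    """Root mean squared error, Equation (1)."""
    return (sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)) ** 0.5

def mae(y, p):
    """Mean absolute error, Equation (2)."""
    return sum(abs(a - b) for a, b in zip(y, p)) / len(y)

def mape(y, p):
    """Mean absolute percentage error (as a fraction), Equation (3)."""
    return sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)

def r2(y, p):
    """Coefficient of determination, Equation (4)."""
    ybar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - ybar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed vs. predicted scour depths.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.0, 2.0, 3.0, 5.0]
```

For a perfect prediction the three error metrics are 0 and R2 is 1; smaller errors and an R2 nearer 1 indicate a better model, which is how the tables below are read. MAPE is kept as a fraction here because the reported values (e.g. 0.0819) are fractions, not percentages.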
Stacking fusion model
This study employs a Stacking approach (Pavlyshenko 2018) to build a fusion model that combines the strengths of three distinct models, thereby improving the prediction of maximum scour depth.
The pre-processed data are standardized, and the training and test sets are split randomly in an 8:2 ratio. The training set is then fed into the base learner models for training. To mitigate the small sample size, cross-validation is used during base learner training so that every training sample contributes a held-out prediction.
This study employs a fivefold cross-validation approach (Berrar 2019), whereby the training data are partitioned into five subsets, with four subsets used for training and one for validation. This process is repeated five times to ensure the robustness of the model. During cross-validation, the validation set is not used for training, and the prediction results on this set are used to evaluate the generalization ability of the model. The resulting predictions from each of the five models correspond to five subsets of the original training set, thereby enabling all the data to be predicted simultaneously upon the completion of cross-validation. Each base learner is trained separately and outputs its respective predictions, which are then merged to form a new feature matrix comprising 186 rows and 3 columns as the training set for the meta-learner. During each cross-validation, a prediction is obtained for the test dataset. The predictions from the five cross-validations are then averaged, and the resulting predictions from the test sets of the three models are combined into a new matrix, which serves as the test set features for the meta-learner. Following the training of the base learners, the newly generated feature matrix is fed into the meta-learner for further training.
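The fold mechanics described above can be sketched as follows. This is a structural illustration only: the paper's SVR/RF/XGBoost base learners and FNN meta-learner are replaced by hypothetical toy learners (a mean predictor, a through-origin slope fit, and a pick-the-best-column meta rule), but the out-of-fold bookkeeping is the same.

```python
def kfold(n, k):
    """Split indices 0..n-1 into k contiguous validation folds."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def stack(X_tr, y_tr, X_te, base_fits, meta_fit, k=5):
    """Out-of-fold stacking: base predictions become meta-learner features."""
    n, m = len(X_tr), len(base_fits)
    oof = [[0.0] * m for _ in range(n)]            # meta training features
    te = [[0.0] * m for _ in range(len(X_te))]     # meta test features
    for c, fit in enumerate(base_fits):
        per_fold_te = []
        for val in kfold(n, k):
            vs = set(val)
            tr = [i for i in range(n) if i not in vs]
            model = fit([X_tr[i] for i in tr], [y_tr[i] for i in tr])
            for i in val:                          # predict only the held-out fold
                oof[i][c] = model(X_tr[i])
            per_fold_te.append([model(x) for x in X_te])
        for j in range(len(X_te)):                 # average the k test predictions
            te[j][c] = sum(f[j] for f in per_fold_te) / k
    meta = meta_fit(oof, y_tr)
    return [meta(row) for row in te]

# Toy base learners (stand-ins for SVR / RF / XGBoost):
def fit_mean(X, y):
    mu = sum(y) / len(y)
    return lambda x: mu

def fit_slope(X, y):
    w = sum(xi[0] * yi for xi, yi in zip(X, y)) / sum(xi[0] ** 2 for xi in X)
    return lambda x: w * x[0]

# Toy meta-learner (stand-in for the FNN): keep the base column whose
# out-of-fold predictions have the smallest squared error.
def meta_best_column(F, y):
    errs = [sum((row[c] - yi) ** 2 for row, yi in zip(F, y))
            for c in range(len(F[0]))]
    best = errs.index(min(errs))
    return lambda row: row[best]

X_train = [[float(i)] for i in range(1, 11)]
y_train = [3.0 * x[0] for x in X_train]            # exactly y = 3x
preds = stack(X_train, y_train, [[2.5], [7.0]],
              [fit_mean, fit_slope], meta_best_column)
```

The key point the paragraph above makes is preserved here: the meta-learner is trained only on predictions made for data its base models never saw, which is what protects the fusion step from overfitting.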
The two models, designated as Plan 1 and Plan 2, were trained independently, with their respective parameters outlined in Table 3.
| Model | Plan 1 | Plan 2 |
|---|---|---|
| FNN | Activation = 'sigmoid' | Activation = 'sigmoid' |
| | Number of hidden layers: 3 | Number of hidden layers: 3 |
| | Number of neurons per layer: 8, 6, 4, 1 | Number of neurons per layer: 8, 6, 4, 1 |
RESULTS AND DISCUSSION
Results
Comparison of each model results
The predicted maximum scour depths were computed for each base learner and for the Stacking fusion model, and compared with the measured maximum scour depths. The model output is Dmxb/Dmnc, from which Dmxb is recovered to assess each model's performance in predicting Dmxb. The results for each evaluation metric are presented in Table 4.
| Model | RMSE (Plan 1) | RMSE (Plan 2) | MAE (Plan 1) | MAE (Plan 2) | MAPE (Plan 1) | MAPE (Plan 2) | R2 (Plan 1) | R2 (Plan 2) |
|---|---|---|---|---|---|---|---|---|
| SVR | 1.1528 | 1.1650 | 0.9058 | 0.8615 | 0.1123 | 0.1111 | 0.8845 | 0.8820 |
| RF Regression | 0.9896 | 1.2106 | 0.7768 | 0.9739 | 0.0927 | 0.1232 | 0.9149 | 0.8726 |
| XGBoost | 1.3954 | 1.4777 | 1.0183 | 1.1226 | 0.1197 | 0.1468 | 0.8307 | 0.8102 |
| Stacking model | 0.8385 | 0.9591 | 0.6647 | 0.7537 | 0.0819 | 0.0999 | 0.9389 | 0.9200 |
The evaluation metrics used to assess each model were RMSE, MAE, MAPE, and R2. In essence, a model's predictive performance is deemed better as its coefficient of determination approaches 1 and as its RMSE, MAE, and MAPE decrease.
Table 4 presents the results of the evaluation metrics for all models in Plan 1, with coefficients of determination exceeding 0.8 for all models. The RF Regression model achieved the highest accuracy with an R2 value of 0.9149, followed by the SVR model with an R2 value of 0.8845, while the XGBoost model had the lowest R2 value of 0.8307. In Plan 2, the XGBoost model also had the lowest R2 value among the SVR, RF Regression, and XGBoost models.
Although overall the four evaluation metrics yield consistent results for assessing model performance, there are a few exceptions: from the RMSE and R2 metrics of the SVR model in Plan 1 and Plan 2, it can be observed that its predictive performance of Plan 1 is superior to that of Plan 2. However, its MAE and MAPE values of Plan 1 are larger than those of Plan 2, contradicting the previous conclusion. This discrepancy can be attributed to the fact that MAE and MAPE reflect the model's mean error on the sample, whereas R2 and RMSE focus more on the fit of the model to the sample variance.
The Stacking model, which combines the SVR, RF Regression, and XGBoost models with the FNN meta-learner, was compared against each of the three base models individually. It exhibits superior prediction accuracy, with an R2 of 0.9389, exceeding the best base learner (RF Regression) by 0.0240 and the weakest (XGBoost) by a substantial 0.1082. Its three error metrics are also smaller than those of the base learners, underscoring the optimization achieved by the Stacking model.
Comparison of the Stacking model and other methods
| Method | R2 | RMSE | MAE | MAPE |
|---|---|---|---|---|
| Thorne formula | 0.8851 | 1.1900 | 0.8806 | 0.1281 |
| Maynord formula | 0.9253 | 0.9594 | 0.7674 | 0.1013 |
| ANN model | 0.9227 | 0.9760 | 0.8160 | 0.1122 |
| Stacking model | 0.9456 | 0.8189 | 0.6475 | 0.0826 |
In the scatter plot of predicted against measured values, the proximity of each point to the y = x line indicates how closely the prediction matches the measurement. Among the three other methods, the Thorne formula deviates most from the line of perfect fit y = x, while the points for the ANN model and the Maynord formula lie closer to it overall.
The Stacking model outperforms all other methodologies across the 35 datasets, as its predicted values demonstrate the highest degree of concurrence with the measured values overall.
The bar chart presents the assessment indicators for all forecasting methods. The Stacking model surpasses all others in forecasting accuracy, with an R2 exceeding 0.94. The Maynord formula follows, with an R2 above 0.92 and approaching 0.93, and the ANN model performs comparably, while the Thorne formula fails to reach an R2 of 0.9.
For the three error indicators (RMSE, MAE, and MAPE), the Stacking model yields the lowest value of each among all the models and formulas, followed by the Maynord formula, the ANN model, and the Thorne formula. These error indicators are consistent with the R2 results, further substantiating the models' relative predictive efficacy.
Discussion
The comparison between Plan 1 and Plan 2 reveals that the predictive performance of each model in Plan 1 generally surpasses that of Plan 2, with the largest improvement in the RF Regression model, whose coefficient of determination increased by 0.0423. Several factors may explain this. First, the existing formulas and methods combine the original variables into dimensionless ratios such as Rc/B and W/Dmnc, reducing the independence of each feature. Second, these ratios may not be the most appropriate indicators of maximum scour depth, and other factors with greater influence may not be taken into account. To address this, this study selected a few key features as independent variables through a comprehensive analysis based on Pearson's correlation analysis and the Extra-Trees model, standardized them, and entered them into the regression models. Each regression model then predicted maximum scour depth more accurately than with the features selected by the existing methods.
Among the four regression models considered in this study (the Stacking, SVR, RF Regression, and XGBoost models), the Stacking model attains the highest R2 in Plan 1, at 0.9389. This is an improvement of 0.0240 over the best base learner, RF Regression, and of 0.1082 over the weakest, XGBoost, demonstrating the Stacking model's superior predictive capability.
Both Plan 1 and Plan 2 corroborate the efficacy of employing the FNN model as the meta-learner for the Stacking model in this investigation. The Stacking model amalgamates the strengths of multiple models, yielding enhanced prediction outcomes.
Furthermore, the SVR model of Plan 1 exhibits predictive performance comparable to that of Plan 2, whereas the other three models clearly surpass their Plan 2 counterparts: the RF Regression, XGBoost, and Stacking models improve R2 by 0.0423, 0.0205, and 0.0189, respectively. This comparison demonstrates the significance of feature selection in predicting maximum scour depth: the key features chosen through Pearson's correlation analysis and the Extra-Trees model enable each regression model to predict the maximum scour depth more accurately than the features used by existing prediction methods.
Compared with the existing formulas, the Stacking model demonstrates superior predictive performance on the 35 samples. Its R2 exceeds that of the best-performing Maynord formula by 0.0203, the ANN model by 0.0229, and the weakest Thorne formula by 0.0605. This performance is attributable both to the selection of key features for analysis and to the Stacking method, which combines the strengths of multiple regression models.
CONCLUSION
In the present investigation, K-means cluster analysis was used to identify and remove outliers from a total of 230 datasets, leaving 218 for further analysis. Six features (Q, W, Dmnc, Rc, Mw, and I) were then selected from the original nine (Rc, Mw, W, Dmnc, v, I, f, Q, and S) using Pearson's correlation analysis and Extra-Trees feature importance evaluation. Notably, these chosen features differ from those examined in previous studies, which are dimensionless ratios such as Rc/B and W/Dmnc. Three regression algorithms (SVR, RF Regression, and XGBoost) were employed for training and prediction. To enhance predictive accuracy, a Stacking model was constructed with these three models as base learners and the FNN model as the meta-learner. The standardized values of the six selected features served as the input values for Plan 1, while the features considered in existing studies served as the input values for Plan 2, enabling a comprehensive comparative analysis.
The evaluation of the training models, Plan 1 and Plan 2, involved the assessment of their prediction results using four metrics: RMSE, MAE, MAPE, and R2. A total of 44 datasets out of the original 218 were utilized for this purpose. Furthermore, the prediction outcomes of the Stacking models, derived from both Plan 1 and Plan 2, were compared against three other models: SVR, RF Regression, and XGBoost. Among the three fundamental models, the RF Regression model performs the best in Plan 1, while the SVR model performs the best in Plan 2. For Plan 1, the value of R2 of the RF Regression model is 0.9149, and the RMSE, MAE, and MAPE values are 0.9896, 0.7768, and 0.0927, respectively. For Plan 2, the R2 value of the SVR model is 0.8820, and the RMSE, MAE, and MAPE values are 1.1650, 0.8615, and 0.1111, respectively. On the other hand, the worst-performing model is the XGBoost model, with R2 values of only 0.8307 for Plan 1 and 0.8102 for Plan 2. However, both in Plan 1 and Plan 2, the performance of these models falls short compared to the Stacking model. The Stacking model has R2 values of 0.9389 and 0.9200 for Plan 1 and Plan 2, respectively. The corresponding RMSE, MAE, and MAPE values are 0.8385 and 0.9591, 0.6647 and 0.7537, and 0.0819 and 0.0999, respectively.
Upon comparing Plan 1 and Plan 2, it was observed that the prediction results obtained from each base learner within Plan 1, as well as the Stacking model itself, outperformed those of Plan 2. This superiority can be attributed to the more comprehensive set of features considered in Plan 1, which surpassed the limited scope of features in Plan 2.
The Stacking model was compared with the other methods on the 35 datasets in the test set, and it outperformed them all, with an R2 of 0.9456, while the R2 of every other method was below 0.93. This superior performance is attributable in part to the removal of outliers through K-means clustering analysis before model construction, which reduced the interference of outliers during training and improved prediction accuracy. Another contributing factor is the careful selection of six features (Q, W, Dmnc, Rc, Mw, and I) based on Pearson's correlation analysis and Extra-Trees feature importance assessment; these features capture the influences on maximum scour depth more comprehensively than those used by other methods, and their inclusion greatly enhanced the model's predictive capability. Furthermore, the Stacking method incorporates the advantages of multiple regression models, making it well suited to predicting maximum scour depth. However, the model accuracy in this study has not yet reached a fully satisfactory level, fundamentally because the database contains only a few hundred samples. In addition, although the samples contain nine original features, other important factors, such as the properties of the soil at river bends, were not taken into account. With a larger database, the method employed in this study should provide more accurate predictions.
In conclusion, the combination of K-means clustering, feature selection, and Stacking method for predicting the maximum scour depth of bends has yielded promising results. This novel approach not only provides valuable insights into the design of concave bank protection but also offers a new avenue for future research in this field.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.