## Abstract

The stepped spillway of a dam is a crucial element that serves multiple purposes in river engineering. Flood-control research requires an investigation of energy dissipation over stepped spillways. Previous studies, using diverse methodologies, have examined stepped spillways without baffles. This study employs two machine learning techniques, support vector machine (SVM) and regression tree (RT), to assess the energy dissipation of rectangular stepped spillways with baffles arranged in different configurations and operating at varying channel slopes. The experiments suggest that energy dissipation is more pronounced in channels with flat slopes and increases with the number of baffles. Statistical measures are used to validate the constructed models and to evaluate their performance. The findings indicate that the proposed SVM model forecast the energy dissipation more accurately than both RT and the conventional method. This study confirms the applicability of machine learning techniques in this field. Notably, it provides a unique contribution by predicting energy dissipation in stepped spillways with baffle configurations.

## HIGHLIGHTS

The investigation of flood management necessitates a thorough examination of the dissipation of energy across stepped spillways.

The current research examines the dissipation of energy across a rectangular stepped spillway featuring diverse baffle configurations at varying channel inclinations.

Machine learning techniques, specifically support vector machines (SVM) and regression trees (RT), are employed to forecast energy dissipation on stepped spillways with rectangular geometry.

## NOTATION

*H* = height of the spillway

*y*_{c} = critical depth of flow

*a* = distance of baffle from the toe of spillway

*y*_{1} = upstream head

*y*_{2} = downstream head

*h*_{b} = height of baffle

*w*_{b} = width of baffle

*W* = width of spillway

*V* = velocity of flow

*g* = acceleration due to gravity

*H*_{u} = hydraulic depth

*E*_{u} = energy at upstream

*E*_{d} = energy at downstream

*E*_{L} = energy loss

*E*_{T} = total energy

*V*_{1} = upstream velocity

*V*_{2} = downstream velocity

*a* = actual value

*p* = predicted value

*ā* = mean of actual values

*p̄* = mean of predicted values

*N* = number of data

*k* = number of parameters

= relative spillway height

= relative baffle distance

= relative baffle height

= relative baffle width

= Froude number at upstream

= relative energy loss

## INTRODUCTION

A spillway is a hydraulic structure that provides regulated discharge of water from a reservoir to a lower-lying area. Stepped spillways are used to mitigate the risk of water overflowing a dam, which could damage or destroy the structure. A stepped spillway is formed by incorporating a series of steps or drops into the invert of an open channel. According to Chanson (2002), there are three discernible flow regimes: nappe flow, transition flow, and skimming flow. Energy dissipation in stepped spillways has been studied extensively by various scholars, including Rajaratnam (1990), Chanson (1994), and Pegram *et al.* (1999). Boes & Hager (2003) examined the advantages of stepped spillways, including their ease of construction, reduced risk of cavitation, and the smaller stilling basins required at the downstream dam toe owing to significant energy dissipation along the chute. Barani *et al.* (2005) conducted research on a physical wooden model of the Manksvill dam spillway, constructed at a scale of 1:25. Hazzab & Chafic (2006) conducted an experimental study of energy dissipation in stepped spillways and reported the flow configurations observed. Stefan & Chanson (2009) measured air-water flow in moderately inclined stepped channels. Daniel (2010) examined the impact of steps and step heights on the energy dissipation capacity of stepped spillways, comparing the flow characteristics of a smooth invert chute with those of a self-aerated stepped spillway.
In a study by Katourany (2012), a comparison was made between experimental findings and conventional United States Bureau of Reclamation (USBR) outcomes to investigate the impact of different baffle widths, spacings between baffle rows, and step heights of baffled aprons on energy dissipation. The study conducted by Salmasi *et al.* (2012) aimed to evaluate the energy dissipation of gabion stepped spillways through an analysis of their through-flow and overflow. The study revealed that gabion spillways featuring pervious surfaces exhibited superior energy dissipation capabilities compared to those with concrete horizontal or vertical walls, particularly at higher discharge rates. Rad (2014) conducted a quantitative analysis of the dissipation of energy in various types of stepped spillways, including those with inclined steps and end sills. The study conducted by Saedi & Asareh (2014) examined the impact of the quantity of drop stairs on the dissipation of energy in stepped drops. The researchers proposed the use of stepped drops after observing that stairs contribute to a considerable amount of roughness in the flow path, thereby enhancing energy dissipation. According to Al-Husseini's (2015) findings, it was determined that the stepped spillway resulted in a greater dissipation of energy. The researcher's findings indicate that a decrease in the number of steps and downstream slopes resulted in an increase in flow energy dissipation. In comparison to the original step spillway with identical parameters, the utilization of the cascade spillway resulted in a decrease in energy dissipation, whereas the implementation of baffle blocks led to an increase in energy dissipation. The study conducted by Parsaie *et al.* (2016) utilized the Multivariate Adaptive Regression Splines (MARS) technique to approximate the energy dissipation of flow over stepped spillways in the presence of skimming flow conditions. 
The study's findings on energy dissipation prediction using MARS and an artificial neural network (ANN) indicate that both models are dependable, with MARS slightly more reliable than ANN. Frederic *et al.* (2017) assessed the energy dissipation effectiveness of the spillway for the Mekin dam in relation to its capacity. The study confirmed that the flow down the spillway did not result in transitional flow, which could have caused vibrations harmful to the structure. The stability of the spillway was confirmed by computing safety factors at different intervals. Mojtahedi *et al.* (2020) devised a computational model to investigate the influence of geometrical parameters on the dissipation rate in flows through stepped spillways, and validated it against a physical model. A fuzzy inference system (FIS) was used to examine the control of dissipation rates. The results were compared with a pre-established numerical database to determine the expected energy dissipation in diverse scenarios. They indicate that the proposed FIS has the potential to serve as an effective tool for the operational management of dissipator structures while accounting for diverse geometric features. Nasralla (2021) analysed the four phases of the spillway. To augment the energy dissipation arising from the contraction stepped spillway, 18 trials were carried out with diverse placements, heights, and widths of baffles. The results indicate that the presence of baffles on the stepped spillway located downstream of the stilling basin leads to enhanced energy dissipation.
The study conducted by Ikinciogullari (2021) employed the Flow 3D software to perform a quantitative analysis of the energy dissipation capacities exhibited by trapezoidal stepped spillways. To achieve the intended objective, four discrete models and three distinct discharges were employed. According to the results, the trapezoidal stepped spillway exhibits a higher energy dissipation efficiency of up to 30% compared to conventional stepped spillways.

Salmasi & Abraham (2022) investigated nine physical models of stepped spillways with varying slopes of 15°, 25°, and 45°, and step quantities ranging from 5 to 50. The impact of the spillway slope and the quantity of steps on the rate of energy dissipation is negligible. The augmentation of the spillway slope and the number of steps has been observed to result in a higher degree of energy dissipation in the case of a uniform discharge over a stepped spillway. The study conducted by Chanson (2022) investigates the hydraulic properties of stepped chute flows and presents a critical analysis of almost 30 years of dynamic hydraulic research, incorporating contemporary field measurements obtained during significant flood events. The study conducted by Ma *et al.* (2022) examined the flow characteristics of an interval-pooled stepped spillway. The researchers utilized a combination of the renormalization group (RNG) k–*ε* turbulence model and the volume of fluid (VOF) interface capture technique. The findings indicate that the energy dissipation efficacy of the stepped spillway with interval-pooled configuration was superior to that of the stepped spillways with pooled configuration and the conventional flat-panel stepped spillway. The present study introduces the method of identifying the intensity of the omega vortex for the purpose of evaluating the dissipation of energy. The lack of increase in energy dissipation with the increase in pool height can be attributed to the formation of a ‘pseudo-weir.’ Burgan (2022) conducted research on the utilization of time-lagged streamflow data from a gauging station, employing various artificial neural network (ANN) algorithms and multiple linear regression (MLR) techniques to ascertain its efficacy in accurately predicting flow rates. The feed forward back propagation (FFBP) algorithm has demonstrated superior performance in daily flow prediction studies compared to other techniques. 
ANN algorithms also extend beyond flow prediction: they can support water resources management within hydrological basins by approximating extreme events such as floods and droughts. Ikinciogullari (2023) developed a circular stepped spillway (CSS) and investigated it numerically. The energy dissipation rates of the CSS and the flat-stepped spillway were compared using three distinct models and discharges. The simulation results suggest that CSS performance improves as the step radius decreases. Albank & Khassaf (2023) examined the energy dissipation rate of physical models of conventional steps at downstream angles of 25°, 35°, and 45°. The findings indicate that the relative energy loss on pooled steps is approximately 4.6% higher than that on flat steps. Sayed *et al.* (2023) assessed and contrasted the performance of HEC-HMS and TOPMODEL as white box models, and ANFIS and GEP as black box models, in rainfall-runoff simulation. The ANFIS model, a black box model, performed better than the GEP model.

Limited research has previously been conducted on energy dissipation over rectangular stepped spillways equipped with baffles, and those studies relied exclusively on multivariate regression to forecast the relative energy loss from non-dimensional parameters. The current investigation conducts new experiments to obtain additional data for precise estimation of energy dissipation on a rectangular stepped spillway with varying baffle configurations at different channel inclinations, expressed in non-dimensional parameters. The estimation uses support vector machine (SVM) and regression trees (RTs). Each algorithm has distinct advantages. SVM is effective in high-dimensional spaces, resilient to overfitting, versatile in its choice of kernel functions, able to reach a global optimum, and effective with limited data. RTs are interpretable, can model non-linear relationships, can handle missing values and outliers, do not depend on feature scaling, and handle both numerical and categorical data. A statistical comparison of the two methods is then used to determine how well the generated models forecast energy dissipation across rectangular stepped spillways with varying baffle configurations.

## MATERIALS AND METHODS

### Data source

| Test channel | Range | | | | | | |
|---|---|---|---|---|---|---|---|
| Nasralla (2021) channel | Minimum | 4.0926 | 0 | 0 | 0 | 1.5223 | 0.0066 |
| | Maximum | 6.0571 | 8 | 5.98 | 1 | 7.7625 | 0.6689 |
| | Average | 4.9481 | 2.1938 | 2.7890 | 0.8961 | 3.7701 | 0.2112 |
| | Median | 4.8355 | 1.40 | 2.990 | 1 | 3.2098 | 0.1095 |
| | Standard deviation | 0.6964 | 1.9435 | 1.3602 | 0.2608 | 1.8122 | 0.2159 |
| Present channel | Minimum | 4.1322 | 0 | 0.4234 | 0.3333 | 0.1487 | 0.2542 |
| | Maximum | 12.0827 | 27.2771 | 1.2330 | 0.3333 | 0.3215 | 0.5630 |
| | Average | 6.8861 | 8.6822 | 0.7452 | 0.3333 | 0.2622 | 0.3769 |
| | Median | 5.8087 | 7.2226 | 0.7028 | 0.3333 | 0.2841 | 0.3816 |
| | Standard deviation | 2.90 | 8.8655 | 0.2382 | 0 | 5.8835 | 7.1195 |

### Experimental setup

### Theoretical background

### SVM model

*et al.* 2015). SVR has a notable benefit in its capacity to handle non-linear relationships between input and output variables. It does so through kernel functions, including polynomial, radial basis function (RBF), and sigmoid functions, which capture intricate relationships between variables. SVR has been applied effectively in diverse domains, including finance, transportation, and medical imaging. Careful selection and tuning of the kernel function and its associated parameters is essential for optimal performance. A common way to transform a linear classifier into a non-linear one is to map the input space *x* onto a feature space *F* with a non-linear function. The separating function in space *F* follows the formulation of Parsaie *et al.* (2015).
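For orientation, the kernel SVM decision function and the four kernels usually associated with the parameters *γ*, *r*, and *d* are commonly written as follows. These are the standard textbook forms (as used, for example, in LIBSVM) and are given here as an assumption; the symbols $\alpha_i$, $y_i$, $b$, and $N$ are introduced for illustration, and the expressions are not necessarily identical to the authors' equation.

$$
f(x) = \operatorname{sign}\!\left( \sum_{i=1}^{N} \alpha_i \, y_i \, K(x_i, x) + b \right)
$$

$$
\begin{aligned}
\text{Linear:} \quad & K(x_i, x_j) = x_i^{\mathsf{T}} x_j \\
\text{Polynomial:} \quad & K(x_i, x_j) = \left( \gamma \, x_i^{\mathsf{T}} x_j + r \right)^{d}, \quad \gamma > 0 \\
\text{RBF:} \quad & K(x_i, x_j) = \exp\!\left( -\gamma \, \lVert x_i - x_j \rVert^{2} \right), \quad \gamma > 0 \\
\text{Sigmoid:} \quad & K(x_i, x_j) = \tanh\!\left( \gamma \, x_i^{\mathsf{T}} x_j + r \right)
\end{aligned}
$$

In this convention *C* does not appear in any kernel: it is the penalty (regularization) constant that bounds the multipliers, $0 \le \alpha_i \le C$.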

The aforementioned kernel parameters are denoted by *C*, *γ*, *r*, and *d*. It is widely acknowledged that the estimation accuracy (generalization performance) of the SVM depends on how well the meta-parameters *C*, *γ*, and *r*, together with the kernel parameters, are chosen. The choice of *C*, *γ*, and *r* determines the complexity of the regression model used for prediction. Optimal parameter selection is complicated by the fact that the generalization performance of the SVM depends on all three parameters jointly. Kernel functions implicitly map the input space into a higher-dimensional feature space in which a linear model can be fitted. Like other machine learning models, the SVM model is built by training on diverse datasets, which improves its precision. The high-quality data sets from the present study and Nasralla (2021) were used to determine the energy dissipation over the rectangular stepped spillway. The acquired data were separated into training and test sets in MATLAB R (2019) before the SVM model was built. In this study, the modeling procedure uses the relative energy loss as the target value and the independent factors discussed in Equation (6) as input variables.

### RT model

RTs are a type of machine learning algorithm used to construct predictive models from data. The RT methodology uses a clustering tree that undergoes post-pruning. The clustering tree algorithm has been referred to in the literature as both the predictive clustering tree and the monothetic clustering tree (Vens *et al.* 2010). RTs are used to model dependent variables that take continuous or ordered discrete values, with prediction error typically measured by the squared difference between predicted and observed values (Loh 2011). The clustering tree algorithm is based on the top-down induction of decision trees proposed by Quinlan (1986). Given a set of training data, the RT algorithm generates an internal node by choosing the best available test: the candidate test that gives the largest reduction in variance is selected. Clusters with lower variance are more homogeneous and therefore give more precise predictions. According to Vens *et al.* (2010), if none of the tests leads to a significant reduction in variance, the algorithm creates a leaf and labels it with a representative of the data. Breiman (2017) describes how a hierarchical, tree-like division of the input space can be established by recursively partitioning the data space and fitting a prediction model within each partition. The input space is thus divided recursively into smaller local regions, with the splits marking the boundaries of these regions. The tree comprises internal decision nodes and terminal leaves. For a given test datum, a sequence of tests at the decision nodes, starting from the root, determines the path through the tree until a terminal node is reached. The model associated with that terminal node is then used to generate the prediction locally.
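The variance-reduction splitting described above can be sketched in a few lines of Python. This is a minimal illustration on a single input feature, not the authors' MATLAB implementation: the function names (`sse`, `best_split`, `build_tree`, `predict`), the stopping threshold `min_reduction`, and the use of the leaf mean as the "representative" value are all assumptions made for the example, and pruning is omitted.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (n times the variance)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, reduction) for the split on one feature that most
    reduces the sum of squared errors, or (None, 0.0) if no split helps."""
    parent = sse(ys)
    best_threshold, best_reduction = None, 0.0
    candidates = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        reduction = parent - (sse(left) + sse(right))
        if reduction > best_reduction:
            best_threshold, best_reduction = t, reduction
    return best_threshold, best_reduction


def build_tree(xs, ys, min_reduction=1e-6):
    """Top-down induction: split while the error reduction is worthwhile,
    otherwise create a leaf holding the mean of the local targets."""
    threshold, reduction = best_split(xs, ys)
    if threshold is None or reduction < min_reduction:
        return {"leaf": mean(ys)}
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": build_tree([x for x, _ in left], [y for _, y in left], min_reduction),
        "right": build_tree([x for x, _ in right], [y for _, y in right], min_reduction),
    }


def predict(tree, x):
    """Follow the decision nodes from the root until a leaf is reached."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]
```

A full regression tree repeats the same search over every input variable at each node and keeps the variable and threshold with the largest reduction.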

### Statistical measures

The performance of the models was evaluated using the coefficient of determination (*R*^{2}), mean-squared error (MSE), root-mean-squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), ratio of RMSE to the standard deviation of the observations (RSR), Akaike information criterion (AIC), and Nash–Sutcliffe Efficiency (NSE). The relevant equations were utilized to calculate these metrics (Kaushik & Kumar 2023).
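As a rough guide to how these indices are computed, the sketch below implements the commonly used definitions in plain Python, with `actual` and `predicted` playing the roles of *a* and *p* in the Notation. It is an assumption that the paper uses exactly these forms; in particular, *R*^{2} is taken here as the squared correlation between actual and predicted values, and AIC is left out because its expression depends on the likelihood assumption, which the text does not state. The function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return common goodness-of-fit indices as a dict.

    Assumes no actual value is zero (needed for MAPE) and that neither
    series is constant (needed for R2, RSR, and NSE).
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    a_bar = sum(actual) / n
    p_bar = sum(predicted) / n
    residuals = [a - p for a, p in zip(actual, predicted)]

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((a - a_bar) ** 2 for a in actual)      # spread of the observations
    ss_pred = sum((p - p_bar) ** 2 for p in predicted)  # spread of the predictions
    cov = sum((a - a_bar) * (p - p_bar) for a, p in zip(actual, predicted))

    mse = ss_res / n
    return {
        "R2": cov ** 2 / (ss_tot * ss_pred),
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "MAPE": 100.0 / n * sum(abs(r / a) for r, a in zip(residuals, actual)),
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_tot),
        "NSE": 1.0 - ss_res / ss_tot,
    }
```

With these definitions RSR and NSE are linked by RSR² = 1 − NSE, so the two indices rank models identically.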

## RESULTS AND DISCUSSION

Figure 11 compares the models developed to estimate relative energy loss. Compared with the alternative approaches, the SVM model lies closest to the best-fit line and shows notable potential for generalization, with no signs of overtraining. Table 2 presents the evaluation of the proposed models using established statistical error metrics: *R*^{2}, MSE, RMSE, MAE, MAPE, RSR, AIC, and NSE. Compared with the RT model and the Nasralla (2021) method, the SVM model performs best, with an *R*^{2} value of 0.98, RMSE of 0.032, MAE of 0.024, MAPE of 6.75, and RSR of 0.161. The AIC is a statistical tool for selecting the optimal model by evaluating the likelihood function; the model with the lowest AIC value is deemed optimal. The SVM model has the lowest AIC value of the models considered. The NSE is a normalized statistic that gives the magnitude of the residual variance relative to the variance of the measured data. It indicates how well the plot of observed versus simulated data fits the 1:1 line. An NSE value of 1 indicates perfect agreement between the model and the observed data. An NSE value of 0 signifies that the model predictions are only as accurate as the mean of the observed data, and an NSE value below 0 indicates that the observed mean is a better predictor than the model. The SVM model has the highest NSE value, 0.985, which suggests a better fit to the observed data than the other models. According to the statistical indices, the SVM performs better than the other methods in forecasting the relative energy loss over the rectangular stepped spillway.

| Statistical parameters | SVM model | RT model | Nasralla method |
|---|---|---|---|
| *R*^{2} | 0.98 | 0.94 | 0.84 |
| MSE | 0.0009 | 0.0023 | 0.0097 |
| RMSE | 0.032 | 0.048 | 0.098 |
| MAE | 0.024 | 0.028 | 0.079 |
| MAPE | 6.75 | 8.17 | 28.88 |
| RSR | 0.161 | 0.238 | 0.331 |
| AIC | 158.64 | 296.11 | 352.15 |
| NSE | 0.985 | 0.977 | 0.797 |

## CONCLUSIONS

The present study applies machine learning methods, specifically SVM and RT, to compute the energy dissipation over a rectangular stepped spillway. The proposed models were developed using high-quality laboratory datasets comprising dimensionless geometric and flow characteristics for rectangular stepped spillways with diverse baffle configurations and channel slopes (*θ* = 0° and 1°). The models depend on a range of factors, including the width and height of the spillway, the position of the baffle, the upstream and downstream heads, the channel slope, and the Froude number. Energy dissipation is influenced by the flow rate at different channel slopes and baffle configurations. The experiments showed that the configuration with five baffles gave higher energy dissipation at a channel inclination of 0°. The energy loss depends on the relative height of the spillway, the upstream and downstream heads, and the Froude number, and consistent variations were observed across channel slopes and baffle arrangements. The relationship between the non-dimensional parameters of the rectangular stepped spillway and the relative energy dissipation was examined and found to be non-linear. The developed models outperform traditional approaches such as Nasralla (2021) across the datasets, as shown by their *R*^{2}, MSE, RMSE, MAE, MAPE, RSR, AIC and NSE values. Both SVM and RT estimated the energy dissipated over the rectangular stepped spillway with satisfactory accuracy according to these assessment criteria. The SVM model performed best, with the highest *R*^{2} and NSE values and the lowest MSE, RMSE, MAE, MAPE, RSR, and AIC values. A limitation of the current investigation is that it applies only to forecasting the relative energy dissipation over rectangular stepped spillways. Future research should focus on estimating the energy dissipation over other shapes of stepped spillway, such as circular or trapezoidal, while including the inclination angle of the steps among the non-dimensional parameters. This can be pursued with approaches such as gene expression programming, and should also include the evaluation of baffle configurations.

## ACKNOWLEDGEMENTS

The authors express their gratitude for the assistance provided by the Department of Civil Engineering at Delhi Technological University, located in Delhi, India.

## DISCLOSURE STATEMENT

The authors reported no potential conflicts of interest.

## FUNDING

Not applicable.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.