The present study represents the first use of kernel-based models to predict the discharge coefficient (Cd) of two distinct types of cylindrical weirs, one with vertical support and one with a 30-degree upstream ramp. Kernel-based methods, including support vector machine (SVM), Gaussian process regression (GPR), kernel extreme learning machine (KELM), and kernel ridge regression (KRR), were used because they offer notable advantages over other machine learning models, such as flexibility in handling various data patterns, robustness against overfitting, and effectiveness in high-dimensional data scenarios. The results indicated that the GPR model, with R = 0.967, Nash–Sutcliffe efficiency (NSE) = 0.935, and root-mean-square error (RMSE) = 0.027, was the most accurate in modeling the overall dataset collected from the two weir types. A sensitivity analysis identified the upstream Froude number as pivotal in accurately predicting the Cd of a cylindrical weir. Separate modeling of the two weir types revealed that Cd is predicted more accurately for the cylindrical weir with vertical support (R = 0.997, NSE = 0.994, and RMSE = 0.007). The findings indicate that introducing the upstream ramp alters hydraulic conditions, resulting in reduced modeling accuracy (R = 0.760, NSE = 0.529, and RMSE = 0.060).

  • Various kernel-based approaches, including support vector machine, Gaussian process regression (GPR), Kernel extreme learning machine, and Kernel ridge regression, were employed and discussed in detail.

  • Two types of cylindrical weirs, including a cylindrical weir with vertical support and a cylindrical weir with a 30-degree upstream ramp, were investigated.

  • GPR exhibited superior prediction performance compared to other employed kernel-based models.

In recent years, the investigation of weirs, recognized as among the most extensively employed hydraulic structures, has garnered considerable attention from researchers. These highly important hydraulic structures are integral components in a variety of water projects, encompassing hydropower initiatives, irrigation, and drainage systems. Their significance lies in their substantial influence on the overall performance of water systems. Cylindrical weirs, among the diverse categories of weirs, have been utilized since the late 19th and early 20th centuries, predating the introduction of ogee weirs. This preference arose from their stable overflow pattern, facilitation of debris passage, simplified design in comparison to ogee crest designs, and the ensuing cost-effectiveness associated with their implementation. In light of the historical prevalence of cylindrical weirs, extensive studies have been undertaken to systematically classify the hydraulic specifications of cylindrical weirs. Koch et al. (1926) pioneered the first experimental study aimed at investigating the total streamwise force. Ever since, there has been extensive research exploring the impact of a diverse array of flow parameters on the hydraulic characteristics of cylindrical weirs (Rehbock 1929; Jaeger 1956; Escande & Sananes 1959; Matthew 1962; Sarginson 1972; Ramamurthy et al. 1994; Chanson & Montes 1998). In subsequent studies, Heidarpour & Chamani (2006) introduced a mathematical approach for predicting both the velocity distribution over the crest of a cylindrical weir and the discharge coefficient. Their method utilized potential flow past a cylindrical obstacle. The findings indicated that the estimated discharge coefficient closely aligned with experimental values, exhibiting a negligible deviation of within ±5%. Kabiri-Samani & Bagheri (2014) focused on circular-crested weirs, employing a combined potential flow and free vortex flow approach. 
Results showed that the proposed semi-analytical model accurately predicts discharge coefficients and velocity distributions, with sensitivity analysis highlighting a key influencing parameter. Haghiabi et al. (2018) introduced an analytical approach, combining uniform potential flow with doublets, to investigate flow properties over cylindrical weirs. Experimental validation showed a polynomial variation in the discharge coefficient, demonstrating good agreement with measured data. Shamsi et al. (2022) explored the discharge coefficient of cylindrical weirs and its impact on flow energy dissipation. Their findings identified the optimal discharge coefficient for economic design, approximately 1.3, occurring within the head-to-diameter ratio range of 0.5–0.7. In their recent study, Afaridegan et al. (2023) evaluated and enhanced semi-cylindrical weirs through the incorporation of downstream ramps, resulting in modified semi-cylindrical weirs (MSCWs). The experimentation, involving 12 variations tested both in laboratory settings and through numerical analyses, revealed a substantial influence of slopes on downstream hydraulic characteristics. The research findings demonstrated that the discharge coefficient of MSCWs is contingent on the radii with a negative correlation, remaining independent of the slope. In addition, the study revealed that the reduction of the slope effectively regulates negative pressure on the crest surface. In conjunction with laboratory investigations, an extensive body of research highlights the efficacy of numerical methods, particularly computational fluid dynamics (CFD), in the modeling of hydraulic characteristics associated with cylindrical weirs. Multiple studies have substantiated the successful application of CFD for accurate simulation and analysis of fluid flow patterns, velocity distributions, and pressure profiles surrounding cylindrical weirs (Gholami et al. 2014; Yuce et al. 2015; AL-Dabbagh et al. 2023).

In response to challenges inherent in conventional approaches, characterized by a multitude of influencing parameters, their interdependencies, numerous assumptions, solution complexity, and heightened uncertainty (Hüllermeier & Waegeman 2021), machine learning methods have gained widespread adoption in recent years for solving hydraulic (Nouri et al. 2020; Seyedzadeh et al. 2020; Roushangar et al. 2023a) and hydrological problems (Abed et al. 2023; Corzo Perez & Solomatine 2024). Furthermore, numerous hybrid models have been developed to optimize the hyperparameters of machine learning techniques, aiming to achieve optimal performance in modeling different hydraulic (Roushangar & Shahnazi 2019; Deng et al. 2023) and hydrological (Adnan et al. 2021, 2022, 2023a, 2023b; Mostafa et al. 2023) parameters. In the case of cylindrical weir, Parsaie et al. (2018) employed the group method of data handling (GMDH) in conjunction with particle swarm optimization (PSO) to predict the discharge coefficient of a cylindrical weir-gate. Evaluation against multi-layer perceptron neural network (MLPNN) and support vector machine (SVM) models demonstrated their effectiveness, with SVM exhibiting a marginally higher accuracy. The GMDH model highlighted the upstream Froude number and the ratio of the gate's opening height to the diameter of the cylindrical weir-gate as the pivotal parameters influencing the discharge coefficient. In a separate investigation, Parsaie et al. (2017) demonstrated the superior capability of adaptive neuro fuzzy inference systems (ANFIS) in comparison to the MLPNN model for modeling the discharge coefficient of a cylindrical weir-gate. Ismael et al. (2021) investigated the prediction of discharge coefficients for an oblique cylindrical weir, considering three diameters and three inclination angles. The discharge coefficient values from 56 experiments were estimated using the radial basis function network (RBFN). 
The study compared RBFN's performance with that of the back-propagation neural network (BPNN) and cascade-forward neural network (CFNN). The findings revealed that RBFN exhibited superior performance compared to the other neural networks, namely, BPNN and CFNN. In a recent study, Li et al. (2024) employed the SVM model along with three optimization algorithms to construct a discharge coefficient prediction model for the semi-circular side weir. Utilizing Sobol's method, sensitivity coefficients for dimensionless parameters to the discharge coefficient were calculated. The research underscores that the SVM coupled with genetic algorithm (GA-SVM) exhibits high prediction accuracy and generalization ability. Notably, their findings highlight the substantial impact of the ratio of the flow depth at the upstream weir crest point to the diameter on the discharge coefficient. Furthermore, in the research carried out by Nourani et al. (2023), the superior modeling capability of the GA-SVM model in predicting the discharge coefficient of circular-crested oblique weirs, compared to both multiple linear regression (MLR) and multiple nonlinear regression (MNLR), was demonstrated.

In recent years, there has been a growing trend toward utilizing kernel-based methods for predicting the discharge coefficient of various weir types (Roushangar et al. 2023c; Seyedian et al. 2023; Majedi-Asl et al. 2024; Wan et al. 2024). This rise in popularity can be attributed to the remarkable flexibility of these methods. By selecting different kernel functions, they allow for the modeling of diverse data structures and relationships without the need for explicitly defined transformations (Lee & Liu 2013). The incorporation of hyperparameters in kernel-based methods plays a crucial role in controlling model complexity and mitigating the risk of overfitting, particularly in high-dimensional spaces (Roushangar & Shahnazi 2020). Moreover, numerous kernel-based methods, including SVM, are formulated as convex optimization problems. This guarantees the existence of a global optimum and avoids issues related to local minima that commonly affect other approaches, such as neural networks (Sahraei et al. 2018).

Despite the growing application of various machine learning models for predicting the discharge coefficient of cylindrical weirs, a noticeable gap exists in the literature concerning the application and evaluation of kernel-based methods in predicting the discharge coefficient of different types of cylindrical weirs. Therefore, in the present study, an assessment has been carried out to evaluate the robustness of various kernel-based approaches in modeling the discharge coefficient of cylindrical weirs. To achieve this objective, for the first time, data gathered from two distinct types of cylindrical spillways – one with vertical support and the other featuring a 30-degree upstream ramp – were utilized to pursue the following goals:

  • A comprehensive analysis was conducted to assess the efficacy of various kernel-based methods in predicting the discharge coefficient of cylindrical weirs and determining the optimal model.

  • An investigation was carried out on the impact of various input parameters to identify the most influential factor in predicting the discharge coefficient of cylindrical weirs.

  • The effect of adding an upstream ramp to cylindrical spillways on the performance of kernel-based methods for predicting the discharge coefficient was examined.

  • Standalone kernel-based models were used to thoroughly and systematically evaluate their performance across various hyperparameter values, aiming to select optimal models based on minimal complexity.

Experimental model

The data from the experimental study of Chanson & Montes (1997) serve as the basis for predicting the discharge coefficient of cylindrical weirs. The experiment was conducted in a rectangular channel 12 m long and 0.301 m wide, with a sidewall height of 0.5 m. The channel was horizontal, and both the bottom and sidewalls were made of perspex panels, each 2 m long. The inlet comprised a smoothly contoured three-dimensional convergent section with an elliptical shape. The cylindrical weirs were positioned 8 m downstream of the channel entrance. During the experimental trials, water discharge was measured using a 90-degree V-notch weir. The measurement error is expected to be below 5%. Flow depths were measured with a pointer gauge accurate to 0.2 mm. Numerous sets of experiments were conducted across the four flumes as detailed in Table 1. These experiments covered a wide range of flow rates and varied inflow conditions, as outlined in Table 2.

Table 1

Properties of cylindrical weir

| Cylinder No. | Reference radius R^a (m) | Remarks |
|---|---|---|
| A | 0.07905 | Cylinder made of PVC pipe |
| B | 0.0671 | |
| C | 0.05704 | |
| D | 0.0290 | |

^a Curvature radius at crest.

Table 2

Overview of experimental flow conditions

| Geometry | Slope α (deg) | qw (m²/s) | d1 (m) | D (m) | Inflow conditions | Remarks |
|---|---|---|---|---|---|---|
| Series T1A and T1B | | 0.011–0.074 | 0.193–0.362 | 0.154, 0.204, 0.254 | P/D and F/D | Cylinder No. A |
| | | 0.008–0.071 | 0.183–0.352 | | | Cylinder No. B |
| | | 0.011–0.076 | 0.194–0.359 | | | Cylinder No. C |
| | | 0.001–0.072 | 0.181–0.359 | | | Cylinder No. D |
| Series T1C | | 0.003–0.073 | 0.173–0.3565 | 0.154, 0.204, 0.254 | Ramp | Cylinder No. A |
| | | 0.006–0.072 | 0.183–0.3545 | | | Cylinder No. B |
| | | 0.005–0.075 | 0.176–0.3535 | | | Cylinder No. C |
| | | 0.005–0.073 | 0.185–0.3495 | | | Cylinder No. D |

Note: qw is the water discharge per unit width; d1 is the flow depth (m) upstream of the weir; D is weir height; P/D and F/D represent mean partially developed and fully developed inflow conditions; and Ramp denotes the upstream ramp (30 degrees).

Experimental series T1A and T1B examined the discharge characteristics of cylindrical weirs within a horizontal channel for various support heights. These two series were conducted independently and covered both partially developed and fully developed inflow conditions. Series T1C extended these experiments by introducing a 30-degree upstream ramp. Notably, experiments in series T1C were conducted concurrently with those in series T1B. A dataset comprising 572 data points was utilized, with 196 data points from series T1A, another 196 from series T1B, and the remaining 180 from series T1C. Figure 1 illustrates a schematic representation of the geometric and hydraulic parameters of a cylindrical weir.
Figure 1

Schematic view of the developed cylindrical weirs.


Dimensional analysis

The discharge capacity through a cylindrical weir under the assumption of free flow is formulated as follows:
(1)
(2)
(3)
where Cd is the discharge coefficient of cylindrical weir, g is the gravity constant, Hw is upstream total head above crest (m), H1 is the total head upstream of the weir (m), and explanations for other parameters have been previously provided. A crucial step in the modeling process is the introduction of an optimal input matrix to ensure accurate prediction performance of the employed kernel-based methods. The flow motion through a cylindrical weir can be described by a combination of hydraulic and geometrical parameters, as well as the following functional relationship (Chanson & Montes 1997):
(4)
Dimensional analysis can reduce the dimensions of the input matrix, thereby creating a lower-dimensional space with fewer studied parameters. In addition, the use of dimensionless groupings of variables allows for the further application of experimental results (Roushangar et al. 2023a). Given the independent variables R, g, and μ, the Π-theorem yields a set of four dimensionless groups as follows:
(5)
(6)
(7)
(8)
where Frup represents the upstream Froude number (Chanson & Montes 1997). Upon rearranging the dimensionless groups, the following expression is obtained:
(9)
where Reup represents the upstream Reynolds number. Consequently, the discharge coefficient for a cylindrical weir can be determined through the following expression:
(10)
Researchers consistently strive to maintain turbulent flow conditions in their investigations, thereby mitigating the influence of viscosity (Reynolds number: Reup) and adhering to a minimum depth. With the exception of extremely low values of water head measured over the weir, the Reynolds number is observed to have negligible effects. Moreover, for practical considerations, it is also suggested to use the downstream flow depth instead of the upstream total head above the crest (Shamsi et al. 2022). Ultimately, with the implementation of these modifications, Equation (11) is presented as follows:
(11)

Data presentation

In the present study, a total of 572 data points were collected and partitioned into two subsets. Specifically, 429 data points (75% of the dataset) were designated as the training set, while the remaining 143 data points (25%) were used as the testing set. The ranges of the nondimensional parameters used to predict the discharge coefficient of a cylindrical weir are presented in Table 3 and Figure 2. Moreover, Figure 3 depicts the correlation matrix, which shows that the upstream Froude number is significantly correlated with the Cd value.
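The 75/25 partition described above can be sketched as follows. This is a minimal illustration only: it assumes a random shuffle with a fixed seed, since the text does not state how samples were assigned to each subset, and the function name `train_test_split_indices` is hypothetical.

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.75, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into train/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# 572 data points -> 429 for training and 143 for testing.
train_idx, test_idx = train_test_split_indices(572)
print(len(train_idx), len(test_idx))
```

The two index lists are disjoint and together cover all 572 samples, so no data point appears in both the training and testing stages.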
Table 3

The statistical characteristics of employed nondimensional parameters

| | Cd | Frup | d1/R | L/R |
|---|---|---|---|---|
| Min | 0.65 | 0.61 | 2.18 | 3.80 |
| Max | 1.6 | 305.94 | 12.05 | 10.37 |
| Mean | 1.19 | 60.39 | 5.73 | 6.42 |
| Std. | 0.096 | 79.63 | 2.53 | 2.76 |
Figure 2

Histogram of employed nondimensional parameters.

Figure 3

The correlation matrix among all parameters.


Machine learning models

Support vector machine

In the present study, kernel-based models were used for prediction because they address several key issues found in conventional methods such as artificial neural networks (ANNs), including slow convergence, low generalization capability, absence of probabilistic outputs, susceptibility to overfitting, and the tendency to converge to local minima. SVMs are among the most popular kernel-based models and are effective in both classification and regression tasks. Their strength in regression is most apparent for nonlinear relationships: where the goal is to predict a continuous outcome, a nonlinear model often yields superior accuracy. SVM achieves this by first mapping the input data (denoted as x) onto an m-dimensional feature space through a fixed, nonlinear mapping function. Subsequently, a linear model is established within this transformed feature space (Vapnik 1995). Kernel functions let SVM perform this nonlinear transformation implicitly, so it can capture intricate patterns in data whose underlying structure is not strictly linear. The nonlinear model of SVM is expressed as follows:
$$f(x) = \omega^{T}\varphi(x) + b \tag{12}$$
where f(x) represents the model output, ω is the weight vector, φ(x) denotes the nonlinear mapping function into the feature space, and b is the bias term. Flatness of Equation (12) corresponds to a small ω, which is achieved by minimizing the Euclidean norm ‖ω‖ (Smola 1996). The constrained form of the SVM regression function is expressed as follows:
(13)
where ξi and ξi* are slack variables, and the parameter C controls the trade-off between the approximation error and the weight vector norm by penalizing errors exceeding ε. Utilizing Lagrangian multipliers on Equation (13) yields the dual Lagrangian form as follows:
(14)
Subject to the constraints as:
(15)

In the given context, ai and ai* represent the Lagrangian multipliers, and K(x, xi) denotes the kernel function. In this study, a 10-fold cross-validation framework with grid search was employed to determine the optimal values of the SVM hyperparameters. The 10-fold cross-validation algorithm partitions the entire dataset into 10 folds, each a randomly drawn and disjoint subsample. SVM analysis is then conducted on the observations in nine of the folds, which constitute the cross-validation training sample, and the fitted model is applied to the remaining fold, which serves as the testing sample. Because this fold was not used during model fitting, the sum of squared errors computed on it indicates how well the SVM model predicts unseen observations. The procedure is repeated so that each fold serves once as the testing sample, and the results of the 10 replications are averaged to give an overall model error measure that reflects the stability of the model and its validity in predicting unseen data. The grid search systematically tuned the regularization parameter (C) within the range of 1–20 and the epsilon-insensitive loss function parameter (ε) within the range of 0–1. The optimization also considered the kernel parameter, so that C and ε were optimized simultaneously for each specified kernel parameter value.
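The cross-validated grid search can be sketched in plain Python as shown below. This is only an illustration under stated assumptions: the grid resolution (integer C from 1 to 20, ε in steps of 0.1) is assumed because the text gives only the ranges, the toy data are invented, and `mean_predictor` is a hypothetical placeholder that ignores both hyperparameters and stands in for an actual SVM regression fit.

```python
import itertools
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly partition indices 0..n_samples-1 into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_error(fit_predict, X, y, params, k=10, seed=0):
    """Average, over the k held-out folds, of the sum of squared errors."""
    fold_errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        preds = fit_predict(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in held_out], **params)
        fold_errors.append(
            sum((p - y[i]) ** 2 for p, i in zip(preds, held_out)))
    return sum(fold_errors) / k


def mean_predictor(X_train, y_train, X_test, C, epsilon):
    """Placeholder model (ignores C and epsilon): predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_test)


# Assumed grid resolution: C = 1, 2, ..., 20 and epsilon = 0.0, 0.1, ..., 1.0.
grid = [{"C": c, "epsilon": e / 10}
        for c, e in itertools.product(range(1, 21), range(0, 11))]

# Invented toy data standing in for the 572 samples.
X = [[i / 100] for i in range(572)]
y = [1.0 + 0.001 * i for i in range(572)]

scores = [(cross_val_error(mean_predictor, X, y, p), p) for p in grid]
best_error, best_params = min(scores, key=lambda s: s[0])
print(len(grid), best_params)
```

Replacing `mean_predictor` with a function that fits an SVM regressor for the given C and ε, and adding the kernel parameter as a third grid dimension, would give the procedure described in the text.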

Gaussian process regression

The Gaussian process regression (GPR) model has emerged as a powerful statistical tool in data-driven modeling. Its increasing popularity is attributed to its nonparametric nature, which makes it applicable to a wide range of classification and regression problems. GPR leverages the principles of Gaussian processes, providing a flexible and adaptive approach to capturing complex relationships within data. The core idea of GPR is to place a prior directly over the space of functions. By combining this prior with the observed data, the GPR technique yields a posterior distribution over functions (Williams & Rasmussen 1995). The Gaussian process, denoted as f(x), is characterized by its mean function m(x) and kernel function, defined as follows:
$$m(x) = \mathbb{E}[f(x)] \tag{16}$$
$$K(x, x') = \mathbb{E}\left[(f(x) - m(x))(f(x') - m(x'))\right] \tag{17}$$
where the kernel function, denoted as K(x, x′), is computed at points x and x′. The Gaussian process f(x) is then expressed as follows:
$$f(x) \sim \mathcal{GP}\left(m(x), K(x, x')\right) \tag{18}$$
For simplification, it is common practice to set the mean function to zero. Within the Gaussian process framework, the association between the input vector and the target is expressed as follows:
$$y = f(x) + \varepsilon, \quad \varepsilon \sim \mathcal{N}(0, \sigma^{2}) \tag{19}$$

In this context, f(x) represents the desired regression function, ε denotes Gaussian noise with zero mean, and σ² stands for its variance. Critical hyperparameters in the optimization process of the GPR encompass those linked to the kernel function. Furthermore, refining training hyperparameters, including the noise parameter, is pivotal for enhancing the overall performance of the GPR model.
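For reference, with a zero mean function and the Gaussian noise model of Equation (19), the standard GPR predictive mean and variance at a new input x_* can be written as below. The text does not write these expressions out, so the notation is an assumption introduced here: K is the n × n kernel matrix over the training inputs, k_* is the vector of kernel values between x_* and the training inputs, y is the vector of training targets, and I is the identity matrix.

```latex
\bar{f}_{*} = \mathbf{k}_{*}^{T}\left(\mathbf{K} + \sigma^{2}\mathbf{I}\right)^{-1}\mathbf{y},
\qquad
\operatorname{var}(f_{*}) = K(x_{*}, x_{*}) - \mathbf{k}_{*}^{T}\left(\mathbf{K} + \sigma^{2}\mathbf{I}\right)^{-1}\mathbf{k}_{*}
```

The predictive variance is what allows GPR to attach an uncertainty band to each predicted value, in addition to the point estimate given by the predictive mean.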

Kernel-based extreme learning machine

The extreme learning machine (ELM), introduced by Huang et al. (2006), is a machine learning model built on a feed-forward neural network architecture. One of its primary features is the random generation of key parameters, including the connection weights between the input and hidden layers. Consider a set of N arbitrary samples (xi, yi), where xi = [xi1, xi2, …, xin]^T ∈ R^n and yi ∈ R. The output of the ELM is defined as follows:
(20)
where the variable N denotes the quantity of hidden neurons, the vector βi = [β1, β2,…, βN] represents the output weights that establish the connection between the hidden nodes and the output nodes, the weight vector wi = [w1i, w2i, …, wNi] signifies the connections between the hidden nodes and the input nodes, the vector bi = [b1, b2, …, bn] denotes the thresholds associated with the hidden nodes and the function h(x) represents a forward mapping of the hidden nodes in the context under consideration. Given that the input weight w and the threshold b of the hidden layer are randomly determined, the objective of network training is to identify the optimal output weight β. This can be achieved through the application of the least squares method:
(21)
where H+ denotes the Moore–Penrose generalized inverse of the output matrix from the hidden layer, denoted as H (Huang et al. 2006).
To mitigate the inherent randomness in ELM, enhance its generalization capability, and bolster its stability, Huang et al. (2011) expanded the scope of ELM through the integration of kernel learning, introducing the concept of kernel-based ELM. Utilizing the orthogonal projection method and ridge regression theory, the computation of the output weight β involves the addition of a positive constant 1/C, as follows:
(22)
Thus, the output function of the proposed ELM model is determined as follows:
(23)
The kernel matrix for the ELM can serve as a substitute for h(x). Subsequently, the output function of the kernel extreme learning machine (KELM) can be formulated as follows:
(24)
where K(x, xi) represents the kernel function. For the proficient application of the KELM model in regression problems, it is imperative to optimize both the constant C and the parameter of the kernel function.
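As an illustration of the KELM output function, the sketch below assumes the form commonly given following Huang et al. (2011), f(x) = [K(x, x1), …, K(x, xN)] (I/C + Ω)⁻¹ T, where Ω is the kernel matrix over the training inputs and T is the vector of training targets. The function names, the RBF kernel scaling by 2σ², the toy data, and the hyperparameter values are all assumptions made for this example, not details taken from the study.

```python
import math


def rbf_kernel(a, b, sigma):
    """RBF kernel, assuming the exp(-||a - b||^2 / (2 sigma^2)) form."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def kelm_fit(X, T, C, sigma):
    """Return alpha = (I/C + Omega)^-1 T for the training data."""
    n = len(X)
    A = [[rbf_kernel(X[i], X[j], sigma) + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(A, T)


def kelm_predict(X_train, alpha, x_new, sigma):
    """Evaluate f(x_new) = sum_i alpha_i K(x_new, x_i)."""
    return sum(a * rbf_kernel(x_new, xi, sigma)
               for a, xi in zip(alpha, X_train))


# Invented one-dimensional toy data.
X_train = [[0.0], [1.0], [2.0], [3.0]]
T = [1.0, 1.2, 1.1, 0.9]
alpha = kelm_fit(X_train, T, C=1e6, sigma=1.0)
fitted = [kelm_predict(X_train, alpha, x, sigma=1.0) for x in X_train]
print(fitted)
```

With a very large C the regularization term 1/C is negligible and the model nearly interpolates the training targets; smaller values of C smooth the fit, which is why C and the kernel parameter must be tuned together.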

Kernel ridge regression

Kernel ridge regression (KRR) emerges from the integration of kernel methods and ridge regression techniques (Zhang et al. 2013). The primary advantage of the KRR model lies in its utilization of tailored criteria and kernel methodologies to capture nonlinear relationships, effectively addressing concerns related to overfitting in regression problems. The mathematical formulation of KRR is expressed as follows:
(25)
(26)
In the aforementioned equations, stands as the Hilbert normed space, the υ represents dependent weights associated with the data, and function Φ is denoted as the kernel function corresponding to x in the context of predictors. The variables h and y represent, respectively, the response parameter and the regression function. The symbol δ represents the regularization parameter, typically set to a fixed value greater than zero. For a kernel matrix (K) with dimensions of (n × n), Equation (25) can be reformulated as follows:
(27)
During the training phase, KRR determines the value of υ through obtaining a solution for Equation (27). The algorithm seeks to identify optimal values for υ and δ by optimizing a predefined set of variables to achieve the highest performance (Ali et al. 2020). The derived υ is subsequently applied in the testing phase as outlined below:
(28)

Like other kernel-based methods, the successful implementation of the KRR model depends on optimizing both the regularization parameter (δ) and the kernel parameter.
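For orientation, the standard closed-form KRR solution is sketched below. It is not taken from the equations above, whose exact notation may differ: here t denotes the vector of observed training targets (a symbol introduced for this sketch), K is the n × n kernel matrix, I is the identity matrix, and υ and δ are used as in the text.

```latex
\boldsymbol{\upsilon} = \left(\mathbf{K} + \delta\,\mathbf{I}\right)^{-1}\mathbf{t},
\qquad
\hat{y}(x) = \sum_{i=1}^{n} \upsilon_{i}\, K(x, x_{i})
```

Under this form, a larger δ shrinks the weights υ and yields a smoother regression function, while δ close to zero lets the model follow the training data more closely.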

Kernel function

The significance of the kernel function is pivotal in driving the performance of kernel-based models. Its primary importance lies in the inherent ability of the kernel to implicitly map input data into high-dimensional feature spaces, thereby facilitating the identification of intricate patterns and relationships that might pose challenges in the original input space (Roushangar et al. 2023b). The appropriate selection and fine-tuning of the kernel function exert a substantial influence on the capability of the model to capture nonlinear dependencies, resulting in enhanced prediction accuracy and generalization across diverse datasets. Functioning as a critical bridge, the kernel enables these models to excel in tasks characterized by diverse and complex data structures. The radial basis function (RBF) stands as a widely utilized kernel function with extensive applications across various domains, notably in regression and classification. Characterized by its capacity to capture intricate relationships within high-dimensional spaces, the RBF proves especially well-suited for applications involving kernel-based methods and tasks requiring interpolation. Its versatility and effectiveness in modeling complex relationships contribute to its prevalence and significance in diverse fields. In the present study, the RBF kernel function, renowned for its successful applications as reported in relevant studies (Komasi et al. 2018; Najafzadeh & Oliveto 2020; Amininia & Saghebian 2021; Roushangar et al. 2021), was strategically incorporated into the central core of the employed kernel-based models. The mathematical form of RBF kernel function is presented as follows:
(29)
where x and x′ are two input instances and σ is the RBF kernel parameter.
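A minimal sketch of the RBF kernel follows. It assumes the common form K(x, x′) = exp(−‖x − x′‖² / (2σ²)); the scaling by 2σ² and the sample input vectors are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, x_prime, sigma):
    """RBF kernel value for two input vectors (assumed 2*sigma^2 scaling)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_matrix(X, sigma):
    """Symmetric matrix of pairwise kernel values for the rows of X."""
    return [[rbf_kernel(xi, xj, sigma) for xj in X] for xi in X]


# Invented three-component input vectors.
X = [[0.6, 2.2, 3.8], [1.0, 5.7, 6.4], [3.0, 12.0, 10.4]]
K = kernel_matrix(X, sigma=2.0)

# A larger sigma makes distant points look more similar.
narrow = rbf_kernel(X[0], X[1], sigma=1.0)
wide = rbf_kernel(X[0], X[1], sigma=5.0)
print(narrow < wide)
```

Every diagonal entry of the kernel matrix equals 1 because each point has zero distance to itself, and σ controls how quickly similarity decays with distance, which is why it is treated as a hyperparameter to be tuned.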

Performance criteria

The performance of the employed kernel-based models was evaluated using three widely used performance indices. The correlation coefficient (R) measures the linear agreement between observed and predicted values and ranges from −1.0 to 1.0. An R value of −1.0 signifies a perfect negative correlation between observed and predicted values, an R value of 1.0 indicates a perfect positive correlation, and an R value of 0.0 denotes the absence of any linear relationship. Nash–Sutcliffe efficiency (NSE) is a metric for calibrating and evaluating hydrological models against observed data. NSE values range from −∞ to 1, where an NSE of 1 signifies a perfect match between modeled and observed data and an NSE of 0 indicates that the model's predictions are no better than the mean of the observed data (Nash & Sutcliffe 1970). Root-mean-square error (RMSE) is the square root of the mean-square error (MSE), that is, the square root of the average squared difference between predicted and observed values, and is therefore expressed in the same units as the predicted values. The RMSE ranges from 0 to ∞, where an RMSE of 0 indicates a perfect model whose predictions exactly match the observed values (Kharb et al. 2021). In addition, over- and underprediction by the employed kernel-based models was quantified using the discrepancy ratio (DR). A DR value of 1 signifies optimal performance of the kernel-based technique, a value greater than 1 indicates overprediction, and a value less than 1 indicates underestimation:
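$$R = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{N}(X_i-\bar{X})^2\,\sum_{i=1}^{N}(Y_i-\bar{Y})^2}} \tag{30}$$
$$\mathrm{NSE} = 1-\frac{\sum_{i=1}^{N}(X_i-Y_i)^2}{\sum_{i=1}^{N}(X_i-\bar{X})^2} \tag{31}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i-Y_i)^2} \tag{32}$$
$$\mathrm{DR} = \frac{1}{N}\sum_{i=1}^{N}\frac{Y_i}{X_i} \tag{33}$$
In this context, Xi and Yi represent the observed and predicted values of the discharge coefficient, respectively, while N denotes the total number of data points. In addition:
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{N}X_i \tag{34}$$
$$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N}Y_i \tag{35}$$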
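The four indices described above can be sketched in a few lines of Python. The snippet uses the standard definitions of R, NSE, and RMSE; the DR is assumed here to be the mean ratio of predicted to observed values, which is consistent with the interpretation given in the text (1 is ideal, values above 1 indicate overprediction) but is an assumption about the exact form. The function name `performance_indices` and the sample values are illustrative.

```python
import math


def performance_indices(observed, predicted):
    """Return R, NSE, RMSE and DR for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((x - mean_obs) * (y - mean_pred)
              for x, y in zip(observed, predicted))
    ss_obs = sum((x - mean_obs) ** 2 for x in observed)
    ss_pred = sum((y - mean_pred) ** 2 for y in predicted)
    ss_err = sum((x - y) ** 2 for x, y in zip(observed, predicted))
    return {
        "R": cov / math.sqrt(ss_obs * ss_pred),
        "NSE": 1.0 - ss_err / ss_obs,
        "RMSE": math.sqrt(ss_err / n),
        # Assumed form: mean of predicted/observed ratios.
        "DR": sum(y / x for x, y in zip(observed, predicted)) / n,
    }


# Hypothetical discharge coefficients (illustrative values only).
cd_observed = [1.05, 1.12, 1.20, 1.31, 1.38]
cd_predicted = [1.07, 1.10, 1.22, 1.29, 1.40]
print(performance_indices(cd_observed, cd_predicted))
```

Note that R is undefined when either series is constant, so a production implementation would need to guard against a zero denominator.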
The discharge coefficient of the cylindrical weir is predicted from three nondimensional parameters (Frup, Hw/R, L/R). In the predictive modeling process, 75% of the dataset is allocated to the training stage, while the remaining 25% is reserved for the testing stage. The best performance statistics obtained by each kernel-based model during the training and testing stages are presented in Table 4. The statistical indices show that, among the kernel-based techniques, the GPR model achieved the highest accuracy (R = 0.967, NSE = 0.935, and RMSE = 0.027) during the testing stage. As indicated in Table 4, the SVM model attained the second-highest accuracy (R = 0.904, NSE = 0.797, and RMSE = 0.048), while KELM and KRR performed comparably to each other in predicting the discharge coefficient of the cylindrical weir. In terms of DR, the GPR model was the closest to unity with a testing-stage value of 0.999, whereas the SVM predictions showed the greatest underestimation with a DR value of 0.989. The performance of the employed kernel-based models in predicting the discharge coefficient in the testing stage is depicted in Figure 4. As depicted in Figure 4, the 95% prediction bands for the GPR model are the narrowest. This observation signifies the higher accuracy and reliability of the GPR model in predicting the discharge coefficient of the cylindrical weir compared with the other employed kernel-based models. In other words, the reduced width of the prediction interval indicates a higher level of confidence that the observed values will fall within a narrow range around the predicted values.
Table 4

The best statistical performances of the kernel-based models for prediction of discharge coefficient

| Model | Training R | Training NSE | Training RMSE | Training DR | Testing R | Testing NSE | Testing RMSE | Testing DR |
|---|---|---|---|---|---|---|---|---|
| SVM | 0.905 | 0.810 | 0.040 | 0.995 | 0.904 | 0.797 | 0.048 | 0.989 |
| GPR | 0.981 | 0.964 | 0.017 | 1.000 | 0.967 | 0.935 | 0.027 | 0.999 |
| KELM | 0.856 | 0.732 | 0.047 | 1.001 | 0.858 | 0.731 | 0.055 | 0.996 |
| KRR | 0.857 | 0.735 | 0.047 | 1.002 | 0.855 | 0.726 | 0.055 | 0.996 |

Note: Bold values indicate the best results.

Figure 4

The scatterplots of the employed kernel-based models during the testing stage (the red line represents the best fit, and the green lines represent the 95% prediction intervals).

Moreover, Figure 5 presents a comparative evaluation of the kernel-based models using Taylor diagrams. These diagrams graphically depict the degree of similarity between a modeled pattern and the observed data. Specifically, the azimuthal axis of the diagram represents the correlation coefficient, while the radial distance from the origin signifies the standard deviation. In addition, the centered root-mean-square difference (RMSD) is represented as the distance from the reference (observed) point. Points in close proximity to the reference point signify superior model performance. The diagrams illustrate that the GPR model lies closest to the observed point during the testing stage, which suggests that it is the most effective model for predicting the discharge coefficient of a cylindrical weir. This conclusion is drawn from the GPR model's position near the reference point, along the arc with a higher correlation coefficient (toward 1.0) and at a smaller distance from the reference (smaller RMSD). Furthermore, the KELM and KRR models show a similar level of performance in the Taylor diagram, as they are positioned adjacent to each other and farther from the reference point.
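The three quantities shown on a Taylor diagram are linked by a law-of-cosines relation, RMSD² = σ_obs² + σ_pred² − 2·σ_obs·σ_pred·R, which is why a single point can encode all of them. The following sketch computes the statistics and checks that relation numerically; the function name `taylor_statistics` and the sample values are illustrative assumptions, not data from this study.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (std_obs, std_pred, correlation, centered RMSD)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    cov = sum((o - mean_o) * (p - mean_p)
              for o, p in zip(observed, predicted)) / n
    corr = cov / (std_o * std_p)
    # Centered RMSD: differences are taken after removing each mean.
    crmsd = math.sqrt(sum(((p - mean_p) - (o - mean_o)) ** 2
                          for o, p in zip(observed, predicted)) / n)
    return std_o, std_p, corr, crmsd


# Hypothetical observed and predicted values (illustrative only).
obs = [0.90, 1.00, 1.10, 1.20, 1.30]
pred = [0.95, 0.98, 1.12, 1.15, 1.33]
std_o, std_p, corr, crmsd = taylor_statistics(obs, pred)
law_of_cosines = std_o ** 2 + std_p ** 2 - 2.0 * std_o * std_p * corr
print(crmsd ** 2, law_of_cosines)
```

A model that plots close to the reference point therefore combines a high correlation with a standard deviation similar to that of the observations.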
Figure 5

A graphical Taylor diagram illustrating the comparative performance of the kernel-based models employed during the testing stage.

Moreover, the relative error (RE), defined as the ratio of the absolute error to the observed value (RE = absolute error/observed value), was calculated for every data point to provide additional insight into the prediction accuracy and robustness of the employed kernel-based models. It can be observed from Figure 6 that the GPR model limits the maximum RE to approximately 13%. In contrast, the maximum RE of the SVM, KELM, and KRR models reaches approximately 20, 25, and 26%, respectively. This observation underscores that the predictions of SVM, KELM, and KRR are less precise and less reliable than those of GPR.
Figure 6

The relative error analysis of discharge coefficient predictions during the testing stage.

Figure 7 illustrates the error distribution diagram, which shows the distribution of the absolute relative error (ARE) in the testing stage for the employed kernel-based models. ARE is computed as the absolute value of the difference between the observed and predicted values, divided by the observed value (|observed value − predicted value|/observed value). In the error distribution diagram, the y-axis represents cumulative frequency, while the x-axis signifies the threshold ARE. A substantial proportion of the predictions exhibits low ARE across the kernel-based models. Specifically, more than 86, 69, 71, and 60% of the predictions made by the GPR, KELM, KRR, and SVM models, respectively, have an ARE of less than 3%. Furthermore, the diagram shows that nearly 100% of the predictions of the GPR, KELM, KRR, and SVM models have an error below 13, 25, 26, and 19%, respectively. Notably, the GPR model emerges as the most accurate in predicting discharge coefficients, demonstrating lower ARE than the other kernel-based models. The KELM and KRR models exhibit similar error distributions, while the SVM model has the smallest share of predictions with an ARE below 3%.
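The cumulative-frequency reading described above (the share of predictions whose ARE falls below a given threshold) can be sketched as follows. The function names and the sample values are illustrative assumptions; the ARE definition follows the text.

```python
def absolute_relative_errors(observed, predicted):
    """ARE in percent: |observed - predicted| / |observed| * 100."""
    return [abs(o - p) / abs(o) * 100.0 for o, p in zip(observed, predicted)]


def cumulative_frequency(are_percent, threshold):
    """Percentage of predictions whose ARE is strictly below threshold."""
    below = sum(1 for e in are_percent if e < threshold)
    return 100.0 * below / len(are_percent)


# Hypothetical observed and predicted values (illustrative only).
observed = [1.00, 1.00, 1.00, 1.00]
predicted = [1.01, 0.98, 1.05, 1.20]
are = absolute_relative_errors(observed, predicted)
for threshold in (3.0, 13.0, 25.0):
    print(threshold, cumulative_frequency(are, threshold))
```

Evaluating `cumulative_frequency` over a range of thresholds and plotting the result gives a curve of the kind shown in an error distribution diagram.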
Figure 7

The cumulative frequency of absolute relative error (%) for employed kernel-based models.

A high regularization parameter in KRR modeling penalizes complexity, potentially leading to underfitting. This bias toward simplicity, central to the bias-variance trade-off, helps prevent overfitting but may increase bias, hindering the model's ability to capture the data's underlying complexity. Excessive regularization may also make the model less sensitive to noisy data, at the risk of losing important information. Therefore, it is crucial to achieve a balance when selecting the value of the regularization parameter. Figure 8 depicts the performance of the KRR model across various regularization parameters and RBF kernel parameters. Optimal results (R = 0.855, NSE = 0.726, and RMSE = 0.055) are obtained when the regularization parameter is set to 0.001 and the RBF kernel parameter is 33. The figure illustrates a notable decline in modeling accuracy as the regularization parameter increases, particularly evident for δ values of 10. Hence, the optimal value for the regularization parameter in the KRR model was determined to be 0.001, and this value was employed in the subsequent stages of the modeling process.
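The effect of the regularization parameter can be illustrated with a small, dependency-free sketch of kernel ridge regression, in which the dual weights solve (K + λI)α = y. Everything here is an assumption made for illustration: the kernel form exp(−γ‖x − x′‖²), the names `lam` and `gamma`, the synthetic sine data, and the helper functions are not taken from this study, and the fit is evaluated on the training points only.

```python
import math


def rbf(x, z, gamma):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def krr_fit(X, y, lam, gamma):
    """Dual weights alpha from (K + lam * I) alpha = y."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)


def krr_predict(X_train, alpha, X_new, gamma):
    return [sum(a * rbf(x, xt, gamma) for a, xt in zip(alpha, X_train))
            for x in X_new]


def training_rmse(X, y, lam, gamma):
    alpha = krr_fit(X, y, lam, gamma)
    y_hat = krr_predict(X, alpha, X, gamma)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


# Synthetic one-dimensional data (illustrative only).
X = [[i / 10.0] for i in range(11)]
y = [math.sin(2.0 * math.pi * x[0]) for x in X]
for lam in (0.001, 0.1, 10.0):
    print(lam, training_rmse(X, y, lam, gamma=10.0))
```

In this sketch the training error grows as `lam` increases, because the fitted function is shrunk toward zero. In practice the parameter would be selected on held-out data rather than on the training fit.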
Figure 8

Performance of the KRR model across a range of regularization parameters and RBF kernel parameters.

Figure 9 depicts the performance of the GPR model across various noise parameters and RBF kernel parameters. The GPR model, with RBF kernel parameter set to 19 and a noise value of 0.001, exhibits superior performance according to the obtained results. Evaluating GPR efficiency through the RMSE statistical index reveals more pronounced fluctuations with changes in noise compared to the KRR model's response to regularization parameter variations. This issue is particularly noticeable at lower RBF kernel function parameter values. However, increasing the RBF kernel parameter mitigates GPR fluctuations across different noise values. Similar to the KRR model, GPR experiences reduced modeling accuracy, particularly with higher noise values such as 10. Consequently, based on the modeling outcomes, a noise value of 0.001 is identified as the optimal choice for the GPR model, applied in subsequent modeling stages.
Figure 9

Performance of the GPR model across a range of noises and RBF kernel parameters.

Figure 10 demonstrates the performance of the KELM across an extensive range of the RBF kernel parameters, as well as various regularization parameters. Observably, in contrast to other employed kernel-based methods, the KELM model exhibits consistent behavior across diverse values of regularization parameters. The stability of performance across different regularization values implies a reliable behavior of the KELM model in predicting the discharge coefficient of cylindrical weir.
Figure 10

Performance of the KELM model across a range of regularization parameters and RBF kernel parameters.

Figure 11 illustrates the fluctuations in the performance of the employed kernel-based models, in terms of the RMSE statistical index, across various RBF kernel parameter values. In general, the RBF kernel parameter strongly shapes the behavior of kernel-based models. Accurate tuning of this parameter is crucial for striking a balance between capturing local patterns and ensuring robust generalization to new data, and it therefore has a pivotal influence on the predictive performance of the model. Based on the obtained results, the SVM model attains its optimal performance at the lowest RBF kernel parameter value (γ = 19) and the KELM model at the highest (γ = 1,600). Higher values of the RBF kernel parameter make the fitted function more intricate and can increase the complexity of the KELM model. This enhances the KELM model's sensitivity to local variations within the data, allowing the capture of fine-grained patterns. Extremely high values of the RBF kernel parameter (greater than 1,600) lead to overfitting of the KELM model, where the model fits the training data too closely, capturing noise and potentially failing to generalize well to new data. The GPR model likewise presents its optimal performance at the lowest value of the RBF kernel parameter. Moreover, Figure 11 distinctly illustrates the severe fluctuations in the performance of the SVM model across various values of the RBF kernel parameter. These fluctuations are similarly evident in the performance of the GPR model; however, the GPR model stabilizes as the value of the RBF kernel parameter is increased. Both the KELM and KRR models exhibit a consistent pattern in the fluctuation of the RMSE index relative to variations in the RBF kernel parameter, and they are notably more stable than the SVM and GPR models.
Beyond experiencing notable fluctuations, the SVM model also displays an increased susceptibility to overtraining errors. The optimal value of the RBF kernel function depends on the specific characteristics of the data. Various kernel-based models show distinct responses to changes in the RBF kernel parameter, reflecting the influence of the underlying dataset.
Figure 11

The fluctuation of RMSE statistical index across various RBF kernel parameter values (γ) within employed kernel-based models.


Sensitivity analysis

To assess the influence of individual input parameters on the accuracy of the discharge coefficient modeling of cylindrical weirs, a simple form of sensitivity analysis was carried out. For this purpose, each input parameter was excluded in turn, and the modeling process was repeated. The impact of the omitted parameter on the modeling was then determined from the resulting changes in the statistical indexes. The results of the sensitivity analysis, expressed as variations in the RMSE statistical index (ΔRMSE), are illustrated in Figure 12. The results show that eliminating the Frup parameter leads to a significant decline in the accuracy of GPR modeling, with a 225% increase in the RMSE statistical index. The corresponding RMSE increases for the KELM, KRR, and SVM models are 65, 62, and 91%, respectively. Consequently, it can be concluded that the upstream Froude number plays a pivotal role in predicting the discharge coefficient of a cylindrical weir. Furthermore, the elimination of the d1/R parameter results in a notable 107% decline in the modeling accuracy of the GPR model. In contrast, the removal of this parameter yields a slight improvement in the modeling accuracy of both the KELM and KRR models. A noteworthy point is the significant effect of removing d1/R on the prediction capability of the SVM model, which reduces the modeling accuracy by 321%. This significant drop in accuracy can be attributed to the SVM model's inherent vulnerability to overtraining. To address the overtraining error, the SVM model employs reduced values for the RBF kernel parameter (γ). Nonetheless, this modification noticeably reduces the modeling accuracy. This suggests that the parameter d1/R is critically important in maintaining the robustness and generalization capacity of the SVM model; without it, the SVM model begins to fit noise in the data.
Finally, the L/R parameter exhibits the least influence on the modeling of the discharge coefficient of a cylindrical weir. Its elimination decreases the modeling accuracy of the GPR, KELM, KRR, and SVM models by 24, 8, 6, and 4%, respectively, as measured by the RMSE statistical index. Generally, the varied responses of the different kernel-based models to the exclusion of a specific input parameter stem from several factors. The impact of a parameter on model accuracy is influenced by its interactions with other parameters and features within the dataset. The removal of a parameter may reduce the accuracy of one method while improving that of another, which reflects the different ways in which the methods handle feature interactions. The effect of a specific parameter also depends on the characteristics of the dataset, including variations in data distribution, noise, and the significance of individual features. Moreover, the varying degrees of complexity inherent in each kernel-based model contribute to distinct responses when a parameter is removed, since the removal may either simplify or complicate the model.
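The leave-one-input-out procedure described above can be sketched as follows. To keep the example short and self-contained, a simple RBF-weighted kernel smoother (Nadaraya-Watson regression) is used as a stand-in model; it is not one of the four models used in this study. The synthetic data, in which the target depends only on the first input, and all function names are likewise illustrative assumptions.

```python
import math


def kernel_smoother(X_train, y_train, X_test, gamma=50.0):
    """Stand-in model: RBF-weighted average of the training targets."""
    preds = []
    for x in X_test:
        w = [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xt)))
             for xt in X_train]
        preds.append(sum(wi * yi for wi, yi in zip(w, y_train)) / sum(w))
    return preds


def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def drop_column(X, k):
    """Return a copy of X without input column k."""
    return [[v for j, v in enumerate(row) if j != k] for row in X]


def sensitivity(X_train, y_train, X_test, y_test, model=kernel_smoother):
    """Baseline RMSE and percentage RMSE change after dropping each input."""
    base = rmse(y_test, model(X_train, y_train, X_test))
    changes = []
    for k in range(len(X_train[0])):
        err = rmse(y_test, model(drop_column(X_train, k), y_train,
                                 drop_column(X_test, k)))
        changes.append(100.0 * (err - base) / base)
    return base, changes


# Synthetic grid: the target depends on the first input only.
grid = [(i, j) for i in range(10) for j in range(5)]
X_tr = [[i / 10.0, j / 5.0] for i, j in grid if (i + j) % 2 == 0]
X_te = [[i / 10.0, j / 5.0] for i, j in grid if (i + j) % 2 == 1]
y_tr = [row[0] for row in X_tr]
y_te = [row[0] for row in X_te]
base, changes = sensitivity(X_tr, y_tr, X_te, y_te)
print(base, changes)
```

In this synthetic case, dropping the informative first input raises the RMSE sharply, whereas dropping the uninformative second input does not. This mirrors the observation that removing a parameter can degrade one fit while leaving another unchanged or slightly improved.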
Figure 12

Results of sensitivity analysis in terms of RMSE statistical index.

To investigate the predictive capability of kernel-based models more comprehensively, a separate analysis was conducted using experimental data obtained from two distinct configurations of cylindrical weirs: a cylindrical weir with vertical support and a cylindrical weir featuring a 30-degree upstream ramp. Cylindrical weirs with a 30-degree upstream ramp provide several advantages over traditional rectangular weirs and other hydraulic structures, and accurate prediction of their discharge coefficient is crucial for their effective implementation. The ramp facilitates smooth flow guidance toward the weir crest, enhancing flow measurement and control efficiency by minimizing turbulence and energy losses. This design also reduces the risk of cavitation, common in sharp-edged weirs, due to its gradual flow transition. These weirs are versatile, suitable for various flow conditions and applications in both small and large hydraulic systems. They offer flexibility in design and are cost-effective to construct and maintain, promising long-term reliability and reduced maintenance needs compared to other flow control structures. The findings are presented in Tables 5 and 6. Based on the outcomes, the discharge coefficient of the cylindrical weir with vertical support can be predicted with notably high accuracy by all of the employed kernel-based models. Among them, GPR stands out, demonstrating exceptional accuracy with a noise value of 0.001 and a kernel parameter of 1. In contrast, the 30-degree upstream ramp introduces intricate hydraulic conditions that pose challenges in accurately modeling the discharge coefficient, resulting in a substantial decrease in modeling accuracy.
Factors such as alterations in flow patterns, increased turbulence, and changes in the hydraulic behavior due to the ramp contribute significantly to the diminished accuracy observed in modeling the discharge coefficient of cylindrical weirs with 30-degree upstream ramps. Table 6 illustrates that the KELM model, with a regularization parameter of 3 and an RBF kernel parameter of 9, outperforms the other employed kernel-based methods in modeling the discharge coefficient of cylindrical weirs with a 30-degree ramp. This superior performance is evidenced by key metrics, including an R value of 0.760, NSE of 0.529, RMSE of 0.060, and DR of 1.017. It is noteworthy that the SVM model exhibits notably poor performance (R = 0.510, NSE = 0.061, RMSE = 0.085, and DR = 1.027). This suboptimal performance confirms the inherent limitation of the SVM model, indicating a lack of reliability in precisely modeling the discharge coefficient under complex hydraulic conditions. This finding adds to the limitations of the SVM model observed in the current study, including its susceptibility to overtraining and its sensitivity to changes in the RBF kernel parameter. Figure 13 illustrates the scatter plot between observed and predicted discharge coefficients for the best models corresponding to each type of cylindrical weir.
Table 5

Statistical performance of employed kernel-based models in predicting the discharge coefficient of a cylindrical weir with vertical support

| Model | Training R | Training NSE | Training RMSE | Training DR | Testing R | Testing NSE | Testing RMSE | Testing DR |
|---|---|---|---|---|---|---|---|---|
| SVM | 0.976 | 0.946 | 0.020 | 0.995 | 0.969 | 0.929 | 0.024 | 0.997 |
| GPR | 0.998 | 0.997 | 0.004 | 1 | 0.997 | 0.994 | 0.007 | 1 |
| KELM | 0.989 | 0.979 | 0.012 | 0.999 | 0.986 | 0.970 | 0.016 | |
| KRR | 0.996 | 0.992 | 0.007 | | 0.994 | 0.988 | 0.009 | 0.999 |

Note: Bold values indicate the best results.

Table 6

Statistical performance of employed kernel-based models in predicting the discharge coefficient of a cylindrical weir with upstream 30°-u/s ramp

| Model | Training R | Training NSE | Training RMSE | Training DR | Testing R | Testing NSE | Testing RMSE | Testing DR |
|---|---|---|---|---|---|---|---|---|
| SVM | 0.951 | 0.859 | 0.039 | 0.999 | 0.510 | 0.061 | 0.085 | 1.027 |
| GPR | 0.857 | 0.733 | 0.054 | 1.003 | 0.747 | 0.492 | 0.063 | 1.010 |
| KELM | 0.771 | 0.590 | 0.067 | 1.003 | 0.760 | 0.529 | 0.060 | 1.017 |
| KRR | 0.697 | 0.484 | 0.076 | 1.005 | 0.749 | 0.472 | 0.064 | 1.022 |

Note: Bold values indicate the best results.

Figure 13

The scatterplots between observed and predicted Cd based on the most accurate models: (a) GPR model for cylindrical weir with vertical support and (b) KELM model for cylindrical weir with upstream 30°-u/s ramp.


This section presents a comparative analysis between the findings of the present study and recent research employing various machine learning techniques to predict the discharge coefficient of cylindrical hydraulic structures. Table 7 summarizes relevant works and evaluates the performance of the employed machine learning techniques using the R² and RMSE statistical indexes. Studies on different types of cylindrical hydraulic structures have demonstrated that kernel-based SVM models exhibit superior accuracy in modeling the discharge coefficient compared to neural network-based methods. The demonstrated high efficacy of kernel-based methods in modeling various types of cylindrical weirs led to their selection in this study for predicting the discharge coefficient of a novel cylindrical weir design. The results obtained demonstrate the high accuracy of GPR in predicting the discharge coefficient of cylindrical weirs with vertical support. A detailed evaluation of each kernel-based model employed revealed that GPR shows optimal performance in predicting the discharge coefficient with less complexity. In addition, for cylindrical weirs with an upstream ramp, the KELM model achieved acceptable accuracy with R = 0.760 and RMSE = 0.060. However, a notable limitation in studies concerning discharge coefficient prediction in cylindrical weirs is the scarcity of relevant experimental data for modeling purposes. In this study, efforts were made to enhance reliability by utilizing a substantial dataset of experimental data (576 samples) for modeling the discharge coefficient. The significance of the upstream Froude number in predicting the discharge coefficient of cylindrical weirs was noted. These findings align with the research conducted by Parsaie et al. (2017), indicating that for cylindrical weir-gates, the upstream Froude number, along with the ratio of gate opening height to cylinder diameter, is a crucial factor in discharge coefficient prediction.

Table 7

Comparison of discharge coefficient modeling results for cylindrical-shaped hydraulic structures based on literature

| Author(s) | Number of data | Type of weir | Covered model type | R² | RMSE |
|---|---|---|---|---|---|
| Parsaie et al. (2017) | 89 | Cylindrical weir-gate | Multilayer Perceptron Neural Network (MLP) | 0.993 | 0.017 |
| | | | ANFIS | 0.99 | 0.045 |
| Parsaie et al. (2018) | 89 | Cylindrical weir-gate | PSO-GMDH | 0.988 | 0.025 |
| | | | MLPNN | 0.997 | 0.013 |
| | | | SVM | 0.999 | 0.013 |
| Ismael et al. (2021) | 45 | Oblique cylindrical weir | RBFN | 0.999 | 0.008 |
| | | | Back-Propagation Neural Network (BRNN) | 0.997 | 0.014 |
| | | | CFNN | 0.997 | 0.009 |
| Nourani et al. (2023) | 234 | Circular-crested oblique weirs | SVM | 0.961 | 0.022 |
| | | | PSO-SVM | 0.959 | 0.024 |
| | | | GA-SVM | 0.992 | 0.009 |
| | | | MLR | 0.908 | 0.113 |
| | | | MNLR | 0.844 | 0.105 |
| Present study | 392 | Vertically supported cylindrical weirs | SVM | 0.969 | 0.024 |
| | | | GPR | 0.997 | 0.007 |
| | | | KELM | 0.986 | 0.016 |
| | | | KRR | 0.994 | 0.009 |
| Present study | 180 | Cylindrical weir with upstream 30°-u/s ramp | SVM | 0.510 | 0.085 |
| | | | GPR | 0.747 | 0.063 |
| | | | KELM | 0.760 | 0.060 |
| | | | KRR | 0.749 | 0.064 |

This study aimed to evaluate the effectiveness of various kernel-based methodologies in predicting the discharge coefficient of cylindrical weirs, with a specific focus on two distinct weir configurations: one with vertical support and another featuring a 30-degree upstream ramp. The results of the investigation indicate that the GPR model outperforms other kernel-based models, demonstrating superior accuracy (R = 0.967, NSE = 0.935, and RMSE = 0.027) in modeling the overall dataset collected from both types of weirs. The GPR model demonstrated a higher degree of simplicity in modeling the targeted phenomenon, employing smaller values for the kernel parameter. Conversely, the SVM model exhibited increased susceptibility to overtraining phenomena and demonstrated heightened sensitivity to changes in the kernel parameter. Other findings arising from the present study are as follows:

  • Sensitivity analysis revealed that the upstream Froude number (Frup) plays a pivotal role in accurately predicting the discharge coefficient of cylindrical weirs. The exclusion of this parameter led to a significant decline in modeling accuracy for all models, emphasizing its critical influence.

  • The discharge coefficient of cylindrical weirs with vertical support was predicted with higher accuracy. In this context, the GPR model, exhibiting statistical indices of R = 0.997, NSE = 0.994, RMSE = 0.007, and DR = 1, achieves a high level of accuracy.

  • The investigation extended its analysis to the prediction of discharge coefficients for cylindrical weirs with a 30-degree upstream ramp. In this case, the modeling accuracy decreased, reflecting the challenges posed by the altered hydraulic conditions introduced by the ramp. The KELM model demonstrated superior performance in this scenario (R = 0.760, NSE = 0.529, RMSE = 0.060, and DR = 1.017), outperforming other kernel-based methods.

While the current study focused on evaluating the predictive capabilities of standalone kernel-based models for the discharge coefficient of cylindrical weirs, future research could incorporate various machine learning techniques through hybridized approaches to optimize the relevant hyperparameters and obtain the best performance. Furthermore, a significant constraint in modeling the discharge coefficient for various types of weirs is the limited availability of experimental data. Therefore, to validate the outcomes of the present study, future research should incorporate an extensive range of experimental and field data encompassing various hydraulic conditions.

All relevant data are available from an online repository at https://doi.org/10.1061/(ASCE)0733-9437(1998)124:3(152).

The authors declare there is no conflict.

Abed
M.
,
Imteaz
M. A.
&
Ahmed
A. N.
2023
A comprehensive review of artificial intelligence-based methods for predicting pan evaporation rate
.
Artificial Intelligence Review
56
(
Suppl 2
),
2861
2892
.
Adnan
R. M.
,
Mostafa
R. R.
,
Islam
A. R. M. T.
,
Kisi
O.
,
Kuriqi
A.
&
Heddam
S.
2021
Estimating reference evapotranspiration using hybrid adaptive fuzzy inferencing coupled with heuristic algorithms
.
Computers and Electronics in Agriculture
191
,
106541
.
Adnan
R. M.
,
Dai
H. L.
,
Mostafa
R. R.
,
Parmar
K. S.
,
Heddam
S.
&
Kisi
O.
2022
Modeling multistep ahead dissolved oxygen concentration using improved support vector machines by a hybrid metaheuristic algorithm
.
Sustainability
14
(
6
),
3470
.
Adnan
R. M.
,
Dai
H. L.
,
Mostafa
R. R.
,
Islam
A. R. M. T.
,
Kisi
O.
,
Heddam
S.
&
Zounemat-Kermani
M.
2023a
Modelling groundwater level fluctuations by ELM merged advanced metaheuristic algorithms using hydroclimatic data
.
Geocarto International
38
(
1
),
2158951
.
Adnan
R. M.
,
Mostafa
R. R.
,
Dai
H. L.
,
Heddam
S.
,
Kuriqi
A.
&
Kisi
O.
2023b
Pan evaporation estimation by relevance vector machine tuned with new metaheuristic algorithms using limited climatic data
.
Engineering Applications of Computational Fluid Mechanics
17
(
1
),
2192258
.
Afaridegan
E.
,
Amanian
N.
,
Parsaie
A.
&
Gharehbaghi
A.
2023
Hydraulic investigation of modified semi-cylindrical weirs
.
Flow Measurement and Instrumentation
93,
102405
.
AL-Dabbagh
M. A. S.
,
Shekho
A. A.
,
Yuce
M. I.
&
Abdulrazzaq
D. G.
2023
Effects of upstream and downstream ramp on flow characteristics over a cylindrical weir
.
International Journal of Applied Science and Engineering
20
(
2
),
1
10
.
Chanson
H.
&
Montes
J. S.
1997
Overflow Characteristics of Cylindrical Weirs
.
Research Report CE154. Dept. of Civil Engineering, University of Queensland, Brisbane, Australia
.
Chanson
H.
&
Montes
J. S.
1998
Overflow characteristics of circular weirs: Effects of inflow conditions
.
Journal of Irrigation and Drainage Engineering
124
(
3
),
152
162
.
Corzo Perez
G. A.
&
Solomatine
D. P.
2024
Hydroinformatics and applications of artificial intelligence and machine learning in water related problems. In: Advanced Hydroinformatics: Machine Learning and Optimization for Water Resources (G. A. Corzo Perez & D. P. Solomatine, eds.), John Wiley, Hoboken, NJ, USA, pp. 1–38
.
Deng
Y.
,
Zhang
D.
,
Zhang
D.
,
Wu
J.
&
Liu
Y.
2023
A hybrid ensemble machine learning model for discharge coefficient prediction of side orifices with different shapes
.
Flow Measurement and Instrumentation
91
,
102372
.
Escande
L.
&
Sananes
F.
1959
Etude des seuils déversants à fente aspiratrice
.
La Houille Blanche
14 (B),
892
902
.
(In French)
.
Gholami
A.
,
Akbar Akhtari
A.
,
Minatour
Y.
,
Bonakdari
H.
&
Javadi
A. A.
2014
Experimental and numerical study on velocity fields and water surface profile in a strongly-curved 90 open channel bend
.
Engineering Applications of Computational Fluid Mechanics
8
(
3
),
447
461
.
Haghiabi
A. H.
,
Mohammadzadeh-Habili
J.
&
Parsaie
A.
2018
Development of an evaluation method for velocity distribution over cylindrical weirs using doublet concept
.
Flow Measurement and Instrumentation
61
,
79
83
.
Heidarpour
M.
&
Chamani
M. R.
2006
Velocity distribution over cylindrical weirs
.
Journal of Hydraulic Research
44
(
5
),
708
711
.
Huang
G. B.
,
Zhu
Q. Y.
&
Siew
C. K.
2006
Extreme learning machine: Theory and applications
.
Neurocomputing
70
(
1–3
),
489
501
.
Huang
G. B.
,
Zhou
H.
,
Ding
X.
&
Zhang
R.
2011
Extreme learning machine for regression and multiclass classification
.
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
42
(
2
),
513
529
.
Ismael A. A., Suleiman S. J., Al-Nima R. R. O. & Al-Ansari N. 2021 Predicting the discharge coefficient of oblique cylindrical weir using neural network techniques. Arabian Journal of Geosciences 14, 1–8.
Jaeger C. 1956 Engineering Fluid Mechanics (No. 627 J34). Blackie and Son, Glasgow, UK.
Kabiri-Samani A. & Bagheri S. 2014 Discharge coefficient of circular-crested weirs based on a combination of flow around a cylinder and circulation. Journal of Irrigation and Drainage Engineering 140 (5), 04014010.
Kharb S. S., Antil P., Singh S., Antil S. K., Sihag P. & Kumar A. 2021 Machine learning-based erosion behavior of silicon carbide reinforced polymer composites. Silicon 13, 1113–1119.
Koch A., Carstanjen M. & Hainz L. 1926 Von der Bewegung des Wassers und den Dabei Auftretenden Kräften: Grundlagen zu Einer Praktischen Hydrodynamik für Bauingenieure. Springer, Berlin.
Lee M. H. & Liu Y. 2013 Kernel continuum regression. Computational Statistics & Data Analysis 68, 190–201.
Li S., Shen G., Parsaie A., Li G. & Cao D. 2024 Discharge modeling and characteristic analysis of semi-circular side weir based on the soft computing method. Journal of Hydroinformatics 26 (1), 175–188.
Majedi-Asl M., Ghaderi A., Kouhdaragh M. & Alavian T. O. 2024 A performance comparison of the meta model methods for discharge coefficient prediction of labyrinth weirs. Flow Measurement and Instrumentation 96, 102563.
Matthew G. D. 1962 The Influence of Curvature, Surface Tension and Viscosity on Flow Over Round-Crested Weirs. Doctoral dissertation, University of Aberdeen, Aberdeen, UK.
Mostafa R. R., Kisi O., Adnan R. M., Sadeghifar T. & Kuriqi A. 2023 Modeling potential evapotranspiration by improved machine learning methods using limited climatic data. Water 15 (3), 486.
Najafzadeh M. & Oliveto G. 2020 Riprap incipient motion for overtopping flows with machine learning models. Journal of Hydroinformatics 22 (4), 749–767.
Nourani B., Arvanaghi H., Pourhosseini F. A., Javidnia M. & Abraham J. 2023 Enhanced support vector machine with particle swarm optimization and genetic algorithm for estimating discharge coefficients of circular-crested oblique weirs. Iranian Journal of Science and Technology, Transactions of Civil Engineering 47, 3185–3198.
Parsaie A., Haghiabi A. H., Saneie M. & Torabi H. 2017 Predication of discharge coefficient of cylindrical weir-gate using adaptive neuro fuzzy inference systems (ANFIS). Frontiers of Structural and Civil Engineering 11, 111–122.
Parsaie A., Azamathulla H. M. & Haghiabi A. H. 2018 Prediction of discharge coefficient of cylindrical weir–gate using GMDH-PSO. ISH Journal of Hydraulic Engineering 24 (2), 116–123.
Ramamurthy A. S., Vo N. D. & Balachandar R. 1994 A note on irrotational curvilinear flow past a weir. Journal of Fluids Engineering 116, 378–381.
Rehbock T. 1929 The River-Hydraulic Laboratory of the Technical University at Karlsruhe. American Society of Mechanical Engineers, New York.
Roushangar K. & Shahnazi S. 2019 Bed load prediction in gravel-bed rivers using wavelet kernel extreme learning machine and meta-heuristic methods. International Journal of Environmental Science and Technology 16, 8197–8208.
Roushangar K. & Shahnazi S. 2020 Prediction of sediment transport rates in gravel-bed rivers using Gaussian process regression. Journal of Hydroinformatics 22 (2), 249–262.
Roushangar K., Majedi Asl M. & Shahnazi S. 2021 Hydraulic performance of PK weirs based on experimental study and kernel-based modeling. Water Resources Management 35, 3571–3592.
Roushangar K., Alirezazadeh Sadaghiani A. & Shahnazi S. 2023a Novel application of robust GWO-KELM model in predicting discharge coefficient of radial gates: A field data-based analysis. Journal of Hydroinformatics 25 (2), 275–299.
Roushangar K., Ghasempour R. & Shahnazi S. 2023b Kernel-based modeling. In: Handbook of Hydroinformatics. Elsevier, Amsterdam, The Netherlands, pp. 267–281.
Sahraei S., Alizadeh M. R., Talebbeydokhti N. & Dehghani M. 2018 Bed material load estimation in channels using machine learning and meta-heuristic methods. Journal of Hydroinformatics 20 (1), 100–116.
Sarginson E. J. 1972 The influence of surface tension on weir flow. Journal of Hydraulic Research 10 (4), 431–446.
Seyedian S. M., Haghiabi A. & Parsaie A. 2023 Reliable prediction of the discharge coefficient of triangular labyrinth weir based on soft computing techniques. Flow Measurement and Instrumentation 92, 102403.
Seyedzadeh A., Maroufpoor S., Maroufpoor E., Shiri J., Bozorg-Haddad O. & Gavazi F. 2020 Artificial intelligence approach to estimate discharge of drip tape irrigation based on temperature and pressure. Agricultural Water Management 228, 105905.
Shamsi Z., Parsaie A. & Haghiabi A. H. 2022 Optimum hydraulic design of cylindrical weirs. ISH Journal of Hydraulic Engineering 28 (sup1), 86–90.
Smola A. J. 1996 Regression Estimation with Support Vector Learning Machines. Master's thesis, Technische Universität München, Munich, Germany.
Vapnik V. 1995 The Nature of Statistical Learning Theory. Springer Science & Business Media, New York, USA.
Wan W., Shen G., Li S., Parsaie A., Wang Y. & Zhou Y. 2024 Analysis of discharge characteristics of a symmetrical stepped labyrinth side weir based on global sensitivity. Journal of Hydroinformatics 26 (1), 337–349.
Williams C. & Rasmussen C. 1995 Gaussian processes for regression. Advances in Neural Information Processing Systems 8, 514–520.
Yuce M. I., Al-Babely A. A. & Al-Dabbagh M. A. 2015 Flow simulation over oblique cylindrical weirs. Canadian Journal of Civil Engineering 42 (6), 389–407.
Zhang Y., Duchi J. & Wainwright M. 2013 Divide and conquer kernel ridge regression. In: Conference on Learning Theory. PMLR, pp. 592–617.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).