This study investigates the discharge coefficient (Cd) of labyrinth sluice gates, a modern gate design with complex flow characteristics. To accurately estimate Cd, regression techniques (linear regression and stepwise polynomial regression) and machine learning methods (gene expression programming (GEP), decision table, KStar, and M5Prime) were employed. A dataset of 187 experimental results, incorporating dimensionless variables of internal angle (θ), cycle number (N), and water depth contraction ratio (H/G), was used to train and evaluate the models. The results demonstrate the superiority of GEP in predicting Cd, achieving a coefficient of determination (R2) of 97.07% and a mean absolute percentage error of 2.87%. To assess the relative importance of each variable, a sensitivity analysis was conducted. The results revealed that the H/G has the most significant impact on Cd, followed by the internal head angle (θ). The cycle number (N) was found to have a relatively insignificant effect. These findings offer valuable insights into the design and operation of labyrinth sluice gates, contributing to improved water resource management and flood control.

  • The discharge coefficient of labyrinth sluice gates was estimated.

  • Regression techniques (linear regression and stepwise polynomial regression) and machine learning methods (gene expression programming [GEP], decision table, KStar, and M5Prime) were employed.

  • The results demonstrate the superiority of GEP in predicting Cd, achieving a coefficient of determination (R2) of 97.07% and a mean absolute percentage error of 2.87%.

A sluice gate is an essential hydraulic structure that serves multiple functions, such as flow measurement, flood control, irrigation and industrial distribution, and navigation management. It is vital in various facilities, including canals, rivers, dams, and wastewater treatment plants, because it can accurately control water flow under different conditions. A planar wall placed against the flow direction is the standard configuration of traditional sluice gates. However, various attempts have been made to optimize their design, based on investigations of the hydraulic characteristics of sluice gates with diverse configurations under different flow conditions (Shivapur et al. 2005; Mohammed & Khaleel 2013; Mansoor 2014; Salmasi & Abraham 2020a; Wang & Diao 2021; Daneshfaraz et al. 2022; Abbaszadeh et al. 2023). The discharge coefficient must be determined to assess the hydraulic performance of such structures. It is a crucial indicator and serves as the criterion on which optimization is based.

In recent years, a novel design known as the labyrinth shape was adopted for various hydraulic structures such as weirs, spillways, and sluice gates by suggesting a longer flow path than the traditional form; thereby, the discharge coefficient, discharge capacity, and energy dissipation will increase. The labyrinth design provides valuable benefits (Sadeq Maatooq & Yaseen Ojaimi 2014; Alfatlawi et al. 2023; Daneshfaraz et al. 2023; Hashem et al. 2024). As mentioned earlier, the discharge coefficient is crucial in optimizing the hydraulic design of hydraulic structures and must be estimated precisely, particularly for complex configurations and labyrinth forms. Conventional methods, such as theoretical and experimental techniques, have limited applicability due to the efforts involved, are time-consuming, and are costly (Silva & Rijo 2017; Lauria et al. 2020; Salmasi & Abraham 2020a; Steppert et al. 2021; Wang & Diao 2021; Salmasi et al. 2022; YoosefDoost & Lubitz 2022; Abbaszadeh et al. 2023; Hashem et al. 2024). Moreover, these techniques may make it difficult to capture details of intricate flow patterns and the distinct hydraulic behaviors displayed by structures, especially those with complex configurations.

Recently, machine learning (ML) approaches have proliferated in various fields, including hydraulic engineering, offering promising opportunities to solve complex problems owing to their innate capacity to learn from data and capture interaction patterns. Consequently, complex and nonlinear systems can now be simulated efficiently with ML tools. ML offers a promising way to build precise and flexible models that can estimate discharge coefficients and other main hydraulic characteristics (Silva & Rijo 2017; Lauria et al. 2020; Salmasi & Abraham 2020a; Steppert et al. 2021; Wang & Diao 2021; Salmasi & Abraham 2022; YoosefDoost & Lubitz 2022; Abbaszadeh et al. 2023; Fatehi-Nobarian et al. 2023; Mohammed & Sharifi 2023; Fatehi-Nobarian & Moradinia 2024; Hashem et al. 2024; Mohammed & Sihag 2024).

Various ML techniques have been actively investigated in recent years, building on previous studies that employed genetic programming in expert systems to predict the coefficient of discharge and other hydraulic properties of sluice gates (Salmasi & Abraham 2020b). Many ML algorithms, including gradient boosting machine, random forest, generalized linear models, generalized regression neural network (GRNN), Gaussian process regression, and random tree, have been studied by researchers (Ghorbani et al. 2020; Salmasi et al. 2020). For instance, Naisheng et al. (2021) analyzed ice entrainment through a slide gate using a stacking ensemble model, which combined a support vector machine for classification and the principal component analysis method for dimensionality reduction. Using a long short-term memory (LSTM) neural network, Ho et al. (2022) indicated that their method outperformed more basic classifiers by carefully choosing pertinent input features and discarding unnecessary data. To maximize the operational efficiency of sluice gates, they also used LSTM models to forecast short-term water levels over several time steps and obtained accurate outcomes. Yan et al. (2023) reported that convolutional neural networks (CNNs) outperform other conventional techniques, such as genetic programming and neuro-fuzzy inference systems, achieving a coefficient of determination (R2) of up to 90% with minimized computational expense. Further studies have focused on ML model optimization. Zheng et al. (2023) adopted a backpropagation (BP) neural network to create a flow prediction model for a measurement and control gate. By incorporating optimization algorithms such as particle swarm optimization and the genetic algorithm, they improved prediction accuracy and error distribution compared with conventional techniques and unoptimized BP models.

More recently, several combinations of ML techniques have been developed. For instance, to estimate the coefficient of discharge of labyrinth weirs, Emami et al. (2023) employed a hybrid model that combines eXtreme Gradient Boosting with a novel optimization algorithm, Linear Population Size Reduction Success-History-Based Adaptive Differential Evolution (LSHADE). They demonstrated that the hybrid model outperforms other techniques such as artificial neural networks, gene expression programming (GEP), adaptive neuro-fuzzy inference system with firefly algorithm, and self-adaptive evolutionary extreme learning machine, which points to the possibility of integrating various ML techniques to improve accuracy in hydraulic engineering applications.

This study is concerned with obtaining the critical indicator known as the coefficient of discharge of a labyrinth sluice gate based on various ML techniques and regression models. The main objective was to introduce a reliable model that could accurately capture the discharge coefficient for this nonstandard structure. To this aim, various statistical metrics were used to thoroughly assess each ML model's performance.

Experimental model

In this context, laboratory data from Hashem et al. (2024) were used for further investigations. A laboratory flume has a 10 m length and rectangular cross-section (0.3 m in width and 0.5 m in height) (Table 1). The flume bed material is metal, whereas the side walls are glass. The labyrinth sluice gate model was positioned 5.5 m downstream from the flume inlet (Figure 1). A suite of instruments was employed to measure the flow characteristics precisely. The incoming flow rate was quantified using an ultrasonic flow meter previously calibrated against a standard weir located downstream of the main channel. Additionally, a point gauge with a precision of ±0.1 mm was used to measure flow depth at specific locations. An ultrasonic level meter was strategically positioned for upstream flow depth measurements to minimize the influence of backwater effects. A total of 187 experiments were conducted under free-flow conditions. These experiments systematically varied the number of labyrinth cycles, internal angles of the labyrinth passages, opening height of the sluice gate, and the range of the upstream water head. Figure 1 illustrates the data characterization for the subcritical flow regime in which the experiment was conducted.
Table 1

The physical characteristics of the channel and models

| Parameter | Value |
|---|---|
| Channel length (m) | 10 |
| Channel width (B) (m) | 0.3 |
| Channel height (m) | 0.5 |
| Internal angle (θ) (degree) | 45°–90° |
| US water head H (cm) | 7.2–45 |
| Gate opening height G (cm) | 2–5 |
| Sluice gate discharge coefficient Cd | 0.547–1.343 |
| Number of labyrinth cycles N | 1–2 |
| Projection length of the labyrinth section l | 0.3–0.7839 |
| Discharge rate Q (L3/T) | 5.7–26.1 |
| Total folded length L (m) | 0.3 |
Figure 1

Labyrinth sluice gate. (a) Layout plan, (b) side view, (c) experimental flow regime diagram (Hashem et al. 2024).


Dimensionless parameters

Various factors interact to determine the flow characteristics beneath a sluice gate, measured by the discharge rate Q (L3/T) and flow velocity V (L/T). The key characteristics of the gate's geometry include the total folded length L (L), the dimensionless number of labyrinth cycles N, the projection length of the labyrinth section l (L), and the opening height G (L). Further factors are the fluid properties, including the water density ρ (M/L3), and the upstream water conditions, represented by the upstream water head H (L). The combined effect of these parameters must be taken into account to accurately predict and control the flow beneath sluice gates.

The conventional weir equation was employed for discharge calculation underneath the labyrinth sluice gate as follows:
(1)

Here, Cd is the coefficient of discharge of the labyrinth sluice gate, g is the gravitational acceleration (L/T2), and the remaining symbols are as defined above.

Dimensional analysis based on the Buckingham pi-theorem indicates that:
$$C_d = f\left(\frac{H}{G}, \frac{L}{l}, N, Re, We\right) \tag{2}$$
Equation (2) presents the dimensionless numbers derived from the dimensional analysis. The first parameter, H/G, represents the water depth contraction ratio. The second parameter, L/l, describes the effect of the internal angle (θ) of the labyrinth configuration as the ratio of the total length of the labyrinth wall path to the projection length for a particular angle. N is the cycle number, as previously indicated. The Reynolds number (Re) and the Weber number (We) are both discarded because their effect is negligible within the tested range (Re > 2,000 and We > 50). Consequently, Equation (2) can be reduced to the following form:
$$C_d = f\left(\frac{H}{G}, \frac{L}{l}, N\right) \tag{3}$$

Data collection

Data from 187 runs were collected from the experimental work performed by Hashem et al. (2024). The database was assembled and categorized to create a robust model. It comprises geometric and flow properties, including the internal angle (θ), gate opening (G), cycle number (N), and upstream water depth (H). A statistical analysis was carried out to characterize the data by identifying the maximum, minimum, average, and standard deviation of each variable. The results are shown in Table 2. The distribution of the output and input parameters is also illustrated graphically in Figure 2.
Table 2

Fundamental statistical indicators of the collected database

| No. | Indicator | Angle (θ) | N | H/G | Cd |
|---|---|---|---|---|---|
| 1 | Maximum value | 1.57 | | 22.5 | 1.343764 |
| 2 | Minimum value | 0.785 | | 2.075 | 0.547681 |
| 3 | Average | 1.081649 | 1.481283 | 7.060918 | 0.803087 |
| 4 | Standard deviation | 0.31623 | 0.500991 | 4.61159 | 0.176994 |
Figure 2

Histogram graph presenting the distribution of the dataset.


Review of regression methods and ML techniques

This study investigated several regression and ML techniques to find the best model for predicting Cd. The techniques used were linear regression (LR), stepwise polynomial regression (SPR), GEP, KStar (K*), M5Prime (M5P), and decision table (DT), as indicated in Table 3. Three input features (θ, N, and H/G) were used for the analysis, as described earlier.

Table 3

Regression and ML techniques

| Category | Method | Abbreviation |
|---|---|---|
| Regression | Linear regression | LR |
| Regression | Stepwise polynomial regression | SPR |
| Gene algorithms | Gene expression programming | GEP |
| Rules algorithms | Decision table | DT |
| Lazy algorithms | KStar | K* |
| Trees algorithms | M5Prime | M5P |

LR model

In the context of this study, the LR approach involves creating a mathematical equation to model Cd. This equation captures the correlation between several inputs (independent variables) and the desired outcome (dependent variable) using a linear form (Harith et al. 2024a). The independent variables considered in this research are θ, N, and H/G, while the dependent variable is Cd. A typical LR equation for a single independent variable can be expressed as:
$$y = b_0 + b_1 x \tag{4}$$
where y is the forecasted value of the dependent variable, x is the independent variable, b0 denotes the constant term, and b1 stands for the coefficient of the independent variable. Since multiple independent variables were used in the present study, Equation (4) becomes:
$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 \tag{5}$$
where b0 is the y-intercept, and b1, b2, and b3 are the coefficients of the independent variables x1, x2, and x3. LR aims to find the coefficients (b0, b1, b2, and b3) that minimize the difference between predicted Cd values and actual Cd measurements.
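The least-squares fit behind the multiple linear form can be sketched as follows. This is a minimal illustration, assuming NumPy is available. The function name `fit_linear`, the synthetic inputs, and the coefficients in `true_b` are hypothetical and are not the experimental data or the fitted values of this study.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares: returns [b0, b1, ..., bk]."""
    # Prepend a column of ones so that b0 acts as the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

# Synthetic, noise-free inputs (NOT the experimental dataset).
# Columns play the roles of theta (rad), N, and H/G.
rng = np.random.default_rng(0)
X = rng.uniform([0.785, 1.0, 2.0], [1.57, 2.0, 22.5], size=(50, 3))
true_b = np.array([1.0, -0.3, -0.05, 0.03])  # hypothetical b0..b3
y = true_b[0] + X @ true_b[1:]

b = fit_linear(X, y)  # recovers true_b because the data are noise-free
```

With measured data the recovered coefficients would only approximate the underlying relationship, and the residuals would feed the error metrics discussed later.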

Stepwise polynomial regression model

This method, called stepwise polynomial regression (SPR), builds equations to describe relationships between data points. It starts simple and then refines the equation by adding or removing factors based on how much they improve the fit. This step-by-step process helps identify the key factors that most influence the outcome. There are different approaches within SPR, but popular ones involve adding or removing one factor at a time, either adding important ones first (forward selection) or removing unimportant ones first (backward elimination). Some approaches combine these techniques (Flom & Cassell 2007). Although forward selection and backward elimination are common in SPR, they have drawbacks: they do not consider how adding or removing one factor might affect other factors already included. For example, a seemingly important factor chosen first in forward selection might become less significant as more factors are added.

Similarly, a factor removed early in backward elimination might be relevant when others are excluded. To address this limitation, SPR utilizes a special forward selection process. At each step, it checks the significance of all previously chosen factors. If any no longer meet certain criteria, the method switches to backward elimination, removing variables one by one until all remaining ones are statistically significant. Then, it switches back to forward selection to find potentially important factors again. This back-and-forth approach helps SPR build a more robust model by constantly reevaluating the influence of each factor in the presence of others (Neter 1983).

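In the SPR process, a second-order polynomial model with three independent variables and one dependent variable (y) is utilized (Neter 1983):
$$y = b_0 + \sum_{i=1}^{3} b_i x_i + \sum_{i=1}^{3} b_{ii} x_i^2 + \sum_{i=1}^{2} \sum_{j=i+1}^{3} b_{ij} x_i x_j \tag{6}$$
where y is the predicted value of the dependent variable, b0 denotes the constant term, and bi, bii, and bij stand for the coefficients of the linear, quadratic, and interaction terms of the independent variables x.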

GEP model

GEP, introduced by Ferreira as an improvement on genetic algorithms, offers a faster route to finding solutions (de Almeida Peres et al. 2011). In experiments, it converges on solutions more quickly than older methods. Transparency is another key advantage, as GEP directly manipulates the program's building blocks (chromosomes). GEP works iteratively. First, it generates random program instructions (chromosomes). These instructions are then converted into tree structures representing the actual programs. Each program is evaluated for its performance (fitness), and successful ones are chosen to create new variations (offspring) with potentially improved performance through genetic operations. This cycle continues for several generations or until a program with the desired outcome is found (Ferreira 2004).

GEP uses a unique approach where individuals are encoded as fixed-length linear strings that are then transformed into flexible, nonlinear expression trees of varying shapes and sizes (as shown in Figure 3). These expression trees represent the actual program and are used to select new individuals. During reproduction, the original linear strings (chromosomes) are modified by genetic operators like replication, mutation, and rearranging segments. GEP utilizes a two-part language system called Karva, with separate languages for genes and expression trees. The explicit rules governing these trees allow for direct translation between the chromosome sequence and the resulting program, making GEP a highly transparent method (de Almeida Peres et al. 2011).
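The translation from a linear chromosome to an expression tree can be illustrated with a short sketch. It assumes the usual Karva reading rule, in which symbols are read left to right and the tree is filled level by level. The single-character symbols (with `Q` standing for square root), the function names, and the example gene are illustrative choices, not the genes of the model developed in this study.

```python
import math
import operator

# Function symbols mapped to (arity, implementation); names are illustrative.
FUNCS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
    "Q": (1, math.sqrt),
}

def decode_karva(kexpr):
    """Build an expression tree from a K-expression, level by level.

    Each node is a list [symbol, children]. Symbols left over after the
    tree is complete (the non-coding tail) are simply never read.
    """
    symbols = iter(kexpr)
    root = [next(symbols), []]
    level = [root]
    while level:
        next_level = []
        for node in level:
            arity = FUNCS[node[0]][0] if node[0] in FUNCS else 0
            for _ in range(arity):
                child = [next(symbols), []]
                node[1].append(child)
                next_level.append(child)
        level = next_level
    return root

def evaluate(node, env):
    """Evaluate a decoded tree; terminals are looked up in env."""
    symbol, children = node
    if symbol in FUNCS:
        return FUNCS[symbol][1](*(evaluate(c, env) for c in children))
    return env[symbol]

env = {"a": 5.0, "b": 4.0, "c": 6.0, "d": 2.0}
tree = decode_karva("Q*+-abcd")   # encodes sqrt((a + b) * (c - d))
value = evaluate(tree, env)
# A longer gene with unused tail symbols decodes to the same program.
value_with_tail = evaluate(decode_karva("Q*+-abcdaabb"), env)
```

Because the tail symbols are ignored, genetic operators can modify the fixed-length string freely while the decoded tree always remains syntactically valid.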
Figure 3

Expression tree for GEP model.


M5Prime (M5P) (M5 prime decision tree regression algorithm)

M5P trees, developed by Quinlan (1992), offer a distinctive approach to regression problems. They build a decision tree in which each leaf models a specific data segment using a simple LR model. Unlike traditional methods that require pre-dividing data, M5P trees can handle continuous values and complex relationships within the data itself, which allows them to capture even subtle patterns. To prevent overly complex trees, M5P employs a two-step process: it first builds a large tree and then prunes it by replacing sections with simpler LR models. M5P trees are, therefore, an effective tool for handling regression tasks.

KStar (K*) (instance-based classifier algorithm)

Traditional instance-based classifiers have limitations that are overcome by the K* method when used for regression tasks (Steppert et al. 2021). It determines the degree of similarity between data points by utilizing a special metric based on entropy, which expresses the likelihood of changing one point into another. This method handles symbolic data, real numbers, and even missing values – all frequent problems in practical applications – well. K* provides a theoretically sound and consistent data analysis method, making it a valuable tool for regression problems, in contrast to other methods that struggle with these complexities (Painuli et al. 2014).

Decision table

A decision table (DT) uses labeled training examples to determine the most likely class for newly collected data. Its essential components are the prelabeled examples with their corresponding labels and the conditions that define the classification criteria. Based on these criteria, the DT looks for an exact match when classifying a new, unlabeled instance. If a match is found, the label of the matching instance is assigned. If no match is found, the DT assigns the table's most prevalent class. In essence, DTs use labeled examples and predefined conditions to predict labels for new data (Ayati et al. 2019).
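The lookup rule described above can be sketched in a few lines. This is a simplified classification-style illustration with hypothetical feature values and labels. It is not the implementation used in this study, where the target (Cd) is numeric.

```python
from collections import Counter

class DecisionTable:
    """Exact-match lookup table with a majority-class fallback."""

    def fit(self, rows, labels):
        # Collect label votes for every distinct feature combination.
        votes = {}
        for row, label in zip(rows, labels):
            votes.setdefault(tuple(row), Counter())[label] += 1
        # Each table entry keeps its most frequent label.
        self.table = {key: c.most_common(1)[0][0] for key, c in votes.items()}
        # Fallback used when a new instance matches no entry.
        self.default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, row):
        return self.table.get(tuple(row), self.default)

# Hypothetical examples: (cycle number, angle in degrees) -> Cd class.
rows = [(1, 45), (1, 90), (2, 45), (2, 45)]
labels = ["high", "low", "low", "low"]
dt = DecisionTable().fit(rows, labels)

matched = dt.predict((1, 45))    # exact match found in the table
unmatched = dt.predict((2, 90))  # no match, falls back to the majority class
```

For a numeric target, a common adaptation is to store the mean target value of each matching group instead of a class label.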

In summary, the ML methods GEP, DT, K*, and M5P were chosen for evaluating the coefficient of discharge of a labyrinth sluice gate because of their ability to handle complex relationships and noise and to yield interpretable models. These characteristics make them suitable for the given application.

Statistical assessment

Several statistical indicators are used to assess the predictive performance of the models. Error-based indicators, namely the mean absolute error (MAE), relative absolute error (RAE), root relative squared error (RRSE), mean square error (MSE), and root mean square error (RMSE), measure how close a model's predictions are to the observed values (the lower the value, the better). Correlation-based indicators, namely the coefficient of determination (R2) and the Pearson correlation coefficient (R), measure how well the predictions follow the trend of the observed values. Together, Equations (7)–(13) give a complete picture of how accurately the models predict.
(7)
(8)
where n is the total number of data points, and xi, yi, and m represent the computed, target, and average values, respectively.
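The indicators named above can be computed as in the following sketch. It uses the common textbook definitions, which are assumed here and may differ in detail from the exact expressions used by the authors. The function name and the sample values are hypothetical.

```python
import math

def error_metrics(computed, target):
    """Return MAE, MSE, RMSE, RAE, RRSE and Pearson R for two sequences."""
    n = len(target)
    m = sum(target) / n                      # average of the target values
    abs_err = [abs(x - y) for x, y in zip(computed, target)]
    sq_err = [(x - y) ** 2 for x, y in zip(computed, target)]

    mae = sum(abs_err) / n
    mse = sum(sq_err) / n
    rmse = math.sqrt(mse)
    # Relative measures compare the model with always predicting the mean.
    rae = sum(abs_err) / sum(abs(y - m) for y in target)
    rrse = math.sqrt(sum(sq_err) / sum((y - m) ** 2 for y in target))

    mx = sum(computed) / n
    cov = sum((x - mx) * (y - m) for x, y in zip(computed, target))
    var_x = sum((x - mx) ** 2 for x in computed)
    var_y = sum((y - m) ** 2 for y in target)
    r = cov / math.sqrt(var_x * var_y)

    return {"MAE": mae, "MSE": mse, "RMSE": rmse,
            "RAE": rae, "RRSE": rrse, "R": r}

# Hypothetical values, not taken from the experimental database.
scores = error_metrics([1.0, 2.0, 3.0, 5.0], [1.0, 2.0, 3.0, 4.0])
```

RAE and RRSE are often reported as percentages, in which case the returned fractions are multiplied by 100.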

Derivative LR and SPR models

To build the LR and SPR models and evaluate their ability to generalize, the experimental database is split randomly into two sets, one for training and one for validation, with 50% of the experimental data allocated to each. The LR modeling process started with a multiple linear model, as presented in Equation (5). The fitted regression model is given in Equation (14).
$$C_d = 0.9776 - 0.2977\,\theta - 0.0368\,N + 0.02841\,\frac{H}{G} \tag{14}$$
where Cd is the coefficient of discharge, θ is the internal angle of the sluice gate, N denotes the cycle number, and H/G is the depth contraction ratio. Equation (14) underscores a pronounced influence of all considered variables on the coefficient of discharge. As illustrated in Table 4, the linear terms of all the studied variables were significant for predicting the discharge coefficient, with p-values less than 0.05 (Harith et al. 2022).
Table 4

ANOVA results of linear regression

| No. | Term | Coefficient | p-value |
|---|---|---|---|
| 1 | Constant | 0.9776 | |
| 2 | θ | −0.2977 | |
| 3 | N | −0.0368 | 0.025 |
| 4 | H/G | 0.02841 | |
Stepwise polynomial regression was utilized to construct the model, beginning with the second-order polynomial model given in Equation (6). The model underwent iterative refinement, excluding terms with statistically insignificant effects (p-value > 0.05). This process continued until only terms with significant statistical impact (p-value < 0.05) remained. Equation (15) presents the most streamlined model identified through this stepwise selection, comprising eight terms, and is the final regression model describing the relationship among the variables.
$$C_d = 0.8361 - 0.543\,\theta - 0.0155\,N + 0.10224\,\frac{H}{G} + 0.2033\,\theta^2 - 0.001669\left(\frac{H}{G}\right)^2 - 0.03884\,\theta\,\frac{H}{G} - 0.00271\,N\,\frac{H}{G} \tag{15}$$

Equation (15) establishes a nonlinear relationship between the response variable and the three investigated factors and shows that most of the investigated variables have a substantial influence on the coefficient of discharge, Cd. As shown in Table 5, the linear terms of θ and H/G significantly affect Cd, with p-values less than 0.05, whereas the variable N has an insignificant effect on Cd, indicated by a p-value greater than 0.05. Moreover, the quadratic terms for θ and H/G also show a statistically significant impact on Cd, suggesting these variables alone substantially influence the response. Additionally, the interaction between θ and H/G is statistically significant for Cd, while the interaction between N and H/G is not, as indicated by a p-value greater than 0.05.

Table 5

ANOVA results of stepwise polynomial regression

| No. | Type | Term | Coefficient | p-value |
|---|---|---|---|---|
| 1 | | Constant | 0.8361 | 0.000 |
| 2 | Linear | θ | −0.543 | 0.000 |
| 3 | Linear | N | −0.0155 | 0.295* |
| 4 | Linear | H/G | 0.10224 | 0.000 |
| 5 | Square | θ × θ | 0.2033 | 0.001 |
| 6 | Square | H/G × H/G | −0.001669 | 0.000 |
| 7 | Interactions | θ × H/G | −0.03884 | 0.000 |
| 8 | Interactions | N × H/G | −0.00271 | 0.136* |

Note. Values marked with * indicate an insignificant effect (p-value > 0.05).
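As a worked example, the coefficients listed in Tables 4 and 5 can be evaluated for a single operating point. The sketch below assumes that θ is expressed in radians, consistent with the range reported in Table 2. The function names and the chosen input values are hypothetical.

```python
def cd_linear(theta, n_cycles, h_over_g):
    """Linear model built from the Table 4 coefficients (theta assumed in rad)."""
    return 0.9776 - 0.2977 * theta - 0.0368 * n_cycles + 0.02841 * h_over_g

def cd_stepwise(theta, n_cycles, h_over_g):
    """Second-order model built from the Table 5 coefficients."""
    return (0.8361
            - 0.543 * theta
            - 0.0155 * n_cycles
            + 0.10224 * h_over_g
            + 0.2033 * theta ** 2
            - 0.001669 * h_over_g ** 2
            - 0.03884 * theta * h_over_g
            - 0.00271 * n_cycles * h_over_g)

# Hypothetical operating point: theta = 1.57 rad (about 90 degrees), N = 1, H/G = 5.
cd_lr = cd_linear(1.57, 1, 5.0)     # approximately 0.615
cd_spr = cd_stepwise(1.57, 1, 5.0)  # approximately 0.620
```

Both estimates fall inside the measured Cd range of 0.547–1.343, and the two models agree closely at this point. Larger differences can be expected where the quadratic and interaction terms dominate, for example at high H/G.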

Derivative of the GEP model

To construct the GEP model and assess its generalization capability, the experimental database is divided randomly into a training set and a validation set, each containing 50% of the experimental data. Regarding the GEP formulation, the first step is to select the fitness function. For this particular problem, the fitness (f_i) of an individual program i is initially measured by:
$$f_i = \sum_{j=1}^{C_t} \left( M - \left| C_{(i,j)} - T_j \right| \right) \tag{16}$$

where M is the range of selection, C(i,j) is the value returned by chromosome i for fitness case j, Tj is the target value for fitness case j, and Ct is the total number of fitness cases. If the precision |C(i,j) − Tj| is less than or equal to 0.01, it is taken as zero. Note that the system is capable of determining its ideal solution when given this type of fitness function (Ciftci et al. 2009). The second crucial step involves selecting the set of terminals (T) and the set of functions (F) to construct the chromosomes. The terminal set comprises the independent variables, namely T = {θ}, T = {N}, T = {H/G} and T = {θ, N, H/G}. Ascertaining the optimal function set, on the other hand, is more challenging. However, a well-founded estimate can be made to encompass all essential functions, as documented in de Almeida Peres et al. (2011). In this context, the function set is deliberately chosen to incorporate the four fundamental arithmetic operators ('+' addition, '−' subtraction, '×' multiplication, and '÷' division) alongside a selection of basic mathematical functions: powers (X2, X3, X4, X5), square root (Sqrt), power of 10 (Pow10), natural logarithm (Ln), base-10 logarithm (Log), and cubic root (3Rt).
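The absolute-error fitness described in this step can be sketched as follows. The sketch assumes the commonly used form in which each fitness case contributes the selection range minus the absolute error, with errors at or below the precision threshold treated as zero. The default value of `selection_range` is a hypothetical choice, since the value of M used in this study is not stated here.

```python
def gep_fitness(predicted, targets, selection_range=100.0, precision=0.01):
    """Sum over fitness cases of (M - |C - T|), with small errors zeroed."""
    total = 0.0
    for c, t in zip(predicted, targets):
        err = abs(c - t)
        if err <= precision:
            err = 0.0          # within precision: counted as a perfect hit
        total += selection_range - err
    return total

# Two fitness cases, both within precision: maximum fitness of 2 * M.
perfect = gep_fitness([1.0, 2.005], [1.0, 2.0])
# One fitness case with an absolute error of 0.5.
imperfect = gep_fitness([1.5], [1.0])
```

The maximum attainable fitness equals the number of fitness cases multiplied by the selection range, so reaching that value signals that every case has been matched within the chosen precision.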

Selecting the chromosomal tree's structure, namely the length of the head and the quantity of genes, is the third and most important step. At first, two distinct head lengths and a single gene were used in the GEP model. Then, as each model's training and testing results were tracked, the number of genes and head lengths gradually increased in each run. The study's number of genes and head length were ascertained after multiple trials, as shown in Table 6.

Table 6

GEP parameters adopted in the proposed model

| No. | Parameter definition | GEP |
|---|---|---|
| 1 | Function set | +, −, *, /, X2, X3, X4, X5, EXP, Sqrt, Pow10, Ln, Log, 3Rt |
| 2 | Number of chromosomes | 64 |
| 3 | Head size | |
| 4 | Number of genes | |
| 5 | Linking function | Addition |
| 6 | Generation without change | 2,000 |
| 7 | Number of tries | |
| 8 | Max complexity (Genes) | |
| 9 | Mutation rate | 0.00138 |
| 10 | Inversion rate | 0.00546 |
| 11 | One-point recombination rate | 0.00277 |
| 12 | Two-point recombination rate | 0.00277 |
| 13 | Gene recombination rate | 0.00277 |
| 14 | Gene transposition rate | 0.00277 |

The selection of the linking function is the fourth crucial step. The linking function in this study was an addition. The final and most crucial step is choosing the set of genetic operators that introduce variation and figuring out their rates. Table 6 provides the combinations of all genetic operators. Figure 3 depicts the expression tree of formulations (2)–(7) for the GEP model.
(12)
(13)
(14)
(15)

Derivative ML models

The methodology for ML techniques investigated in this paper is described in Figure 4, and the ML techniques employed in the study are DT, K*, and M5P. These techniques were categorized into rules algorithms, lazy-learning algorithms, and tree-based learning. The experimental database is divided into training and validation sets to construct the ML models and assess their generalization capability. The training and testing processes randomly select 50% of the experimental data for each set. The 50/50 split is a balanced approach that ensures the model is trained on a representative subset of the data while also providing a reliable evaluation of its performance on unseen data. This helps to prevent overfitting and improve the model's overall generalization ability.
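A random 50/50 partition of the 187 runs can be sketched as follows. The seed value and the use of integer indices as stand-ins for the experimental records are assumptions made for illustration. The actual random assignment used in the study is not specified.

```python
import random

def split_half(records, seed=42):
    """Shuffle a copy of the records and cut it into two halves."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

records = list(range(187))   # stand-in indices for the 187 experimental runs
train, test = split_half(records)
# With an odd number of runs the halves differ by one: 93 and 94 records.
```

Fixing the seed makes the partition repeatable, so that all models are trained and tested on the same subsets and their metrics remain comparable.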
Figure 4

Flowchart of the ML techniques.


Table 7 presents the performance evaluation metrics (Pearson R, MAE, RMSE, RAE, and RRSE) for the different models in the training and testing phases. These metrics are based on the difference between the actual and predicted values. Based on these indicators, the K* model estimates Cd (the coefficient of discharge) better than the other models during the training phase, and the M5P model outperforms the DT model. The models can therefore be ranked from best to worst during the training stage as K*, M5P, and DT.

Table 7

Performance of ML models for training and testing sets

| Statistical parameter | K* | M5P | DT |
|---|---|---|---|
| Training set | | | |
| Pearson R | 0.9875 | 0.9748 | 0.9673 |
| MAE | 0.0291 | 0.0307 | 0.0325 |
| RMSE | 0.0391 | 0.039 | 0.0427 |
| RAE | 21.39% | 22.57% | 23.85% |
| RRSE | 22.76% | 22.72% | 25.35% |
| Testing set | | | |
| Pearson R | 0.97 | 0.9731 | 0.7534 |
| MAE | 0.0436 | 0.0348 | 0.0709 |
| RMSE | 0.0579 | 0.0422 | 0.1234 |
| RAE | 29.95% | 23.92% | 48.69% |
| RRSE | 31.72% | 23.11% | 67.54% |

With the lowest MAE of 0.0348, RMSE of 0.0422, RAE of 23.92%, RRSE of 23.11%, and the highest Pearson R of 0.9731, the M5P model performs the best during the testing phase. During testing, the models can be ranked as M5P, K*, and DT, going from best to worst. According to these results, the M5P and K* models can predict Cd accurately for this dataset.

Various methodologies could be employed to evaluate the derived models. In the current study, the models are evaluated using common performance metrics and the Taylor diagram.

Taylor diagram

The Taylor diagram is a useful visual aid for evaluating prediction model performance. By displaying each model's closeness to the reference point, which stands for the actual values, it makes it possible to identify the most accurate and dependable model (Taylor 2001; Band et al. 2021). Three parameters determine a model's location on the Taylor diagram: the correlation coefficient (represented by the radial lines), the standard deviation (shown on the horizontal and vertical axes), and the RMSE (shown by the circular lines centered at the reference point). The most accurate model is the one that comes closest to the reference point (Band et al. 2021).
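The three quantities can share one diagram because they are linked by a law-of-cosines relation (Taylor 2001): the squared centered RMS difference equals the sum of the two variances minus twice the product of the standard deviations and the correlation coefficient. The sketch below checks this numerically. The function name and the sample values are hypothetical, and the RMSE on the diagram is taken to be the centered (mean-removed) RMS difference.

```python
import math

def taylor_stats(pred, ref):
    """Return (std of pred, std of ref, correlation, centered RMS difference)."""
    n = len(ref)
    mp = sum(pred) / n
    mr = sum(ref) / n
    # Population standard deviations (divide by n), as the identity requires.
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred) / n)
    sr = math.sqrt(sum((r - mr) ** 2 for r in ref) / n)
    corr = sum((p - mp) * (r - mr) for p, r in zip(pred, ref)) / (n * sp * sr)
    crmse = math.sqrt(
        sum(((p - mp) - (r - mr)) ** 2 for p, r in zip(pred, ref)) / n
    )
    return sp, sr, corr, crmse

# Hypothetical predicted and observed Cd values.
pred = [0.60, 0.75, 0.90, 1.10]
ref = [0.62, 0.70, 0.95, 1.05]
sp, sr, corr, crmse = taylor_stats(pred, ref)

# Law-of-cosines identity that underlies the diagram's geometry.
identity_rhs = sp ** 2 + sr ** 2 - 2 * sp * sr * corr
```

In the diagram, a model plotted at radius `sp` and angle `acos(corr)` therefore lies at distance `crmse` from the reference point, which is why proximity to that point summarizes all three statistics at once.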

Figure 5 shows the Taylor diagram used to compare the proposed prediction models. The GEP model lies closest to the reference point and is therefore the most accurate. The DT, M5P, K*, LR, and stepwise polynomial regression (SPR) models lie farther from the reference point (as indicated by the red box). Among all models, GEP, SPR, and M5P combine the highest coefficient of determination (R2), the lowest RMSE, and the smallest standard deviation. On this basis, GEP, SPR, and M5P are the most reliable techniques for predicting the discharge coefficient of the sluice gate.
Figure 5

Taylor diagram for experimental and predicted values.

A box plot is an effective visualization tool for evaluating how closely the predictions of the various models align with the actual data, because it allows their distributions to be compared side by side. It is therefore used to assess the models proposed in this study. Figure 6 shows that the DT model is the least aligned with the actual data, whereas the GEP model is the closest. This conclusion is based on box plot characteristics such as spread, range, distribution, and skewness.
Figure 6

Box plot for experimental and predicted values using all techniques.


Performance metrics

The most prevalent metrics for assessing a model's predictive accuracy are the coefficient of determination (R2), adjusted R2, predicted R2, p-value, and F-value. Table 7 presents the outcomes of the three ML techniques (K*, M5P, and DT) employed to estimate Cd. From a statistical standpoint, ML techniques with high R2 values and low error measures generally exhibit strong performance (Harith et al. 2021, 2023, 2024b). The results of a single-factor analysis of variance (ANOVA) indicate a significant difference among the various ML techniques. The statistical analysis in Table 7 demonstrates that the K* and M5P algorithms yield high-quality predictions, whereas the DT algorithm is less accurate, both in fitting the data and in generating predictions.

From highest to lowest accuracy in predicting Cd, the ML techniques rank as K*, M5P, and DT.

Comparing all techniques used in this study, GEP outperforms all other techniques, while DT achieves the lowest accuracy. The significance of the techniques was assessed with the F-test and the p-value. The p-values for all techniques were reported as 0.000, below the 0.05 threshold, confirming their significance. By convention, a model is considered significant if it has a high F-value (Gharehbaghi et al. 2023). As shown in Table 8, the F-values for GEP, K*, SPR, M5P, LR, and DT were 6120.88, 4247.12, 257.81, 3409.59, 118.54, and 562.18, respectively, confirming the significance of all techniques.

Table 8

ANOVA results for all techniques

Type of statistical function               LR        SPR       GEP       K*        M5P       DT
R2, %                                      78.05     94.95     97.07     95.83     94.85     75.24
Adjusted R2, %                             77.39     94.58     97.05     95.80     94.83     75.11
Predicted R2, %                            74.94     92.60     97.00     95.71     94.73     74.89
Difference between adj. R2 and pred. R2    2.45      1.98      0.05      0.09      0.10      0.22
p-value                                    0.000     0.000     0.000     0.000     0.000     0.000
F-value                                    118.54    257.81    6120.88   4247.12   3409.59   562.18

The adequacy of fit can be evaluated by inspecting the proportion of variance explained (R2) and confirming that the difference between the predicted R2 and the adjusted R2 does not exceed 20% (Harith 2023; Harith et al. 2024c). The R2 values were 97.07, 95.83, 94.95, 94.85, 78.05, and 75.24% for the GEP, K*, SPR, M5P, LR, and DT models, respectively, leaving residual variances of 2.93, 4.17, 5.05, 5.15, 21.95, and 24.76%. The GEP, K*, SPR, and M5P models thus captured a very high proportion of the variation, whereas the LR and DT models left more than a fifth of it unexplained.
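The statistics discussed here can be illustrated for the simplest case, an ordinary least-squares fit. The sketch below uses the textbook definitions of R2, adjusted R2, PRESS-based predicted R2, and the regression F-value; it is an assumption that these match the definitions behind Table 8, since the source does not name the software or formulas used. The synthetic inputs stand in for θ, N, and H/G and are hypothetical.

```python
import numpy as np

def fit_statistics(X, y):
    """R2, adjusted R2, predicted R2 (PRESS-based) and F-value for an
    ordinary least-squares fit y ~ intercept + X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = float(resid @ resid)
    sst = float(np.sum((y - y.mean()) ** 2))
    # Leverages (diagonal of the hat matrix) give leave-one-out residuals
    # without refitting: e_loo = e / (1 - h).
    hat_diag = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    press = float(np.sum((resid / (1.0 - hat_diag)) ** 2))
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    pred_r2 = 1.0 - press / sst
    f_value = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return r2, adj_r2, pred_r2, f_value

# Synthetic, hypothetical data standing in for theta, N and H/G.
rng = np.random.default_rng(0)
n_obs = 40
theta = rng.uniform(45.0, 90.0, n_obs)
cycles = rng.integers(1, 3, n_obs).astype(float)  # 1 or 2
h_over_g = rng.uniform(2.0, 6.0, n_obs)
X_fit = np.column_stack([theta, cycles, h_over_g])
y_fit = (0.4 + 0.002 * theta + 0.01 * cycles + 0.05 * h_over_g
         + rng.normal(0.0, 0.02, n_obs))

r2, adj_r2, pred_r2, f_value = fit_statistics(X_fit, y_fit)
gap = adj_r2 - pred_r2  # small gap suggests the fit is not overfitted
```

For models that are not linear in their parameters, such as tree-based or instance-based learners, the leverage shortcut does not apply and the predicted R2 would have to come from explicit leave-one-out or cross-validated refits.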

Statistical analysis confirms the strong performance of the GEP, K*, SPR, and M5P models. This indicates a significant correlation between the Cd values obtained from experiments and those computed by the models.

The measured and predicted values are compared in Figure 7(a)–7(f), and the plots agree with the statistical analysis above. Figure 7 reveals a tight clustering of data points around the central fitting line. Notably, most data points fall within a 10% margin of error relative to the experimental values for the GEP, K*, SPR, and M5P models. In contrast, the LR and DT models exhibit greater variation. Their errors (differences between predicted and actual values) are considerably larger and more scattered than those of the other models, with numerous data points deviating significantly from the central line and exceeding the 10% error threshold. These deviations indicate substantial discrepancies between the DT and LR models' predictions and the actual measurements.
Figure 7

Actual versus predicted results (train and test) for all technique models. (a) GEP, (b) K*, (c) M5P, (d) SPR, (e) DT, (f) LR.

Figure 8

SA concerning independent variables.


Overall, the GEP, K*, SPR, and M5P models demonstrate superior performance in accurately capturing the relationship between the measured and predicted Cd values.
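The 10% error margin used in the actual-versus-predicted comparison can be quantified as the share of points whose relative error stays within the band. The following is a minimal sketch; the function name and the sample values are hypothetical and are not taken from the experiments.

```python
import numpy as np

def share_within_band(actual, predicted, band=0.10):
    """Fraction of predictions whose relative error is within +/- band."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    rel_err = np.abs(p - a) / np.abs(a)
    return float(np.mean(rel_err <= band))

# Hypothetical Cd values, for illustration only.
cd_measured = [0.50, 0.60, 0.70, 0.80]
cd_estimated = [0.52, 0.70, 0.69, 0.70]
share = share_within_band(cd_measured, cd_estimated)
```

A share close to 1 corresponds to the tight clustering around the fitting line described for the better models, while a low share corresponds to widely scattered points.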

Sensitivity analysis

For the sensitivity analysis (SA), the ML model is preferred for its practicality, despite the superior predictive power of the gene expression programming (GEP) model. The SA examines the influence of each input variable on the dependent variable, Cd. To this end, Cd is evaluated while a single input variable is varied systematically and all others are held constant at their average values (Kiani et al. 2016). For instance, to analyze the effect of the cycle number, its value is varied from 1 to 2 while the other variables remain fixed at their averages. The prepared SA data are then fed into the trained ML model to compute Cd, from which an SA parameter is readily obtained for each input variable (Gandomi et al. 2011; Nguyen et al. 2019).
$$N_i = f_{\max}(x_i) - f_{\min}(x_i) \tag{16}$$
$$SA_i = \frac{N_i}{\sum_{j} N_j} \times 100 \tag{17}$$

For the input variable xi, the maximum and minimum computed Cd values are denoted as fmax(xi) and fmin(xi), respectively, obtained while the other input variables remain constant at their average values. Figure 8 presents the sensitivity of Cd to the internal angle (θ), cycle number (N), and water depth contraction ratio (H/G). As shown in Figure 8, the water depth contraction ratio is the predominant parameter, with an SA index of approximately 63.42%, followed by 31.96% for the internal head angle and 4.62% for the cycle number, indicating that the cycle number has only a minor effect on Cd.
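The one-at-a-time procedure described above can be sketched as follows. The sketch assumes that each SA index is the range of predicted Cd for one input, fmax(xi) − fmin(xi), divided by the sum of the ranges over all inputs and expressed in percent, which is consistent with the reported indices summing to 100%. The linear `toy_cd_model` is a hypothetical stand-in for the trained model, and the θ and H/G ranges are illustrative values only.

```python
import numpy as np

def sensitivity_indices(predict, X, n_steps=50):
    """One-at-a-time sensitivity: sweep each input over its observed range
    while holding the others at their mean, then normalize the output ranges."""
    X = np.asarray(X, dtype=float)
    baseline = X.mean(axis=0)
    ranges = []
    for i in range(X.shape[1]):
        sweep = np.tile(baseline, (n_steps, 1))
        sweep[:, i] = np.linspace(X[:, i].min(), X[:, i].max(), n_steps)
        cd = predict(sweep)
        ranges.append(cd.max() - cd.min())  # f_max(x_i) - f_min(x_i)
    ranges = np.asarray(ranges)
    return 100.0 * ranges / ranges.sum()

def toy_cd_model(X):
    """Hypothetical stand-in for a trained model; NOT the model of this study."""
    theta, cycles, h_over_g = X[:, 0], X[:, 1], X[:, 2]
    return 0.4 + 0.002 * theta + 0.01 * cycles + 0.05 * h_over_g

# Columns: theta (degrees), N, H/G. Illustrative values only.
X_sa = np.array([[45.0, 1.0, 2.0],
                 [60.0, 1.0, 4.0],
                 [90.0, 2.0, 6.0]])
sa = sensitivity_indices(toy_cd_model, X_sa)
# For this toy model the indices are about 30%, 3.3% and 66.7%.
```

The indices only rank inputs relative to each other over the sampled ranges, so they depend on how widely each variable was varied in the experiments.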

Comparison with previous study

After identifying the best model from the evaluations above, it is important to compare it with published models to assess its added contribution. To this end, and to give hydraulic design engineers a reliable model with acceptable accuracy, the discharge coefficient (Cd) values predicted by the nonlinear regression (NLR) model of Hashem et al. (2024) were compared with those from the GEP model over the same range of experimental data. Figure 9 shows that the GEP model outperforms the NLR model: the distribution of its predictions is closer to that of the actual Cd, indicating that the NLR model is inadequate over part of the data range. Figure 10 illustrates how well the models capture the observed data, while Figure 11 presents statistical metrics to assess reliability and accuracy. The two models differ markedly in performance: R2, MAPE, and RMSE are 90.66%, 0.0595, and 0.0546 for the NLR model versus 97.07%, 0.0286, and 0.03034 for the GEP model. Consequently, the GEP model offers a robust and more accurate means of predicting the discharge coefficient.
Figure 9

Box plot for actual and predicted values adopting GEP and NLR proposed by Hashem et al. (2024).

Figure 10

Actual versus predicted Cd values of GEP and NLR proposed by Hashem et al. (2024).

Figure 11

Comparison of metric values for GEP and NLR models.
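A side-by-side comparison of two candidate models on R2, MAPE, and RMSE can be sketched as below. MAPE is returned as a fraction, matching how it is reported in this section, and R2 is assumed to be 1 − SSE/SST, since the source does not state which definition was used. The model names and values are hypothetical and do not represent the GEP or NLR predictions.

```python
import numpy as np

def compare_models(actual, predictions):
    """Return {name: (R2 in %, MAPE as a fraction, RMSE)} for each model."""
    a = np.asarray(actual, dtype=float)
    sst = np.sum((a - a.mean()) ** 2)
    results = {}
    for name, values in predictions.items():
        p = np.asarray(values, dtype=float)
        err = a - p
        r2_pct = 100.0 * (1.0 - np.sum(err ** 2) / sst)  # assumed definition
        mape = np.mean(np.abs(err / a))
        rmse = np.sqrt(np.mean(err ** 2))
        results[name] = (float(r2_pct), float(mape), float(rmse))
    return results

# Hypothetical Cd values, for illustration only.
cd_obs = [0.50, 0.60, 0.70, 0.80]
scores = compare_models(cd_obs, {
    "model_a": [0.52, 0.58, 0.73, 0.79],
    "model_b": [0.55, 0.54, 0.77, 0.74],
})
```

Reporting the three indices together is useful because R2 reflects explained variance, MAPE reflects relative error, and RMSE penalizes large individual deviations.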


This study successfully applied regression and ML techniques to predict the discharge coefficient (Cd) of labyrinth sluice gates. The results demonstrate the superiority of ML models, particularly GEP, in accurately forecasting Cd based on fundamental gate properties.

The SA revealed that the water depth contraction ratio (H/G) is the most influential variable, with an SA index of about 63%. The internal head angle (θ) follows at about 32%, while the cycle number (N) has a relatively minor impact of about 5%. These findings provide valuable insights for the design and operation of labyrinth sluice gates, enabling hydraulic engineers to accurately estimate Cd and optimize gate performance.

While the GEP model offers exceptional accuracy, expanding the training dataset with a wider range of experimental conditions would further enhance its versatility and applicability. This study contributes to the advancement of water resource management and flood control by providing a reliable and efficient tool for predicting the discharge characteristics of labyrinth sluice gates.

No funding was received for this work.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Abbaszadeh, H., Norouzi, R., Süme, V., Kuriqi, A., Daneshfaraz, R. & Abraham, J. (2023) Sill role effect on the flow characteristics (experimental and regression model analytical), Fluids, 8, 235. https://doi.org/10.3390/fluids8080235.
Alfatlawi, T. J. M., Hashem, T. & Hasan, Z. H. (2023) Discharge coefficient of symmetrical stepped and triangular labyrinth side weirs in a subcritical flow regime, Journal of Irrigation and Drainage Engineering, 149, 1–12. https://doi.org/10.1061/jidedh.ireng-9627.
Ayati, A. H., Ranginkaman, M. H., Bakhshipour, A. E. & Haghighi, A. (2019) Transient measurement site design in pipe networks using the decision table method (DTM), Journal of Hydraulic Structures, 5, 32–48. https://doi.org/10.22055/jhs.2019.29402.1107.
Band, S. S., Heggy, E., Bateni, S. M., Karami, H., Rabiee, M., Samadianfard, S., Chau, K. W. & Mosavi, A. (2021) Groundwater level prediction in arid areas using wavelet analysis and Gaussian process regression, Engineering Applications of Computational Fluid Mechanics, 15, 1147–1158. https://doi.org/10.1080/19942060.2021.1944913.
Ciftci, O. N., Fadıloğlu, S., Göğüş, F. & Guven, A. (2009) Genetic programming approach to predict a model acidolysis system, Engineering Applications of Artificial Intelligence, 22, 759–766. https://doi.org/10.1016/j.engappai.2009.01.010.
Daneshfaraz, R., Norouzi, R., Abbaszadeh, H., Kuriqi, A. & Di Francesco, S. (2022) Influence of sill on the hydraulic regime in sluice gates: An experimental and numerical analysis, Fluids, 7, 244. https://doi.org/10.3390/fluids7070244.
Daneshfaraz, R., Norouzi, R., Ebadzadeh, P. & Kuriqi, A. (2023) Influence of sill integration in labyrinth sluice gate hydraulic performance, Innovative Infrastructure Solutions, 8, 1–14. https://doi.org/10.1007/s41062-023-01083-z.
de Almeida Peres, M. A., de Alencar Barreira, I., Santos, T. C. F., de Almeida Filho, A. J. & de Oliveira, A. B. (2011) Teaching psychiatry and the disciplinary power of religious nursing hospice Pedro II during the second reign, Texto & Contexto Enfermagem, 20 (4), 1–9. https://doi.org/10.1590/s0104-07072011000400008.
Emami, S., Emami, H. & Parsa, J. (2023) LXGB: A machine learning algorithm for estimating the discharge coefficient of pseudo-cosine labyrinth weir, Scientific Reports, 13, 1–14. https://doi.org/10.1038/s41598-023-39272-6.
Fatehi-Nobarian, B. & Moradinia, S. F. (2024) Wavelet-ANN hybrid model evaluation in seepage prediction in nonhomogeneous earth dams, Water Practice & Technology, 99, wpt2024152.
Fatehi-Nobarian, B., Nourani, V. & Ng, A. (2023) Application of meta-heuristic methods in the optimization of geometrical sections in trapezoidal channels in jump energy loss, AQUA – Water Infrastructure, Ecosystems and Society, 72 (8), 1539–1552.
Ferreira, C. (2004) Gene Expression Programming and the Evolution of Computer Programs. In: de Castro, L. N. & Von Zuben, F. J. (eds.), Recent Developments in Biologically Inspired Computing, Deerfield Beach, FL: Idea Group Publishing, pp. 82–103. https://doi.org/10.4018/978-1-59140-312-8.ch005.
Flom, P. L. & Cassell, D. L. (2007) Stopping stepwise: Why stepwise and similar selection methods are bad, and what you should use. In NorthEast SAS Users Group Inc 20th Annual Conference, Baltimore, MD, 11–14 November 2007, pp. 11–14.
Gandomi, A. H., Tabatabaei, S. M., Moradian, M. H., Radfar, A. & Alavi, A. H. (2011) A new prediction model for the load capacity of castellated steel beams, Journal of Constructional Steel Research, 67, 1096–1105. https://doi.org/10.1016/j.jcsr.2011.01.014.
Gharehbaghi, A., Ghasemlounia, R., Afaridegan, E., Haghiabi, A. H., Mandala, V., Azamathulla, H. M. & Parsaie, A. (2023) A comparison of artificial intelligence approaches in predicting discharge coefficient of streamlined weirs, Journal of Hydroinformatics, 25, 1513–1530. https://doi.org/10.2166/hydro.2023.063.
Ghorbani, M. A., Salmasi, F., Saggi, M. K., Bhatia, A. S., Kahya, E. & Norouzi, R. (2020) Deep learning under H2O framework: A novel approach for quantitative analysis of discharge coefficient in sluice gates, Journal of Hydroinformatics, 22, 1603–1619. https://doi.org/10.2166/hydro.2020.003.
Harith, I. K. (2023) Optimization of quaternary blended cement for eco-sustainable concrete mixes using response surface methodology, The Arabian Journal for Science and Engineering, 48, 14079–14094. https://doi.org/10.1007/s13369-023-08071-6.
Harith, I. K., Hassan, M. S. & Hasan, S. S. (2021) Liquid nitrogen effect on the fresh concrete properties in hot weathering concrete, Innovative Infrastructure Solutions, 7 (1), 127. https://doi.org/10.1007/s41062-021-00731-6.
Harith, I. K., Hussein, M. J. & Hashim, M. S. (2022) Optimization of the synergistic effect of micro silica and fly ash on the behavior of concrete using response surface method, Open Engineering, 12, 923–932. https://doi.org/10.1515/eng-2022-0332.
Harith, I. K., Hassan, M. S., Hasan, S. S. & Majdi, A. (2023) Optimization of liquid nitrogen dosage to cool concrete made with hybrid blends of nanosilica and fly ash using response surface method, Innovative Infrastructure Solutions, 8 (5), 138. https://doi.org/10.1007/s41062-023-01107-8.
Harith, I. K., Nadir, W., Salah, M. S. & Hussien, M. L. (2024a) Prediction of high-performance concrete strength using machine learning with hierarchical regression, Multiscale and Multidisciplinary Modeling Experiments and Design, 7, 4911–4922. https://doi.org/10.1007/s41939-024-00467-7.
Harith, I. K., Nadir, W., Salah, M. S. & Majdi, A. (2024b) Estimating the joint shear strength of exterior beam-column joints using artificial neural networks via experimental results, Innovative Infrastructure Solutions, 9 (2), 38. https://doi.org/10.1007/s41062-023-01351-y.
Harith, I. K., Abbas, Z. H., Hamzah, M. K. & Hussien, M. L. (2024c) Comparison of artificial neural network and hierarchical regression in prediction compressive strength of self-compacting concrete with fly ash, Innovative Infrastructure Solutions, 9 (3), 62. https://doi.org/10.1007/s41062-024-01367-y.
Hashem, T., Mohammed, A. Y. & Alfatlawi, T. J. (2024) Hydraulic characteristics of labyrinth sluice gate, Flow Measurement and Instrumentation, 96, 102556. https://doi.org/10.1016/j.flowmeasinst.2024.102556.
Ho, H. V., Nguyen, D. H., Le, X.-H. & Lee, G. (2022) Multi-step-ahead water level forecasting for operating sluice gates in Hai Duong, Vietnam, Environmental Monitoring and Assessment, 194 (6), 442. https://doi.org/10.1007/s10661-022-10115-7.
Kiani, B., Gandomi, A. H., Sajedi, S. & Liang, R. Y. (2016) New formulation of compressive strength of preformed-foam cellular concrete: An evolutionary approach, Journal of Materials in Civil Engineering, 28, 04016092. https://doi.org/10.1061/(asce)mt.1943-5533.0001602.
Lauria, A., Calomino, F., Alfonsi, G. & D'Ippolito, A. (2020) Discharge coefficients for sluice gates set in weirs at different upstream wall inclinations, Water, 12, 245. https://doi.org/10.3390/w12010245.
Mansoor, T. (2014) Free flow below skew sluice gate, International Journal of Engineering Research and Development, 10 (3), 44–52.
Mohammed, A. Y. & Khaleel, M. S. (2013) Gate lip hydraulics under sluice gate, Modern Instrumentation, 2, 16–19. https://doi.org/10.4236/mi.2013.21003.
Mohammed, A. Y. & Sharifi, A. (2023) Estimating discharge coefficient of triangular free overfall using the GMDH technique, Water Supply: The Review Journal of the International Water Supply Association, 23, 3775–3788. https://doi.org/10.2166/ws.2023.218.
Mohammed, A. Y. & Sihag, P. (2024) Estimating critical depth and discharge over sloping rough end depth using machine learning, Journal of Hydroinformatics, 26 (3), 626–640. https://doi.org/10.2166/hydro.2024.242.
Naisheng, L., Tuo, Y., Deng, Y. & He, T. (2021) Principal component analysis and support vector machine on ice entrainment through a sluice gate. https://doi.org/10.22541/au.163257463.35451271/v1.
Neter, J. (1983) Applied Linear Regression Models. Burr Ridge, IL: Richard D. Irwin.
Nguyen, T. V., Kashani, A., Ngo, T. & Bordas, S. (2019) Deep neural network with high-order neuron for the prediction of foamed concrete strength, Computer-Aided Civil and Infrastructure Engineering, 34, 316–332. https://doi.org/10.1111/mice.12422.
Painuli, S., Elangovan, M. & Sugumaran, V. (2014) Tool condition monitoring using K-star algorithm, Expert Systems With Applications, 41, 2638–2643. https://doi.org/10.1016/j.eswa.2013.11.005.
Quinlan, J. R. (1992) Learning with continuous classes. In Adams, A. & Sterling, L. (eds.) 5th Australian Joint Conference on Artificial Intelligence, Hobart, Tasmania, 16–18 November. Singapore: World Scientific.
Sadeq Maatooq, J. & Yaseen Ojaimi, T. (2014) Evaluation the hydraulic aspects of stepped labyrinth spillway, ETJ, 32, 2174–2185. https://doi.org/10.30684/etj.32.9A6.
Salmasi, F. & Abraham, J. (2020b) Expert system for determining discharge coefficients for inclined slide gates using genetic programming, Journal of Irrigation and Drainage Engineering-ASCE, 146, 06020013. https://doi.org/10.1061/(asce)ir.1943-4774.0001520.
Salmasi, F. & Abraham, J. (2022) Multiple nonlinear regression-based functional relationships of energy loss for sluice gates under free and submerged flow conditions, International Journal of Environmental Science and Technology, 19 (12), 11829–11842. https://doi.org/10.1007/s13762-022-04429-9.
Salmasi, F., Nouri, M., Sihag, P. & Abraham, J. (2020) Application of SVM, ANN, GRNN, RF, GP and RT models for predicting discharge coefficients of oblique sluice gates using experimental data, Water Supply, 21, 232–248. https://doi.org/10.2166/ws.2020.226.
Salmasi, F., Abraham, J. & Malekzadeh, F. (2022) Experimental investigation for determination of discharge coefficients for inclined slide gates and comparison with data-driven models, Iranian Journal of Science and Technology-Transactions of Civil Engineering, 46 (3), 2495–2509. https://doi.org/10.1007/s40996-022-00850-9.
Shivapur, A. V., Ish, M., Shesha Prakash, M. N. & Ish, M. (2005) Inclined sluice gate for flow measurement, ISH Journal of Hydraulic Engineering, 11, 46–56. https://doi.org/10.1080/09715010.2005.10514768.
Silva, C. O. & Rijo, M. (2017) Flow rate measurements under sluice gates, Journal of Irrigation and Drainage Engineering-ASCE, 143, 06017001. https://doi.org/10.1061/(asce)ir.1943-4774.0001177.
Steppert, M., Epple, P. & Malcherek, A. (2021) Sluice gate discharge from momentum balance, Journal of Fluids Engineering-Transactions of The ASME, 144 (4), 041101. https://doi.org/10.1115/1.4053351.
Taylor, K. E. (2001) Summarizing multiple aspects of model performance in a single diagram, Journal of Geophysical Research, 106, 7183–7192. https://doi.org/10.1029/2000jd900719.
Wang, L. & Diao, M. (2021) Discharge coefficient and head loss under sluice gates with small opening, Journal of Irrigation and Drainage Engineering-ASCE, 147, 04021054. https://doi.org/10.1061/(ASCE)ir.1943-4774.0001627.
Yan, X., Wang, Y., Fan, B., Mohammadian, A., Liu, J. & Zhu, Z. (2023) Data-driven modeling of sluice gate flows using a convolutional neural network, Journal of Hydroinformatics, 25, 1629–1647. https://doi.org/10.2166/hydro.2023.200.
Zheng, H., Niu, J., Zhu, J. & Lu, L. (2023) Flow prediction of a measurement and control gate based on an optimized back propagation neural network, Applied Sciences, 13, 12313. https://doi.org/10.3390/app132212313.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).