## Abstract

Gates in dams and irrigation canals are used to control discharge or regulate the water surface. To compute the discharge under a gate, the discharge coefficient (*C _{d}*) must first be determined precisely. This study investigates the effect of sill shape under a vertical sluice gate on *C _{d}*, using four artificial intelligence methods to estimate *C _{d}*: (i) random forest (RF), (ii) deep learning (DL), (iii) gradient boosting machine (GBM), and (iv) generalized linear model (GLM). A sluice gate along with twelve different forms of sills was fabricated and tested at the University of Tabriz, Iran. Different flow rates were considered in the hydraulic laboratory with four gate openings, giving a total of 180 runs. The results showed that installing a sill under the vertical gate has a positive effect on flow discharge. Sill shapes can be characterized by their hydraulic radius (*R _{s}*). Sensitivity analysis among the dimensionless parameters showed that *R _{s}*/*G* (the ratio of the hydraulic radius of the sill to the gate opening) has a significant role in the determination of *C _{d}*. A semi-circular sill shape has a more positive effect on the increase of *C _{d}* than the other shapes.

## HIGHLIGHTS

To compute the discharge under a gate, discharge coefficient (Cd) should be first determined precisely. From a novel point of view, this study investigates the effect of sill shape under the vertical sluice gate on Cd using four artificial intelligence methods.

A sluice gate along with twelve different forms of sills was fabricated and tested at the University of Tabriz, Iran.

The installation of sill under the vertical gate has a positive effect on flow discharge.

Sill shapes can be characterized by their hydraulic radius (Rs).

Sensitivity analysis among the dimensionless parameters showed that Rs/G (the ratio of the hydraulic radius of the sill to the gate opening) has a significant role in the determination of Cd. A semi-circular sill shape increases Cd by at least 12%. Comparison of the artificial intelligence models showed that the DL model was superior to the RF, GBM and GLM models in the estimation of Cd.

## INTRODUCTION

Gated structures are used to control discharge or water surface regulation in irrigation canals, rivers and water released from the dams (Alhamid 1999). Gates are extensively used in irrigation canals, the crest of dam spillways and the flow outlet from the lake to a river. Two common gate types are sluice (vertical or slide) and radial. Flow under the gate is either free surface flow or submerged flow, each having different discharge equations. In the case of free flow conditions, the upstream head is an important factor and in submerged conditions, both upstream and downstream heads are essential for discharge determination (Henry 1950). Estimation of the flow discharge under gates is an essential issue in many water engineering projects. The accurate estimation of such a flow discharge requires a rational discharge coefficient selection.

Characteristics of the flow under gates have been extensively studied by many researchers, among others Henry (1950), Henderson (1966), Rajratnam & Subramanya (1967), Rajratnam (1997), Swamee (1992) and Ohatsu & Yasuda (1994). Figure 1 shows a longitudinal cross-section of a vertical sluice gate with a circular sill in free flow condition. Shivapur & Shesha Prakash (2005) showed experimentally that an inclined sluice gate can increase the flow discharge. They tested four inclination angles of 0 (vertical), 15, 30, and 45° from the vertical axis, with the gates inclined in the upstream direction and installed in a laboratory flume. This result occurs because the flow under an inclined gate contracts more upstream of the gate than it does under a conventional vertical gate.

Rajaratnam & Humphries (1982) carried out an experimental study on the characteristics of the flow immediately upstream of a vertical sluice gate located perpendicularly across the full width of a rectangular channel. In that study, the geometrical properties of the surface eddy, the pressure defect on the bed and the velocity field in the converging or jet forming region were studied. Sarhan (2013) conducted an experimental study in a laboratory flume on submerged flow passing through the opening between the sill and the gate. Four trapezoidal sill models of different heights and one case without a sill were used, and each of the five groups was run with four different gate openings. The value of *C _{d}* ranged from 0.34 to 0.77, with a standard error of 0.0064. Nasehi Oskuyi & Salmasi (2012) employed energy and momentum equations to calculate unknown parameters for sluice gates. They solved these nonlinear equations simultaneously and generated 5,200 data points. By comparing the results with Henry's (1950) diagram, they found a mean absolute percentage error (MAPE) of 21.54%.

Salmasi & Abraham (2020) conducted a series of laboratory experiments to determine the discharge coefficient (*C _{d}*) for inclined slide gates. These tests and models were used to evaluate both free and submerged flows. Inclination angles of 0, 15, 30 and 45° were studied with different gate openings. The collected data were used to develop equations for predicting *C _{d}*. The results show that inclination of slide gates has a progressive effect on *C _{d}* and increases capacity under the gate. The increase in *C _{d}* relates to the convergence of the flow through the gate opening. The equation produced via genetic programming (GP), with R^{2} and RE of 0.9431 and 0.0014, had optimal efficiency compared to classical multiple regression models. A comparison with other studies for inclination angles of 45 and 60° was also conducted.

In addition, some recent studies showed that the presence of a sill under the gate can improve the flow discharge (e.g. Saad 2007; Salmasi *et al.* 2019). This is due not only to the greater flow contraction under the gate, but also to the aerodynamic shapes of these sills, which guide the flow of water under the gate.

Note that Salmasi *et al.* (2019) reported a laboratory study of the effect of sills on the discharge coefficients of radial gates, whereas the present study deals with the discharge coefficient of sluice (vertical) gates.

The discharge per unit width (*q*) is calculated as Equation (1):

$$q = C_d \, G \sqrt{2g(H - Z)} \tag{1}$$

where *q* is the discharge per unit width of canal; *C _{d}* is the discharge coefficient; *G* is the gate opening; *g* is the acceleration due to gravity; *H* is the upstream water depth; and *Z* is the sill height.
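As a rough illustration of how the discharge coefficient enters the computation, the sketch below evaluates the unit discharge for a gate with a sill. It assumes the common free-flow orifice form q = C_d G sqrt(2g(H - Z)) built from the variables defined above; the function name and the sample values are hypothetical and are not taken from the experiments.

```python
import math

G_ACCEL = 9.81  # acceleration due to gravity, m/s^2


def unit_discharge(cd, gate_opening, upstream_depth, sill_height=0.0):
    """Discharge per unit width q (m^2/s), assuming q = Cd * G * sqrt(2 g (H - Z))."""
    effective_head = upstream_depth - sill_height
    if effective_head <= 0:
        raise ValueError("upstream depth must exceed sill height")
    return cd * gate_opening * math.sqrt(2.0 * G_ACCEL * effective_head)


# Hypothetical values: Cd = 0.65, G = 0.04 m, H = 0.40 m, Z = 0.05 m
q = unit_discharge(0.65, 0.04, 0.40, 0.05)
print(f"q = {q:.4f} m^2/s")
```

Multiplying q by the gate width gives the total discharge; with Z = 0 the same function covers the non-sill case.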

A review of previous studies shows that the discharge coefficient (*C _{d}*) for vertical/sluice gates is usually provided in charts, the most well known being that of Henry (1950). The relation between *C _{d}* and the flow conditions is complex and non-linear, and this complexity increases when a sill is placed under the gate, because the geometric variables of the sill become additional parameters. The drawback of the classic methods for determining *C _{d}* is therefore that they provide charts instead of equations, which increases errors in reading *C _{d}* from the charts because interpolation is needed in most cases.

Known as a breakthrough in artificial intelligence techniques, deep learning demonstrates outstanding performance in applications such as speech recognition, image recognition, natural language processing (e.g. translation, understanding, question answering), multimodal image-text tasks, and games (e.g. AlphaGo). H_{2}O is an open source machine learning framework that includes deep learning, distributed random forest, gradient boosting machine and generalized linear models for classification and regression (Candel *et al.* 2016). The generalized linear model (GLM) is a generalization of standard linear regression and one of the scalable machine learning algorithms. H_{2}O's GLM algorithm fits GLMs to the data by maximizing the log-likelihood (Nykodym *et al.* 2018).

In this study, we aimed to investigate the effects of the shape of sills on the coefficient of discharge (*C _{d}*) in vertical slide gates under free-flow conditions. The existence of a sill under the gate complicates its hydraulic behavior; *C _{d}* will thus depend on the hydraulic characteristics of the flow as well as the geometry of the sill. To estimate *C _{d}* for both sill and non-sill gate conditions, four artificial intelligence methods were selected: (i) deep learning (DL), (ii) random forest (RF), (iii) gradient boosting machine (GBM), and (iv) generalized linear model (GLM). The applicability of novel machine learning techniques using the H_{2}O framework is of primary interest in our study. First, we employed deep learning, powered by deep neural networks, to determine the discharge coefficient (*C _{d}*) for gates in dams and irrigation canals. Unlike traditional machine learning algorithms, deep learning requires high-end machines. It is a term for artificial hierarchical neural networks that have recently proven remarkably robust and effective in various domains. Second, we adopted RF, a powerful classification and regression tool, which is currently an active research interest in many studies. RF is an ensemble learning technique in which the performance of several weak learners is boosted via a voting scheme. It refers to a classifier that uses multiple trees to train and predict the samples (Tian *et al.* 2019). Third, we applied GBM, which is a hybrid method that incorporates both boosting and bagging approaches. It is also an ensemble learning method, combining a set of weak learners to deliver improved predictive performance. Last, the novelty of this study is the evaluation of *C _{d}* in a vertical slide gate with sills under the gate and the estimation of *C _{d}* by means of deep learning under the H_{2}O framework.

## MATERIAL AND METHODS

### Experimental setting

Fluid mechanics is more heavily involved with experimental testing than other disciplines because the analytical tools currently available to solve the momentum and energy equations are not capable of providing accurate results. This is particularly evident in turbulent, separating flows. The solutions obtained by utilizing techniques from computational fluid dynamics with the largest computers available yield only fair approximations for turbulent flow problems, hence the need for experimental evaluation and verification. A scale model in hydraulic engineering (as opposed to analogue and mathematical models) uses the method of direct (physical) simulation of (hydraulic) phenomena, (usually) in the same medium as in the prototype. Models are designed and operated according to scaling laws, i.e. conditions that must be satisfied to achieve the desired similarity between model and prototype. The ratio of a variable in prototype to the corresponding variable in the model is the scale factor (Novak & Cabelka 1981).

In the vast majority of cases, design problems associated with hydraulic structures are investigated on geometrically similar scale models, operated according to the Froude law of similarity. The Reynolds number for smooth models should be such that it corresponds to the fully turbulent hydraulically rough prototype value to obtain the correct friction losses.
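Under Froude similarity with an undistorted model and the same fluid, a prototype-to-model length ratio λ implies velocity and time ratios of λ^0.5 and a discharge ratio of λ^2.5. The sketch below applies these standard relations; the scale ratio of 10 and the model discharge of 20 L/s are hypothetical inputs chosen only for illustration.

```python
def froude_scale_factors(length_ratio):
    """Prototype-to-model ratios under Froude similarity (same fluid, same g)."""
    return {
        "length": length_ratio,
        "velocity": length_ratio ** 0.5,
        "time": length_ratio ** 0.5,
        "discharge": length_ratio ** 2.5,
    }


def to_prototype_discharge(model_discharge, length_ratio):
    """Scale a model discharge up to the prototype (units are preserved)."""
    return model_discharge * froude_scale_factors(length_ratio)["discharge"]


# Hypothetical example: 20 L/s in the model at a length ratio of 10
factors = froude_scale_factors(10.0)
q_prototype = to_prototype_discharge(20.0, 10.0)  # in L/s
print(factors["discharge"], q_prototype)
```

Because the discharge ratio grows with the 2.5 power of the length ratio, even a modest scale ratio corresponds to a prototype discharge several hundred times the model discharge.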

The experiments were conducted in a flume with Plexiglas walls in the hydraulic laboratory at the University of Tabriz, Iran. The flume was 9.4 m long and 0.3 m wide. Its depth was 0.8 m over the first 4 m and 0.5 m over the remaining 5.4 m. The flume was equipped with two control gates: the vertical gate at the test section and a second gate at the end of the flume to control the downstream water depth. Flow was supplied by a pump with a maximum capacity of 50 L/s and was measured by a calibrated triangular weir downstream. Flow depths were measured with a point gauge with a precision of ±0.1 mm (Figure 2).

In addition to the non-sill case, 12 different sills were studied. Their cross-sections have five different shapes: triangular, trapezoidal, circular, semicircular, and a rounded upstream face with a triangular downstream face (Figure 3).

Nine of these sills were 5 cm high and the circular sills were 2.35, 3.3, and 8 cm in diameter. Flow rates between 12 and 26 L/s were tested with four gate openings, giving a total of 180 runs.

*C _{d}* is a function of the following parameters:

$$C_d = f(\rho, Q, b, g, \mu, H, Z, G, \text{sill shape factor}) \tag{2}$$

where $\rho$ is the specific mass of water (kg/m^{3}), *Q* is the discharge (m^{3}/s), *b* is the gate width (m), *g* is the acceleration due to gravity (m/s^{2}), $\mu$ is the dynamic viscosity of water (N.s/m^{2}), *H* is the upstream water depth, *Z* is the sill height, *G* is the gate opening, and the last argument is the sill shape factor. In addition, the sill shape factor can be related to the sill wetted perimeter (*p*) and sill hydraulic radius (*R _{s}*) as Equation (3):

$$R_s = \frac{A}{p} \tag{3}$$

where *A* is the cross-sectional area of the flow and *p* is its wetted perimeter. After some simplification and neglecting the Reynolds number, Equation (4) can be obtained for the calculation of the discharge coefficient:

$$C_d = f\left(\frac{Z}{G}, \frac{H}{p}, \frac{R_s}{G}, \frac{R_s}{H}\right) \tag{4}$$

The main advantages of a dimensional analysis of a problem are:

It reduces the number of variables in the problem by combining dimensional variables to form non-dimensional parameters, thus reducing the amount of experimental data required to make correlations of physical phenomena to scalable systems.

It allows units to be changed from one system to another.

It provides scaling laws, which allow models to be tested instead of expensive full-scale prototypes. There are rules for finding scaling laws or conditions of similarity. According to the principles of dimensional analysis, any prototype can be described by a series of Pi terms or groups that describe the behavior of the system. Common dimensionless groups in fluid mechanics include the Reynolds number, Froude number, Euler number and Mach number.
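Returning to the dimensionless groups used for the discharge coefficient, the following sketch computes the sill hydraulic radius R_s = A/p and the ratios Z/G, H/p, R_s/G and R_s/H from dimensional quantities. The function names and all numerical values are hypothetical; A and p are passed in directly, so no particular sill geometry is assumed.

```python
def hydraulic_radius(area, wetted_perimeter):
    """Hydraulic radius R_s = A / p."""
    if wetted_perimeter <= 0:
        raise ValueError("wetted perimeter must be positive")
    return area / wetted_perimeter


def dimensionless_inputs(H, G, Z, area, wetted_perimeter):
    """Return the four dimensionless groups as a dictionary."""
    rs = hydraulic_radius(area, wetted_perimeter)
    return {
        "Z/G": Z / G,
        "H/p": H / wetted_perimeter,
        "Rs/G": rs / G,
        "Rs/H": rs / H,
    }


# Hypothetical values in metres (area in square metres)
groups = dimensionless_inputs(H=0.40, G=0.04, Z=0.05, area=0.002, wetted_perimeter=0.10)
for name, value in groups.items():
    print(name, round(value, 4))
```

Working with these ratios rather than the raw lengths keeps the inputs independent of the unit system and of the model scale.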

In an effort to avoid scale effects to determine weir head–discharge relationships, various recommendations have been provided to limit minimum model size (weir height, thickness, and crest radius), flow depth, or dimensionless similitude numbers. For example, when considering scaled models in the laboratory, Castro-Orgaz & Hager (2014) recommended a minimum crest radius (*R*) of 10 mm. Curtis (2016) recommended that the weir height (*P*) be greater than 76 mm. Falvey (2002) recommended that *P* be greater than 300 mm, and that ratios of piezometric head to *P* (*h*/*P*) less than 0.3 would result in discharge estimate errors exceeding +5%.

As mentioned previously, in this study the width of the sluice gate and of all sills is 0.3 m. The Reynolds number (*R _{e}*) is high enough that *R _{e}* can be neglected in the calculations. Similarity is based on the Froude number (*F _{r}*), and it is anticipated that prototype-to-model length ratios (scale ratios) of 5–10 (based upon Froude modeling) are acceptable for this physical study of a sluice gate with sills under the gate.

### Deep learning

Deep learning in the H_{2}O framework is based on high-level artificial neural networks whose parameters are optimized via back-propagation techniques. Tuning of the H_{2}O parameters was carried out by 5-fold cross-validation on the learning data using the H_{2}O package. A multi-layer DL model with multiple hidden layers and the ‘tanh’ activation function was implemented and trained with stochastic gradient descent using back-propagation. Each neuron receives as input a weighted combination *α* of the *n* outputs of the neurons in the previous layer *l*, as given in Equation (5):

$$\alpha = \sum_{i=1}^{n} w_i x_i + b \tag{5}$$

with *w _{i}* denoting the weight of the output *x _{i}* and *b* the bias. The weighted combination of Equation (5) is transformed via an activation function *f*, so that the output signal *f*(*α*) is relayed to the neurons in the subsequent layer. The multilayer feed-forward neural network is trained by minimizing a loss function $L(W, B \mid j)$ for each training example *j*:

$$L(W, B \mid j) = \frac{1}{2} \left\lVert t^{(j)} - o^{(j)} \right\rVert_2^2$$

*W* is the collection $\{W_i\}_{1:N-1}$, where $W_i$ denotes the weight matrix connecting layers *i* and *i* + 1 for a network of *N* layers. Similarly, *B* is the collection $\{b_i\}_{1:N-1}$, where $b_i$ denotes the column vector of biases for layer *i* + 1. $t^{(j)}$ is the targeted value and $o^{(j)}$ is the observed value.

The Gaussian distribution, defined by a continuous probability density, is suited to continuous targets. In this study, the DL model consists of three hidden layers, and different numbers of hidden units were tried. The following six parameters were set: activation function = tanh, sparse = true, hidden layers = three layers with (10, 10, 10) hidden neurons, epochs = 500, nfolds = 5, and distribution = Gaussian.
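The weighted combination and activation described in this section can be traced in a few lines. The sketch below is a plain-Python forward pass through a network with four inputs, three hidden layers of 10 tanh neurons and one linear output. It is not the H_{2}O implementation: the weights are random rather than trained, and the helper names and the input vector are hypothetical.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Propagate x through the network: tanh on hidden layers, linear output."""
    signal = x
    for index, (weights, biases) in enumerate(layers):
        # weighted combination alpha = sum_i w_i * x_i + b for every neuron
        alpha = [sum(w * s for w, s in zip(row, signal)) + b
                 for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        signal = alpha if is_output else [math.tanh(a) for a in alpha]
    return signal


rng = random.Random(0)
sizes = [4, 10, 10, 10, 1]  # four inputs, three hidden layers of 10, one output
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]
prediction = forward([1.25, 4.0, 0.5, 0.05], layers)  # hypothetical input vector
print(prediction)
```

Training would adjust the weights and biases by back-propagation so that the single output approaches the measured discharge coefficient.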

### Random forest

The RF algorithm is one of the decision forest algorithms, which is a fusion of bagging and random subspace. It is considered as one of the most accurate classifiers and is explored for feature selection. Both classification and regression take the average prediction over all of their trees to make a final prediction, whether predicting for a class or numeric value. The number of trees is abbreviated as ntree. In the regression context, Breiman (2001) recommended setting *m _{try}* to be one-third of the number of predictors. For regression models, the prediction error is returned as a mean squared error (MSE). Three tuning parameters are used, i.e. the number of trees ntree = 100, their maximum depth = 10, and nfold = 5.

### Generalized linear models

The GLM estimates the parameter vector *β* by maximizing the log-likelihood for the observed data.

Linear regression with the gamma family is useful for modeling a positive continuous response variable, where the conditional variance of the response grows with its mean but the coefficient of variation of the response remains constant. It is usually used with the log link, $g(\mu) = \log(\mu)$, or the inverse link, $g(\mu) = 1/\mu$, which is equivalent to the canonical link. In this study, the GLM parameters were set to family = ‘gamma’, link = ‘inverse’ and nfolds = 5.
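To show what the choice of link function means in practice, the sketch below maps a linear predictor η = β_0 + Σ β_j x_j to the mean response under the inverse link (μ = 1/η) and the log link (μ = exp(η)). The coefficients and inputs are hypothetical and are not fitted values from this study.

```python
import math


def linear_predictor(beta0, betas, x):
    """eta = beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def mean_from_inverse_link(eta):
    """Inverse link g(mu) = 1/mu, so mu = 1/eta."""
    if eta <= 0:
        raise ValueError("eta must be positive for a positive gamma mean")
    return 1.0 / eta


def mean_from_log_link(eta):
    """Log link g(mu) = log(mu), so mu = exp(eta)."""
    return math.exp(eta)


# Hypothetical coefficients and two dimensionless inputs
eta = linear_predictor(1.2, [0.1, -0.05], [1.25, 0.5])
mu = mean_from_inverse_link(eta)
print(round(eta, 4), round(mu, 4))
```

With the inverse link the fitted coefficients act on 1/μ, so a positive coefficient lowers the predicted mean, which is worth keeping in mind when reading the fitted model.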

### Gradient boosting machine

The GBM is a machine learning technique that combines two powerful tools: gradient-based optimization and boosting. GBM is used for predictive results for regression or classification. It is an ensemble of tree models and provides considerably accurate results. GBM applies weak classification algorithms to incrementally change data and create a series of decision trees. We set six parameters: the number of trees, the learning rate, stopping rounds, distribution, depth of the tree, and nfold. The parameter values are ntrees = 100, learn_rate = 0.01, stopping_rounds = 5, distribution = ‘gamma’, max_depth = 20, nfolds = 5.
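The boosting idea can be sketched with one-split regression trees (stumps) fitted one after another to the current residuals. The example below uses a squared-error loss on a single predictor for simplicity, which is an assumption made for illustration and differs from the gamma distribution and deeper trees reported here; only ntrees = 100 and learn_rate = 0.01 follow the stated settings. The data and function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump (one split) on a single feature."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_trees=100, learn_rate=0.01):
    """Each stump fits the residuals (negative gradient of squared error)."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learn_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learn_rate, stumps


def predict_gbm(model, x):
    base, learn_rate, stumps = model
    return base + learn_rate * sum(predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]        # hypothetical predictor values
ys = [0.58, 0.62, 0.66, 0.70, 0.73, 0.75]  # hypothetical Cd values
model = fit_gbm(xs, ys)
mse_baseline = mse(ys, [model[0]] * len(ys))
mse_boosted = mse(ys, [predict_gbm(model, x) for x in xs])
print(mse_baseline, mse_boosted)
```

A small learning rate makes each tree contribute only a little, so the training error falls gradually as trees are added.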

### H_{2}O framework

The analysis was implemented in R (Team 2013) using the H_{2}O package, a parallel machine learning platform written in Java. It provides bindings via its representational state transfer application programming interface (RESTful API) to Java, Python, and R, as well as a web interface. It provides fast, scalable machine learning algorithms including deep learning, random forest, gradient boosting machine and generalized linear models for regression and classification. The three-step (*C _{d}*-H_{2}O) model is illustrated in Figure 4 and is described below:

#### Data collection and inputs

This initial step focuses on understanding the problem and the requirements of hydraulic structures. Experiments were performed in a rectangular flume with Plexiglas walls having a length of 9.4 m and a width of 0.3 m. This phase begins with data collection and proceeds to designing the model of *C _{d}* for both sill and non-sill gates. In this study, four input parameters (*Z*/*G*, *H _{1}*/*P*, *R _{s}*/*G*, and *R _{s}*/*H _{1}*) were used to calculate the *C _{d}* value. This phase includes all activities in constructing the final dataset from the initial raw data.

#### Data modelling

In this phase we divided our dataset into training, validation and testing datasets, comprising 60%, 25% and 15% of the entire dataset, respectively. A variety of modeling techniques were applied and their parameters were calibrated to optimal values. The DL, RF, GBM and GLM models were trained with the training dataset until reaching a satisfactory accuracy. The validation dataset was then applied to evaluate the model accuracy, and finally the model was applied to the testing dataset.
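One way to realize a 60/25/15 split of the 180 runs is sketched below. A shuffled index split with a fixed seed is an assumption, since the text does not state how the partition was drawn; the function name and seed are hypothetical.

```python
import random


def split_indices(n_samples, train_frac=0.60, valid_frac=0.25, seed=42):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_valid = round(n_samples * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


train, valid, test = split_indices(180)
print(len(train), len(valid), len(test))  # 108 45 27
```

For 180 runs this gives 108 training, 45 validation and 27 testing samples, with every run assigned to exactly one subset.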

#### Data analysis

In this phase, the predicted values are evaluated. We performed our experiments with a five-fold cross-validation approach to train our models with the training, validation and testing data. The experimental results are discussed in the Results and discussion section below. Several experimental runs were carried out to determine the best combination of activation function and randomly selected parameters in order to avoid overfitting and underfitting.
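The mechanics of five-fold cross-validation can be sketched as follows: the samples are partitioned into five folds, and each fold in turn is held out while the remaining four are used for fitting. Contiguous folds and the sample count of 108 are assumptions made for illustration; the function name is hypothetical.

```python
def k_fold_indices(n_samples, n_folds=5):
    """Yield (train_idx, holdout_idx) pairs for contiguous folds."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        holdout = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, holdout
        start += size


folds = list(k_fold_indices(108, 5))
print([len(holdout) for _, holdout in folds])  # [22, 22, 22, 21, 21]
```

Averaging the error over the five held-out folds gives a less optimistic estimate than the error on the data used for fitting, which is why the procedure helps in detecting overfitting.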

### Model performance assessment

The root mean square error (RMSE), mean absolute error (MAE), Nash–Sutcliffe coefficient (NS), Willmott's Index of agreement (WI) (Willmott *et al.* 2012) and Legate and McCabe's Index (LMI) (Legates & Mccabe 2013) were taken into consideration, with their respective mathematical expressions given in Equations (10)–(14). In these expressions, the observed value is the *C _{d}* obtained from the fabricated physical models, the estimated value is the *C _{d}* obtained from the artificial intelligence (AI) models, the averages of the observed and estimated values are taken over the data set, and *N* is the number of observations.
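For reference, the five criteria can be computed as in the sketch below. The formulas are the commonly used textbook definitions and are an assumption about what Equations (10)–(14) contain; in particular, the Willmott index is implemented in its original squared-difference form. The sample *C _{d}* values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, est):
    """Root mean square error."""
    return math.sqrt(_mean([(o - e) ** 2 for o, e in zip(obs, est)]))


def mae(obs, est):
    """Mean absolute error."""
    return _mean([abs(o - e) for o, e in zip(obs, est)])


def nash_sutcliffe(obs, est):
    """NS = 1 - sum((O - E)^2) / sum((O - mean(O))^2)."""
    mean_obs = _mean(obs)
    num = sum((o - e) ** 2 for o, e in zip(obs, est))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def willmott_index(obs, est):
    """WI = 1 - sum((O - E)^2) / sum((|E - mean(O)| + |O - mean(O)|)^2)."""
    mean_obs = _mean(obs)
    num = sum((o - e) ** 2 for o, e in zip(obs, est))
    den = sum((abs(e - mean_obs) + abs(o - mean_obs)) ** 2
              for o, e in zip(obs, est))
    return 1.0 - num / den


def legates_mccabe(obs, est):
    """LMI = 1 - sum(|O - E|) / sum(|O - mean(O)|)."""
    mean_obs = _mean(obs)
    num = sum(abs(o - e) for o, e in zip(obs, est))
    den = sum(abs(o - mean_obs) for o in obs)
    return 1.0 - num / den


observed = [0.58, 0.62, 0.66, 0.70, 0.74]   # hypothetical Cd values
estimated = [0.59, 0.61, 0.67, 0.69, 0.75]  # hypothetical model estimates
for metric in (rmse, mae, nash_sutcliffe, willmott_index, legates_mccabe):
    print(metric.__name__, round(metric(observed, estimated), 4))
```

RMSE and MAE are errors in the units of *C _{d}* (smaller is better), whereas NS, WI and LMI are dimensionless scores that equal 1 for a perfect match.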

## RESULTS AND DISCUSSION

In all experiments, the addition of a sill under the slide gate increased the discharge coefficient (*C _{d}*). For instance, Figure 5 compares the discharge coefficient of the gate with a circular sill of 5 cm in diameter with that of the non-sill case. The circular sill increased the discharge coefficient by at least 23% and at most 31%. In addition, in Figure 5, two equations with determination coefficients (R^{2}) equal to 0.87 and 0.91 were fitted for the gate with and without a sill, respectively.

For brevity, the relation between (*H*–*Z*)/*G* and *C _{d}* for the other sill shapes is not presented here, because the main focus of this study is an assessment of artificial intelligence performance in the prediction of *C _{d}*. Details of the hydraulics of flow under a vertical slide gate with a sill require computational fluid dynamics (CFD) simulation, which is beyond the scope of this study.

The total number of observed data is 180, covering gates with the 12 sills and without a sill. The hydraulic conditions (*H*, *G*, and *Z*) and the geometric characteristics of the sills (*R _{s}* and *p*) create strong interactions among the dimensionless parameters. Thus, in this study, the performance of the DL, RF, GBM and GLM models was investigated for the prediction of *C _{d}*.

In terms of the observed values in the training, validation and testing phases, the sluice gate showed *C _{d}* ranges of 0.5462–0.7818, 0.53715–0.7851 and 0.5739–0.7833, respectively. The values of *C _{d}* modeled using the DL model were in the ranges of 0.5833–0.7828, 0.5718–0.7791 and 0.5993–0.7838 in the training, validation and testing phases, respectively.

The *C _{d}* ranges of the RF model in the training, validation and testing phases are 0.5636–0.7640, 0.5635–0.7600 and 0.5902–0.7615, respectively. Similarly, the GBM model gives values of *C _{d}* between 0.6134–0.7379, 0.6134–0.7323 and 0.6134–0.7379 in the training, validation and testing phases, respectively.

The GLM model gives values of *C _{d}* between 0.5464–0.7938, 0.5595–0.7970 and 0.5730–0.7950 in the training, validation and testing phases, respectively. Overall, our results showed that the *C _{d}* values estimated by the DL model were closer to the range of the observed values than those obtained from the other models.

Tables 1 and 2 present the five criteria for the assessment of the DL, RF, GBM and GLM methods. The performance of the DL method is superior to the others in terms of RMSE and MAE. For example, the testing RMSE values for the DL, RF, GBM and GLM methods are 0.02071, 0.02380, 0.02364 and 0.03418, respectively. For the best DL model, the metrics are RMSE = 0.01281, MAE = 0.00976 for the training dataset, RMSE = 0.01565, MAE = 0.01118 for the validation dataset, and RMSE = 0.02071, MAE = 0.01192 for the testing dataset. Furthermore, the results indicate that the RF model is superior to the GBM and GLM models in terms of MAE, although its testing RMSE is slightly higher than that of GBM. The RMSE and MAE of the RF model for the testing dataset are 0.02380 and 0.01560, respectively.

Model | Accuracy criteria | Training phase | Validation phase | Testing phase |
---|---|---|---|---|
Deep Learning | RMSE | 0.012 | 0.015 | 0.020 |
Deep Learning | MAE | 0.009 | 0.011 | 0.011 |
Random Forest | RMSE | 0.015 | 0.017 | 0.023 |
Random Forest | MAE | 0.012 | 0.013 | 0.015 |
GBM | RMSE | 0.021 | 0.024 | 0.023 |
GBM | MAE | 0.018 | 0.020 | 0.019 |
GLM | RMSE | 0.028 | 0.028 | 0.034 |
GLM | MAE | 0.021 | 0.022 | 0.021 |


However, it is worth noting that the GBM model performed better on the training and testing datasets than on the validation dataset. Its training performance was slightly better than its testing performance, with RMSE = 0.02166, MAE = 0.01807 for training and RMSE = 0.02364, MAE = 0.01976 for testing. Moreover, the GLM model, with RMSE = 0.02892, MAE = 0.02200 and RMSE = 0.03418, MAE = 0.02190 in the validation and testing phases, respectively, shows the lowest accuracy among the considered models.

Scatter plots of the observed data against the estimates of the four artificial intelligence methods (DL, RF, GBM and GLM) are provided in Figure 6. The data points lie close to the line y = x for the DL, RF and GBM methods, whereas for the GLM method they are not close to the line y = x and show considerable scatter. Table 2 summarizes the NS (Nash–Sutcliffe coefficient), WI (Willmott's Index of agreement) and LMI (Legate and McCabe's Index) performance metrics for the DL, RF, GBM and GLM models using the testing dataset.

Model | Nash–Sutcliffe coefficient (NS) | Willmott's Index of agreement (WI) | Legate and McCabe's Index (LMI) |
---|---|---|---|
DL | 0.848 | 0.999 | 0.982 |
RF | 0.800 | 0.998 | 0.650 |
GBM | 0.803 | 0.958 | 0.557 |
GLM | 0.588 | 0.969 | 0.509 |


For comparison, the performance of the DL, RF, GBM and GLM methods in *C _{d}* estimation is shown as a Taylor diagram (TD) (Taylor 2001) in Figure 7. The best model is the one with the highest correlation (*r* = 1), closest to the blue circular symbol on the *x* axis. The TD relates three statistical parameters, shown as contours: the standard deviation (SD), the root mean square error (RMSE) and the correlation (*r*). Figure 7 shows that in the training phase the DL model is nearer to the blue point of observed data than the GBM, GLM and RF points. For the validation and testing phases, the performance of the DL and GBM models is approximately similar based on their distances from the blue point of observed data. Thus, both the DL and GBM models are successful in the validation and testing phases in the prediction of *C _{d}*.

Figure 8 shows the point density plots for observed and estimated values of *C _{d}* using the deep learning (DL) model. The figure shows that the main body of the density plot for the values obtained using the DL model is more similar to the observed ones than that obtained using the RF, GBM and GLM methods.
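The three statistics summarized by a Taylor diagram (Figure 7) are linked by a law-of-cosines relation (Taylor 2001): the squared centered RMS difference equals σ_o² + σ_e² − 2 σ_o σ_e r, where σ_o and σ_e are the standard deviations of the observed and estimated series. The sketch below computes the statistics and checks this identity; the sample values and the function name are hypothetical.

```python
import math


def taylor_statistics(obs, est):
    """Return (sd_obs, sd_est, r, centered_rmse) using population statistics."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_est = sum(est) / n
    dev_obs = [o - mean_obs for o in obs]
    dev_est = [e - mean_est for e in est]
    sd_obs = math.sqrt(sum(d * d for d in dev_obs) / n)
    sd_est = math.sqrt(sum(d * d for d in dev_est) / n)
    covariance = sum(a * b for a, b in zip(dev_obs, dev_est)) / n
    r = covariance / (sd_obs * sd_est)
    centered_rmse = math.sqrt(
        sum((b - a) ** 2 for a, b in zip(dev_obs, dev_est)) / n)
    return sd_obs, sd_est, r, centered_rmse


observed = [0.55, 0.60, 0.65, 0.70, 0.78]   # hypothetical Cd values
estimated = [0.58, 0.61, 0.64, 0.71, 0.77]  # hypothetical model estimates
sd_obs, sd_est, r, crmse = taylor_statistics(observed, estimated)

# Law of cosines that underlies the geometry of the diagram
law_of_cosines = sd_obs ** 2 + sd_est ** 2 - 2 * sd_obs * sd_est * r
print(abs(crmse ** 2 - law_of_cosines) < 1e-12)
```

This relation is why a single point on the diagram can encode all three quantities: its radius gives the standard deviation, its angle the correlation, and its distance from the observed point the centered RMS difference.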

The violin plot (Ruskeepaa 2009) was also employed to assess the model performance in estimating the values of discharge coefficient. The violin plot is categorized as a box plot with the integration of a kernel density plot. Figure 9 shows the violin plots for observed and estimated values of *C _{d}* using the DL, RF, GBM and GLM models. The figure clearly shows that the discharge coefficient estimated using DL resembles the observed relative discharge coefficient more than that obtained using the RF, GBM and GLM models.

Figure 10 presents a 3D plot indicating the relative absolute error of observed and estimated *C _{d}* values using the DL, RF, GBM and GLM models in three phases: (1) training, (2) validation and (3) testing stages. Figure 10 shows the relative absolute error (RAE) that is defined as the absolute value of the difference between the estimated value and the observed value using the RF, DL, GBM and GLM models. The minimum values of RAE for DL indicate many similarities of DL estimates with the observed data compared to the RF, GBM and GLM estimates.

Computational complexity theory is the study of the scalability of algorithms, both in general and in a problem-solving sense. Scalability is a characteristic of an organization, system, model, or function that describes its capability to cope and perform well under an increased or expanding workload or scope. Scalability is the measure of a system's ability to increase or decrease in performance and cost in response to changes in application and system processing demands. All the computational complexity of models is represented by Big O notation. However, the amount of memory needed for H_{2}O to run efficiently depends on the hidden layer numbers.

The model complexity of deep learning is O(m × hidden_{layer1} + hidden_{layer1} × hidden_{layer2} + hidden_{layer2} × hidden_{layer3} + hidden_{layer3} × output), where m is the number of inputs. Consider hidden layers of 10 × 10 × 10 neurons, with four numeric inputs and one output. The network has (4 × 10 + 10 × 10 + 10 × 10 + 10 × 1) = 250 weights in the modeling process, which indicates the complexity of deep learning.
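The weight count can be checked with a short calculation. The sketch below counts the connection weights and, separately, the bias terms of a fully connected network with four inputs, hidden layers of (10, 10, 10) neurons and one output; the function name is hypothetical.

```python
def count_parameters(n_inputs, hidden, n_outputs=1):
    """Return (number of connection weights, number of bias terms)."""
    sizes = [n_inputs] + list(hidden) + [n_outputs]
    weights = sum(a * b for a, b in zip(sizes[:-1], sizes[1:]))
    biases = sum(sizes[1:])
    return weights, biases


weights, biases = count_parameters(4, [10, 10, 10], 1)
print(weights, biases)  # 250 31
```

The count grows with the product of adjacent layer sizes, so widening the hidden layers increases memory and training cost much faster than adding inputs does.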

## CONCLUSION AND FUTURE WORK

In the present study, a total of 12 sills with different geometric sections were tested under a vertical slide gate. The sections were circular, semicircular, triangular, trapezoidal, and a rounded upstream face with a triangular downstream face. The dependent variable is the discharge coefficient, and the total number of laboratory data was 180. The performance of four artificial intelligence methods, namely random forest (RF), deep learning (DL), gradient boosting machine (GBM) and generalized linear model (GLM), was investigated. Our analysis showed that the presence of a sill under the vertical gate has a positive effect on the flow characteristics, in that it increases the coefficient of discharge. The performance of the DL method is better than that of the RF, GBM and GLM models. Moreover, the DL model avoided overfitting and underfitting, giving the highest accuracy in the training, validation and testing phases. It has shown higher robustness than conventional approaches. The main contribution of this paper is the development of an innovative deep learning approach using the H_{2}O framework to predict the coefficient of discharge. To the best of our knowledge, this is one of the few comprehensive studies that examine the efficiency of deep learning, random forest, gradient boosting machine and a generalized linear model for *C _{d}* analysis and prediction. As a direction for further research, we will apply the more recent H_{2}O-based Driverless AI and Sparkling Water frameworks to the data of this study.

The novelty of this study is that it explores the application of machine learning, in particular deep learning (DL), to key problems in hydraulics. A review of the previous literature indicated that DL has not yet been applied to the estimation of the discharge coefficient (*C _{d}*).

It can be noted that sill shapes are characterized by their hydraulic radius, defined by *R _{s}* = *A*/*p*, where *A* is the cross-sectional area of the flow and *p* is the wetted perimeter. Thus the proposed models can be useful for other shapes too. This work can also be extended to more sill shapes in the future.

## ACKNOWLEDGEMENTS

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.
