Abstract
In the present research, three data-driven models (DDMs) are developed to predict the discharge coefficient of streamlined weirs (Cdstw). Random Forest (RF), Adaptive Neuro-Fuzzy Inference System (ANFIS), and gene expression programming (GEP), drawn from machine-learning methods (MLMs) and intelligent optimization models (IOMs), are employed for this purpose. To identify the input variables of these DDMs, the most effective parameters among those potentially affecting Cdstw, including the geometric features of streamlined weirs, relative eccentricity (λ), downstream slope angle (β), and water head over the crest of the weir (h1), are determined by applying the Buckingham π-theorem and cosine amplitude analyses. By changing the architectures and fundamental parameters of these approaches, many scenarios are defined to obtain the best estimation results. According to statistical metrics and scatter plots, the GEP model is the superior method for estimating Cdstw, yielding an R2 of 0.97, a Total Grade (TG) of 20, an RMSE of 0.032, and an MAE of 0.024. In addition, the values of Cdstw computed with the mathematical equation generated by GEP in the best scenario are compared with the corresponding measured ones, and the differences are within 0–10%.
HIGHLIGHT
The discharge coefficient of streamlined weirs (Cdstw) was analyzed using intelligent models.
Random forest (RF), adaptive neuro-fuzzy inference system (ANFIS), and gene expression programming (GEP) methods are used and compared for predicting Cdstw.
The GEP model has been identified as a superior method for estimating Cdstw with high performance and accuracy.
NOMENCLATURE
- Cdstw: discharge coefficient of streamlined weirs (–)
- RF: random forest
- ANFIS: adaptive neuro-fuzzy inference system
- Lw: length of the weir (cm)
- Ww: height of the weir (cm)
- H1: total head over the crest of the weir (cm)
- Q: discharge (L/s)
- K: number of discrete trees in random forest
- GEP: gene expression programming
- RMSE: root mean square error (–)
- λ: relative eccentricity (–)
- yi: predicted values of Cdstw
- xi: observed values of Cdstw
- μx: average of observed values of Cdstw
- μy: average of estimated values of Cdstw
- β: downstream slope angle
- N: number of datasets
- R2: determination coefficient (–)
- Y1: upstream flow depth (cm)
- Y2: flow depth over the weir crest (cm)
- Y3: downstream flow depth (cm)
- h1: water head over the crest of the weir (cm)
- TG: total grade
- MBE: mean bias error (–)
- CC: Lin's concordance correlation coefficient
- MAE: mean absolute error (–)
- bw: crest width
- σx: standard deviation of observed values of Cdstw
- σy: standard deviation of estimated values of Cdstw
- ANNs: artificial neural networks
INTRODUCTION
Weirs diffuse rainwater and thus reduce erosion. In addition, water flows more slowly and infiltrates into the soil, which helps to create groundwater reserves available for agricultural use. Precise control of the water released over weirs is an effective method of precision irrigation (Kröger et al. 2011). Weirs are considered to be the most common hydraulic structure worldwide and are also commonly used to improve and develop artificial irrigation methods in barren valley areas. They are normally divided into three main groups, namely short-crested weirs, sharp-crested weirs, and broad-crested weirs. Short-crested weirs are further classified into three types, namely streamlined weirs, circular-crested weirs, and overflow (Ogee) weirs (Bos 1976). Streamlined weirs are a specific sort of short-crested weir inspired by the airfoil concept. Compared with other kinds of weirs, they offer advantages such as a more stable free surface with fewer fluctuations and, in particular, a high discharge coefficient (Cd).
The discharge coefficient (Cd) is introduced to represent the remaining energy losses that are not considered in the derivation, such as those due to turbulence, surface tension, viscous effects, and three-dimensional flow structures upstream of the weir plate (Aydin et al. 2011). An accurate value of Cd has a very significant impact on assessing the discharge over weirs. Therefore, it is important to compute Cd correctly.
Numerous investigations have been performed on different forms of weirs, most of them focusing on Cd. In this respect, many experimentally based formulations were recommended early on to determine Cd in open channels. Rajaratnam & Muralidhar (1971) performed a variety of precise measurements of the velocity and pressure fields in the curved section of flow in the vicinity of the crest of a rectangular sharp-crested weir. They remarked that the presented measurements would be useful in improving the assumptions made for curved open-channel flow. Saadatnejadgharahassanlou et al. (2017, 2020) investigated, experimentally and numerically, the hydraulic attributes of a special form of sharp-crested V-notch weir (SCVW). They reported that the SCVW outperformed normal weirs. Salmasi (2018) evaluated the impacts of downstream submergence and apron elevation on Cd of Ogee weirs. The results demonstrated that the head–discharge relationship was fairly independent of the downstream submergence when submergence levels were smaller than 0.8. Furthermore, Cd depended on the spillway crest and the vertical distance between the weir height and the downstream apron. Haghiabi et al. (2018) inspected the hydraulic attributes of a circular-crested stepped weir. They affirmed that this kind of weir might dissipate up to 90% of the flow energy. Abdollahi et al. (2017) simulated the flow field around labyrinth side weirs with guide vanes using OpenFOAM software. They reported that the peak Cd was attained when the vane plates were positioned perpendicular to the flow direction at the downstream end of the weirs across the main channel. Sutopo et al. (2022) examined the impact of spillway width on flow elevation at the weir crest on the basis of the design flood discharge for the Probable Maximum Flood (PMF) return period, using hydrological flood routing at the Cacaban Dam (Indonesia). The outcomes of the investigation showed that increasing the spillway crest width lowered the flow elevation at the spillway crest and consequently increased the outflow discharge. Yamini et al. (2020) evaluated the effect of hydrodynamic flow on flip bucket spillways for flood control in large dam reservoirs. They presented an equation to determine the pressure distribution, particularly the position of maximum dynamic pressure on the bed of flip buckets with a large radius, as a function of bucket geometry and flow depth.
Rao & Rao (1973) examined the performance of streamlined weirs (termed hydrofoil weirs) in both submerged and free overflow conditions. The experimental results confirmed a higher Cd in comparison with other kinds of weirs. Bagheri & Kabiri-Samani (2020a, 2020b) performed extensive experimental and numerical works to evaluate the characteristics of streamlined weirs. Experimental outcomes for steady flow discharge confirmed that the upstream flow heads on streamlined weirs with different relative eccentricities were practically constant, which indicated an almost fixed Cd as the relative eccentricity changed.
From the literature review, it can be seen that most researchers proposed nonlinear mathematical equations to compute Cd approximately. In these equations, Cd was expressed in terms of independent variables. Because of the nonlinearity involved, the recommended equations for Cd typically have definite limitations. Given the high variability of Cd, the weak ability of empirical formulas, and the extreme significance of Cd, researchers have turned to nonlinear schemes such as machine-learning models (MLMs) and intelligent optimization models (IOMs). MLMs and IOMs are known as powerful alternative techniques for solving ill-defined engineering problems and are particularly appropriate for addressing the complex and nonlinear behavior of Cd.
Among artificial intelligence (AI) algorithms, the random forest (RF) algorithm, adaptive neuro-fuzzy inference system (ANFIS), and gene expression programming (GEP) have a particular status. The RF algorithm was initially suggested by Breiman (1996) and is considered an admirable method with an extremely simple and flexible structure that is cost-effective to calibrate and highly accurate in forecasting. It is a developed kind of decision tree (DT) algorithm based on ensemble concepts, which uses numerous DTs to map relationships among highly nonlinear variables in big datasets and so solve various complicated engineering problems (Breiman 2001). GEP is a biologically inspired IOM that has a satisfactory capability to compute parameters with a nonlinear relationship. It has been widely employed for forecasting Cd in diverse weirs. The fuzzy logic (FL) method has particular significance in modeling and controlling the most intricate nonlinear systems (Zadeh 1993). The combination of fuzzy systems and artificial neural networks presented by Jang (1993) is called ANFIS. It is a multilayer feed-forward network in which each neuron performs a specific function on the received signals. Both square and circle node symbols are employed to characterize the different features of adaptive learning. To achieve the desired input–output behavior, the adaptive learning parameters are updated on the basis of the hybrid learning rule, which combines back-propagation gradient descent and the least square error technique (Jang 1993; Hanbay et al. 2009).
Recently, different AI methods have been developed to predict the discharge coefficient of diverse weirs such as labyrinth weirs (Norouzi et al. 2019; Zounemat-Kermani et al. 2019), gated piano key weirs (Akbari et al. 2019), side weirs on converging channels (Zarei et al. 2020), oblique weirs (Norouzi et al. 2020), rectangular and suppressed broad-crested weirs (Nourani et al. 2021), and SCVW (Gharehbaghi & Ghasemlounia 2022). Salmasi et al. (2013) examined Cd in the compound rectangular broad-crested weir (BCW) by employing AI approaches. The results confirmed that GEP was more precise than the other AI methods. Roushangar et al. (2018) computed Cd of stepped spillways under skimming flow and nappe regimes. The results affirmed that GEP had strong potential in modeling Cd with data acquired from physical models. Salazar & Crookston (2018, 2019) estimated Cd of arced labyrinth weirs by using various MLMs with input variables including the head over the crest, the angle of the cycle arc, and the angle of the cycle sidewall. They introduced linearity into their models and reported the superiority of RF over the other applied methods for forecasting Cd in the area of application. Kumar et al. (2020) and Aein et al. (2020) appraised Cd in the combined weir-gate and the piano key weir, respectively, by using several MLMs under different flow conditions. They declared that the best agreement between measured and forecasted values of Cd was obtained with the RF method. Chen et al. (2022) developed different traditional and hybrid machine-learning–deep learning (ML-DL) algorithms to forecast the discharge coefficient of streamlined weirs (Cdstw). The results confirmed that the proposed three-layer DL algorithm, comprising a convolutional layer combined with two subsequent gated recurrent unit (GRU) levels and hybridized with the linear regression (LR) method (i.e., LR-CGRU), markedly outperformed the algebraic equations presented by Bagheri & Kabiri-Samani (2020a) and Carollo & Ferro (2021).
Although computational fluid dynamics (CFD) has received considerable attention from both industry and academia for estimating variables in fluid domains, it suffers from computationally demanding processes and requires a deep theoretical understanding of fluid mechanics (Gharehbaghi 2016; Saadatnejadgharahassanlou et al. 2017; Dasineh et al. 2021; Gharehbaghi & Ghasemlounia 2022). In contrast to overflow and circular-crested weirs, investigations of flow features over streamlined weirs are scarce, and most characteristics of streamlined weirs are still unknown. To overcome the restrictions of empirical relationships and CFD models concerning geometric and hydraulic parameters for the discharge coefficient based on experimental or hydraulic models, in the present research Cdstw in steady, aerated, and free overflow situations in an open channel is predicted by using the RF, ANFIS, and GEP approaches. To this end, several scenarios are defined by combining several geometric and hydraulic parameters affecting the hydraulic operation of streamlined weirs and by tuning the structures and key hyperparameters of these methods. It should be mentioned that all key hyperparameters are chosen via a trial-and-error procedure to reach the ideal configuration of the methods used. In this regard, 120 observation data are employed in the mentioned approaches to evaluate Cdstw with regard to the dimensionless parameters that affect the estimation of Cdstw. The main contributions of this work are as follows:
To identify the most effective variables on Cdstw among a list of potential geometric and hydraulic parameters using preprocessing methods.
To develop suitable AI methods to compute Cdstw in steady, aerated, and free overflow situations in an open channel using the most effective variables specified.
To determine the optimal hyperparameter values and architectures of the developed models via an algorithm-tuning process, for better configuration and to reduce the effect of underfitting or overfitting problems.
To compare the results of the developed models against the experimental data and to distinguish the attributes of the optimum method through statistical evaluation metrics.
The remaining contents are organized as follows. Section 2 presents the experimental framework and measuring process. Section 3 presents the theoretical approach for the head–discharge equation for short-crested weirs and the Joukowsky transform function in streamlined weirs. Section 4 presents the application of dimensional analysis in finding potential parameters on Cdstw. Section 5 presents a sensitivity analysis to pick the most significant predictor variables on Cdstw for the proposed data-driven models (DDMs). Section 6 presents an overview of the DDMs developed to estimate Cdstw. Section 7 presents the performance evaluation metrics used to compare the models' performance in the estimation of Cdstw. Section 8 presents the statistical and graphical validation of the developed DDMs. Section 9 presents a performance comparison of the developed methods using a model scoring procedure. The last section concludes the current research.
EXPERIMENTAL FRAMEWORK
In Figure 1, Y1, Y2, and Y3 are the upstream flow depth, the flow depth over the weir crest, and the downstream flow depth (cm), respectively. Y1 and Y3 are gauged at the sections nearest to the structure where the curvature of the streamlines is negligible. Lw and Ww are the length and height of the weir (cm), respectively. Moreover, h1 and H1 are the water head and total head over the crest of the weir (cm), respectively. All discharges (Q) were gauged by using an electromagnetic flowmeter in the measuring basin with a precision of ±0.5%. The dimensions of the streamlined weirs and the range of applied observation data are provided in Table 1.
Cdstw (–) | h1 (cm) | β (°) | Q (L/s) | λ (–) | Lw (cm)
---|---|---|---|---|---
0.87–1.31 | 4–21.6 | 0, 30, 60 | 5.2–77.7 | 0.25, 0.5, 1, 1.25 | 40.2, 53.3, 64.1, 71.2
Interested readers can refer to Bagheri & Kabiri-Samani (2020a) for more details about the experimental setup.
THEORETICAL APPROACHES
As mentioned above, streamlined weirs are inspired by the airfoil concept. In the current paper, 12 streamlined weir prototypes constructed with the Joukowsky transform function are used for the experimental works. In practice, the Joukowsky transform function is employed to evaluate the main factors in the design of an airfoil. In this function, the relative eccentricity λ is a significant parameter that describes the weir geometry. Interested readers can refer to Bagheri & Kabiri-Samani (2020a) for more information about the concept of the Joukowsky transform function.
DIMENSIONAL ANALYSIS
DATA AND SENSITIVITY ANALYSIS
Because the performance of any simulation method in precisely predicting the target parameter depends chiefly on a suitable choice of predictor variables, an unsuitable choice could adversely influence the ability of the method. Thus, in the current research, the most significant predictor variables for the estimation of Cdstw by the DDMs are selected using cosine amplitude sensitivity analysis.
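As an illustration of the cosine amplitude method mentioned above, the following minimal sketch computes the strength of relation R_ij between one candidate input and the target in its commonly used form, R_ij = Σ x_ik x_jk / sqrt(Σ x_ik² · Σ x_jk²). The function name and the five sample values are hypothetical and serve only to show the calculation; they are not the measured data of this study.

```python
import math

def cosine_amplitude(x, y):
    """Strength of relation R_ij between two equally long data series."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator

# Hypothetical illustrative values (assumption), not the experimental records.
h1_over_Ww = [0.20, 0.45, 0.80, 1.30, 2.10]
cd_stw = [0.95, 1.02, 1.10, 1.18, 1.25]

r_ij = cosine_amplitude(h1_over_Ww, cd_stw)
print(f"R_ij = {r_ij:.3f}")
```

For non-negative data, R_ij lies between 0 and 1, and values closer to 1 indicate a stronger relation between the input and the target.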
Statistical indices | h1/Ww | λ | β | bw/Lw | Cdstw
---|---|---|---|---|---
Mean | 0.71 | 0.56 | 15 | 0.01 | 1.1
Minimum | 0.13 | 0.13 | 0 | 0.01 | 0.87
Maximum | 3.41 | 1 | 60 | 0.01 | 1.31
Std. deviation | 0.59 | 0.34 | 23.01 | 0 | 0.11
Coefficient of variation | 0.82 | 0.6 | 1.53 | 0.21 | 0.1
Skewness | 2.39 | 0.23 | 1.14 | 0.22 | −0.02
MODELING
Because the relationship between Cdstw and the parameters used in Equation (6) is complicated and nonlinear, precise modeling and analysis are required to cope with the data series. Accordingly, the ANFIS, RF, and GEP methods are developed to evaluate Cdstw. Both sides of Equation (6) are first normalized to zero mean and unit variance, as recommended by Lawrence et al. (1997). Then, 70% (84) of the recorded data are randomly assigned to the training phase and the remaining 30% (36) to the testing phase.
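The preprocessing described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the use of the population standard deviation, and the random seed are hypothetical choices, since the paper does not report how the normalization and the random split were implemented.

```python
import random
import statistics

def zscore(values):
    """Scale a series to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / std for v in values]

def train_test_split(n_samples, train_fraction=0.7, seed=0):
    """Return shuffled index lists for the training and testing phases."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is a hypothetical choice
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]

# 120 records split 70/30, as in the text.
train_idx, test_idx = train_test_split(120)
print(len(train_idx), len(test_idx))  # 84 36
```

Note that a series with zero spread cannot be scaled this way, because the standard deviation appears in the denominator.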
Overview of the RF model
RF is a well-developed and widespread ensemble technique that can be regarded as a forest containing numerous simple decision trees (DT) grown in parallel. RF is suitable for both forecasting and classification problems (Cutler et al. 2007).
By repeatedly resampling and varying the factors affecting the target parameter, the RF algorithm generates many decision trees, and all trees are then combined for the prediction task. As the number of trees grows, the impact of the overfitting problem and the error rate decrease. The algorithm uses a bagging process to pick random samples of parameters for the training dataset (Trigila et al. 2015). To establish a relationship among the different parameters, it partitions the dataset in the initial phase and then produces roots and leaf nodes in a downward path (Diaz-Uriarte & De Andres 2006). The chief task of each node and leaf is to specify features, which act as queries about the input and target parameters. The leaves of the trees are used to provide the set of responses (Al-Juboori 2019).
Model development
Because there is no standard for predetermining a suitable value of K for a given dataset, its optimal value should be found by trial and error to obtain the ideal structure of an RF model. Several scenarios with different step sizes are considered for the value of K. The R2 and RMSE of each scenario are employed as evaluation metrics, and the scenario with the maximum R2 and minimum RMSE is taken as the optimal one. In the current study, the inputs are the experimental data (i.e., h1/Ww, bw/Lw, β, λ) and the target is Cdstw; thus, the number of input variables, m, is set to 4.
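The trial-and-error search over K can be sketched with a small from-scratch forest of bagged regression trees. This is only an illustration of the idea, not the configuration used in this study: the synthetic data, the made-up target relation, the candidate K values, the minimum leaf size, and the number of features tried per split (`m_try`) are all hypothetical assumptions.

```python
import math
import random

def sse(values):
    """Sum of squared deviations from the mean (node impurity)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(rows, targets, m_try, rng, min_leaf=3):
    """Grow a regression tree; each split inspects m_try random features."""
    best = None
    for f in rng.sample(range(len(rows[0])), m_try):
        for threshold in sorted({r[f] for r in rows})[1:]:
            left = [t for r, t in zip(rows, targets) if r[f] < threshold]
            right = [t for r, t in zip(rows, targets) if r[f] >= threshold]
            if min(len(left), len(right)) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    if best is None:  # no admissible split: return a leaf (mean response)
        return sum(targets) / len(targets)
    _, f, threshold = best
    lo = [i for i, r in enumerate(rows) if r[f] < threshold]
    hi = [i for i, r in enumerate(rows) if r[f] >= threshold]
    return (f, threshold,
            build_tree([rows[i] for i in lo], [targets[i] for i in lo], m_try, rng, min_leaf),
            build_tree([rows[i] for i in hi], [targets[i] for i in hi], m_try, rng, min_leaf))

def predict_tree(node, row):
    """Follow the splits down to a leaf value."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node

def fit_forest(rows, targets, k_trees, m_try, seed=0):
    """Bagging: each of the K trees is grown on a bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(k_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], m_try, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

def rmse(pred, obs):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Synthetic stand-in data with m = 4 inputs (assumption, not the measurements).
rng = random.Random(1)
rows = [[rng.uniform(0.13, 3.41), rng.uniform(0.005, 0.015),
         rng.choice([0.0, 30.0, 60.0]), rng.choice([0.25, 0.5, 1.0])]
        for _ in range(120)]
targets = [0.9 + 0.1 * r[0] + 0.05 * r[3] for r in rows]  # made-up relation
train_x, test_x = rows[:84], rows[84:]
train_y, test_y = targets[:84], targets[84:]

results = {}
for k in (1, 10, 40):  # hypothetical candidate values of K
    forest = fit_forest(train_x, train_y, k, m_try=2)
    results[k] = rmse([predict_forest(forest, r) for r in test_x], test_y)
    print(f"K = {k:3d}  test RMSE = {results[k]:.4f}")
best_k = min(results, key=results.get)
print("selected K:", best_k)
```

In practice a library implementation would normally be preferred; the sketch only makes explicit how bagging, per-split feature sampling, and the comparison of scenarios by RMSE fit together.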
Overview of the ANFIS
Rule 1: If x is A1 and y is B1 Then f1=p1x+q1y+r1
Rule 2: If x is A2 and y is B2 Then f2=p2x+q2y+r2
In ANFIS, the rules are fixed, but the form and number of the membership functions (MFs) are optimized. To implement ANFIS, the number of input variables must be less than six (Kisi & Sanikhani 2015).
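The two rules above can be evaluated as in the following minimal sketch of first-order Sugeno inference, where the output is the firing-strength-weighted average of f1 and f2. The Gaussian membership functions, the product used for the "and" operator, and all parameter values are assumptions made only for illustration.

```python
import math

def gauss(x, center, sigma):
    """Gaussian membership grade (an assumed MF shape for this sketch)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_output(x, y, rules):
    """Weighted average of the linear consequents f_i = p_i*x + q_i*y + r_i."""
    weights, outputs = [], []
    for (a_center, a_sigma), (b_center, b_sigma), (p, q, r) in rules:
        # Firing strength of "x is A_i and y is B_i" (product T-norm, assumption).
        weights.append(gauss(x, a_center, a_sigma) * gauss(y, b_center, b_sigma))
        outputs.append(p * x + q * y + r)
    return sum(w * f for w, f in zip(weights, outputs)) / sum(weights)

# Hypothetical premise (A_i, B_i) and consequent (p_i, q_i, r_i) parameters.
rules = [
    ((0.0, 1.0), (0.0, 1.0), (1.0, 1.0, 0.0)),  # Rule 1
    ((2.0, 1.0), (2.0, 1.0), (2.0, 0.0, 1.0)),  # Rule 2
]
out = sugeno_output(1.0, 1.0, rules)
print(out)  # both rules fire equally here, so the output is the mean of f1 = 2 and f2 = 3
```

During training, ANFIS adjusts the premise parameters (MF centers and widths) and the consequent parameters (p, q, r); the sketch shows only the forward pass.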
Model development
In this study, the ANFIS toolbox in MATLAB 2019b with the Takagi–Sugeno FIS model is applied to predict Cdstw. In this regard, the dimensionless independent experimental parameters h1/Ww, bw/Lw, β, and λ are used as the input variables.
To organize the input datasets and create fuzzy rules in the ANFIS model, there are two general schemes, namely subtractive clustering (SC) and grid partition (GP). In this research, the GP method is utilized to generate the FIS. To obtain an appropriate ANFIS structure, suitable MFs and an optimal number of MFs should be employed for both the input and output datasets. However, the appropriate MFs and their optimal numbers have to be determined by trial and error. To this end, several different scenarios are defined by the user via a trial-and-error process to reach the ideal structure.
In total, eight different MFs for the input parameters, namely Trimf (triangular), Trapmf (trapezoidal), Gbellmf (generalized bell), Gaussmf (Gaussian), Gauss2mf (two Gaussian), Pimf (Pi-shaped), Dsigmf, and Psigmf, are employed to develop various scenarios. In all scenarios, the number of MFs for each input variable is set to 3 and a linear MF is selected for the output variable. Additionally, to identify the nonlinear input and linear output parameters when training the FIS, the hybrid algorithm is used as the optimization method with 100 epochs and zero error tolerance. The performance of the scenarios in the testing stage is evaluated by statistical metrics.
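For reference, four of the input MFs listed above can be written out as in the sketch below, following the usual parameter conventions of these functions (for example, a < b < c for the triangular shape). The parameter values in the usage lines are hypothetical. The last function counts the rules produced by grid partitioning, which combines every MF of every input; the example with four inputs assumes that all four dimensionless inputs are fed to ANFIS.

```python
import math

def trimf(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b (a < b < c)."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def trapmf(x, a, b, c, d):
    """Trapezoidal MF with feet at a and d and shoulders at b and c."""
    return max(min((x - a) / (b - a), 1.0, (d - x) / (d - c)), 0.0)

def gbellmf(x, a, b, c):
    """Generalized bell MF with width a, slope b, and center c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def gaussmf(x, sigma, c):
    """Gaussian MF with spread sigma and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def grid_partition_rules(n_inputs, mfs_per_input):
    """Number of fuzzy rules when every MF combination forms one rule."""
    return mfs_per_input ** n_inputs

# Hypothetical parameter values, for illustration only.
print(trimf(0.25, 0.0, 0.5, 1.0))        # halfway up the rising edge
print(trapmf(0.5, 0.0, 0.2, 0.8, 1.0))   # on the plateau
print(grid_partition_rules(4, 3))        # 3 MFs on each of 4 inputs
```

The rule count grows exponentially with the number of inputs, which is one practical reason for keeping the number of ANFIS inputs small.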
Overview of GEP
GEP is an iterative evolutionary intelligence algorithm introduced by Ferreira (2001) and derived from the Darwinian concept of evolution, with sufficient ability to predict complex relationships. In this technique, the best population is carefully selected; otherwise, a fresh population is generated to reach the ideal population.
The creation of separate entities (the expression tree and the genome) with different functions allows the algorithm to operate with great efficiency and to meaningfully outperform existing evolutionary methods.
The creation of the initial population is the first stage of the GEP algorithm. This is done randomly or with some knowledge of the problem. Subsequently, the chromosomes are expressed in the form of expression trees. The results are assessed through a fitness function to determine the suitability of a solution. Once a satisfactory value is reached, the evolution procedure is stopped and the best solution is reported. If the stop conditions are not fulfilled, the best individual of the current generation is retained. The procedure is repeated for a given number of generations until an optimum outcome is acquired (Ferreira 2001).
GEP model development
In this investigation, the GeneXproTools 4.0 program is employed to implement the GEP model. The values of Cdstw are estimated by using GEP with the following steps (Mehdizadeh et al. 2017):
1. In the first step, RMSE is selected as the fitness function.
2. The second step is to determine the variables and function sets used to produce the chromosomes. Here, the predictor variables in Equation (6) (i.e., h1/Ww, bw/Lw, β, and λ) are selected as input variables, and the target variable is Cdstw in Equation (6). The function set comprises the four primary mathematical operators {+, –, /, ×} and several mathematical functions, including x2, x3, ex, etc.
3. In the third step, the core structure of the chromosomes, such as the head size, the number of genes, and the number of chromosomes, is defined.
4. In the fourth step, a linking function is employed to link the expression trees of the individual genes. For this purpose, addition, subtraction, multiplication, and division are tested.
5. Finally, the Maximum Fitness criterion, equal to 5,000, is specified as the stop criterion.
In this study, numerous scenarios are defined by tuning the number of chromosomes (NC), the kind of linking function (LF), and the head size (HS) as hyperparameters. It should be mentioned that the ideal values of these hyperparameters are obtained through a trial-and-error method to reach an ideal GEP configuration.
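The steps above can be made concrete with a small sketch of how a GEP gene is decoded and scored. It follows the breadth-first (Karva) reading of a gene into an expression tree, links two genes with a linking function, and computes the RMSE of one gene against a few samples. The gene strings, the variable names `h` and `lam`, the restriction to the four arithmetic operators, and the sample values are hypothetical; the sketch does not reproduce GeneXproTools or the evolved equation of this study.

```python
import math
from collections import deque

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}

def decode(gene):
    """Read a gene breadth-first into a nested [symbol, children] tree."""
    nodes = [[symbol, []] for symbol in gene]
    queue = deque([nodes[0]])
    next_free = 1
    while queue:
        symbol, children = queue.popleft()
        arity = 2 if symbol in OPS else 0  # terminals take no arguments
        for _ in range(arity):
            child = nodes[next_free]
            next_free += 1
            children.append(child)
            queue.append(child)
    return nodes[0]  # symbols never reached form the non-coding tail

def evaluate(node, env):
    """Evaluate an expression tree for the variable values in env."""
    symbol, children = node
    if symbol in OPS:
        left, right = children
        return OPS[symbol](evaluate(left, env), evaluate(right, env))
    return env[symbol]

def chromosome_output(genes, env, link="+"):
    """Combine the sub-expression trees of all genes with the linking function."""
    values = [evaluate(decode(g), env) for g in genes]
    result = values[0]
    for value in values[1:]:
        result = OPS[link](result, value)
    return result

def rmse_fitness(gene, samples):
    """RMSE between gene output and targets (lower is better)."""
    tree = decode(gene)
    errors = [(evaluate(tree, env) - target) ** 2 for env, target in samples]
    return math.sqrt(sum(errors) / len(errors))

# Head size 3 with binary operators gives a tail of 3 * (2 - 1) + 1 = 4 symbols.
gene_1 = ["+", "*", "h", "h", "lam", "h", "lam"]    # decodes to (h * lam) + h
gene_2 = ["-", "lam", "h", "h", "h", "lam", "lam"]  # decodes to lam - h
env = {"h": 2.0, "lam": 3.0}
print(evaluate(decode(gene_1), env))
print(chromosome_output([gene_1, gene_2], env, link="*"))

samples = [({"h": 1.0, "lam": 0.5}, 1.5), ({"h": 2.0, "lam": 1.0}, 4.0)]
print(rmse_fitness(gene_1, samples))
```

The fixed head and tail lengths guarantee that every gene decodes to a valid expression, which is what lets mutation and recombination act freely on the linear genome.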
PERFORMANCE EVALUATION METRICS
RESULTS AND DISCUSSION
Experimental results
Based on the experimental results at β= 0, lowering λ decreased the weir height and consequently decreased Y1. Nevertheless, the differences in h1 were very slight. Thus, for a given Q, h1 was almost constant as λ changed. Apart from h1, Y3 and Y2 increased as λ was lowered. Furthermore, the ratio Y2/h1 increased to some extent as λ was lowered. Its average for the streamlined weirs was approximately 0.75, while for circular-crested weirs, it was around 0.7 (Jaeger 1956). The reason was that, by lowering λ, the curvature of the structure and of the streamlines on the weir crest decreased (Bagheri & Kabiri-Samani 2020a). As the streamline curvature decreased, the compression of the streamlines declined, and Y2 increased accordingly (Bagheri & Kabiri-Samani 2020a).
On the basis of the experimental results at β= 30° and 60° (base-block), increasing the weir height for λ= 1 substantially intensified the turbulence of the flow downstream of the weir. However, increasing the height of the weirs with λ < 1 did not strikingly change the hydraulic behavior of the downstream flow (Bagheri & Kabiri-Samani 2020a). Besides, for streamlined weirs with small Q and λ < 1, the flow passed over a virtual surface into the downstream and an air pocket was trapped below the lower nappe of the flow profile (Bagheri & Kabiri-Samani 2020a). As Q rose, the air pocket was eliminated and the weir base was progressively submerged. A rotational flow region developed near the weir base-block and, consequently, a non-turbulent surface was generated. Interested readers can also refer to the previous study by Bagheri & Kabiri-Samani (2020a) for more details about the experimental results.
Validation of the RF model
Here, after several trials with different numbers of trees and of variables at each node, K = 500 was found to give comparatively better results. The values of the statistical indices for Cdstw in the calibration and validation stages under the optimal scenario of the RF model are given in Table 3.
Stages | R2 (–) | RMSE (–) | MAE (–) | MBE (–) | CC (–)
---|---|---|---|---|---
Calibration | 0.98 | 0.0175 | 0.0147 | 381 × 10−6 | 0.95
Validation | 0.96 | 0.0234 | 0.0192 | 0.0016 | 0.93
The positive value of MBE signifies that the model overestimates the corresponding observed values. It can also be inferred that the RF model estimates Cdstw with high precision in both the calibration and validation stages, although it is slightly more accurate in the calibration stage than in the validation stage.
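The metrics reported in these tables can be computed as in the sketch below, using the nomenclature of this paper (xi observed, yi predicted). The exact forms are assumptions based on common definitions: R2 is taken as the squared Pearson correlation, MBE as the mean of (yi − xi), so that a positive value indicates overestimation, and CC as Lin's concordance correlation coefficient. The sample values are hypothetical.

```python
import math

def evaluation_metrics(observed, predicted):
    """Return R2, RMSE, MAE, MBE, and Lin's CC for paired series."""
    n = len(observed)
    mu_x = sum(observed) / n
    mu_y = sum(predicted) / n
    var_x = sum((x - mu_x) ** 2 for x in observed) / n
    var_y = sum((y - mu_y) ** 2 for y in predicted) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(observed, predicted)) / n
    errors = [y - x for x, y in zip(observed, predicted)]  # predicted minus observed
    return {
        "R2": cov ** 2 / (var_x * var_y),
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
        "CC": 2.0 * cov / (var_x + var_y + (mu_x - mu_y) ** 2),
    }

# Hypothetical values: every prediction is 0.02 above the observation.
observed = [1.00, 1.10, 1.20, 1.30]
predicted = [1.02, 1.12, 1.22, 1.32]
m = evaluation_metrics(observed, predicted)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

In this example the correlation is perfect, so R2 is 1, while the constant offset shows up as a positive MBE and as a CC slightly below 1, which is why the bias-sensitive metrics are reported alongside R2.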
Validation of the ANFIS model
Stages | R2 (–) | RMSE (–) | MAE (–) | MBE (–) | CC (–)
---|---|---|---|---|---
Calibration | 0.91 | 0.0306 | 0.0259 | 1.19 × 10−5 | 0.95
Validation | 0.79 | 0.0688 | 0.0544 | 0.0286 | 0.94
Validation of the GEP model
The parameter values and genetic operator rates of GEP used to estimate Cdstw under the optimal scenario are presented in Table 5. These parameters are the standard settings of GEP and have a perceptible influence on its ability.
Parameter | Value
---|---
Number of genes | 3
Mutation | 0.044
Inversion | 0.1
Insertion sequence (IS) transposition | 0.1
Root insertion sequence (RIS) transposition | 0.1
Gene transposition | 0.1
One-point recombination | 0.3
Two-point recombination | 0.3
Gene recombination | 0.1
The values of the statistical metrics obtained by GEP for the examined scenarios in the validation stage are given in Table 6, wherein NC specifies the number of chromosomes, LF the kind of linking function, and HS the head size, and the bold entries signify the attributes of the optimal scenario.
LF | NC | HS | RMSE (–) | R2 (–) | CC (–) | MAE (–)
---|---|---|---|---|---|---
Addition | 30 | 8 | 0.18 | 0.55 | 0.74 | 0.17
Addition | 33 | 7 | 0.069 | 0.96 | 0.98 | 0.056
Addition | 35 | 6 | 0.036 | 0.92 | 0.96 | 0.029
Subtraction | 30 | 8 | 0.052 | 0.91 | 0.95 | 0.044
Subtraction | 33 | 7 | 0.13 | 0.46 | 0.68 | 0.107
Subtraction | 35 | 6 | 0.032 | 0.93 | 0.96 | 0.025
Multiplication | 30 | 8 | 0.033 | 0.91 | 0.95 | 0.024
Multiplication | 33 | 7 | 0.04 | 0.91 | 0.95 | 0.031
**Multiplication** | **35** | **6** | **0.032** | **0.97** | **0.96** | **0.024**
Division | 30 | 8 | 0.051 | 0.88 | 0.93 | 0.04
Division | 33 | 7 | 0.04 | 0.91 | 0.95 | 0.033
Division | 35 | 6 | 0.039 | 0.97 | 0.98 | 0.032
PERFORMANCE COMPARISON OF THE METHODS DEVELOPED
TG obtained by Equations (19)–(20) is presented in Table 7.
Model | R2 | CC | PI | RMSE | SG | FG | TG
---|---|---|---|---|---|---|---
RF | 0.96 | 0.95 | 1.09 | 0.0175 | 19.79 | −0.492 | 19.30
ANFIS | 0.79 | 0.94 | 1.024 | 0.068 | 17.94 | −10.76 | 7.17
GEP | 0.97 | 0.96 | 1.07 | 0.032 | 20 | 0 | 20
According to the values of TG in Table 7, the GEP model attains the highest grade (TG = 20) and is therefore selected as the superior approach for the prediction of Cdstw. The RF model is the second best (TG = 19.30), followed by ANFIS (TG = 7.17). The GEP model thus outperforms the other two models and is considered the most accurate method.
CONCLUSION
In the present research, experimental data of streamlined weirs with β values of 0°, 30°, and 60° from the study of Bagheri & Kabiri-Samani (2020a) were employed. The experiments were performed on large physical models under steady, aerated, and free overflow conditions in an open channel. As an alternative to the CFD technique for forecasting Cdstw, three different DDMs, namely the RF, ANFIS, and GEP methods, were developed and their potential was examined under diverse geometric and hydraulic conditions. The main findings of the present study are as follows:
Based on the experimental results at β= 0, lowering λ led to a decrease in the weir height and Y1, but an increase in Y3, Y2, and the ratio Y2/h1. Moreover, at β= 30° and 60° (base-block), increasing the weir elevation for λ= 1 considerably increased the disturbance of the flow downstream of the streamlined weir, whereas for λ < 1 it did not noticeably change the hydraulic condition of the flow downstream of the weir.
Using the Buckingham π-theorem and cosine amplitude (Rij) analyses as preprocessing methods confirmed that h1/Ww, bw/Lw, β, and λ have a significant impact on Cdstw; consequently, they were considered as input variables in estimating Cdstw by the developed DDMs, with bw/Lw being the most significant one.
The performances of the three employed models were evaluated using statistical metrics and a model scoring procedure. In line with the model grading values, the GEP model was confirmed as the superior and most precise technique to compute Cdstw, with RMSE = 0.032, MAE = 0.024, R2 = 0.97, and CC = 0.96.
Although the current investigation assessed the ability of standalone AI methods for predicting Cdstw, future studies can be extended to other kinds of MLMs and IOMs via hybridizing approaches. The application of surrogate modeling methods that have been successful in other fields of engineering, such as polynomial chaos expansion and Kriging (Amini et al. 2021; Hariri-Ardebili et al. 2021), can also be investigated in future work. The results can be compared with those of the current study so that the best method can be identified. Likewise, although all effective variables on Cdstw were scrutinized in the current research, its outcomes cannot be extended to other structures.
ACKNOWLEDGEMENTS
We are grateful to the Research Council of Shahid Chamran University of Ahvaz for financial support (GN: SCU.WH1401.7209).
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.