This paper discusses the application and comparison of machine learning algorithms, namely the gradient boosting machine (GBM), neural network (NN), and deep neural network (DNN), in estimating the oxygen aeration performance efficiency (OAPE20) of gabion spillways. In addition, traditional equations, namely the developed multivariable linear regression (MLR) and multivariable nonlinear regression (MNLR), along with previously published models, were employed in estimating the OAPE20 of the gabion spillways. Results in the testing phase showed that the DNN, with the highest correlation coefficient (CC = 0.9713) and the lowest errors (root mean square error (RMSE) = 0.1684, mean squared error (MSE) = 0.0283, and mean absolute error (MAE) = 0.1532), gave the best estimates of the OAPE20 of the gabion spillways. The other applied models (GBM, NN, MLR, and MNLR) gave comparable results on the statistical appraisal metrics, whereas the models from previous studies performed very poorly, with the lowest correlation and the highest errors. The datasets employed here were collected by conducting experiments. From the relative significance of the input parameters, the Reynolds number (Re) was observed to be the most influential parameter, while the ratio of the mean size of the gabion materials to the length of the gabion spillway (d50/L) had the least impact on the OAPE20 of the gabion spillways.

  • The test for the aeration performance efficiency of gabion spillways was studied.

  • Machine learning techniques were used for estimating the gabion spillway aeration efficiency.

  • The estimating potential of DNN, GBM, NN, etc., was compared.

  • The DNN model outperformed the other proposed models.

  • A sensitivity test was conducted to determine the relative impact of the input variables on the output results.

Graphical Abstract


Dissolved oxygen (DO) concentration is essential for determining the health of water bodies. The oxygen content of water is depleted by naturally occurring biological and chemical processes, which puts more stress on aquatic life in bodies of water and reduces DO levels (Baylar & Emiroglu 2004). Reaeration increases the oxygen in the water by drawing air from the surrounding atmosphere. However, for the drop structures like weirs, aeration is connected to form, roughness, plunging velocity, and geometry (Luxmi et al. 2022). By altering the flow using fluidic structures such as hydraulic jumps and hydraulic drops, it is possible to improve aeration.

Hydraulic structures enhance the amount of DO in a stream by self-aeration, even though the water is in contact with the structure for only a fraction of the time (Baylar et al. 2011). The primary cause of this faster oxygen transfer is the entrainment of air into the flow in the form of an enormous number of tiny bubbles. These micro air bubbles increase the surface area available for oxygen transfer, facilitating greater oxygen exchange (Baylar & Bagatur 2000). Gabion spillways are widely employed in earthen dams, soil conservation works, retaining structures, river-training works at bends, and other projects. Gabions are stable, flexible, and simple to construct without losing structural integrity.

Furthermore, they can withstand considerable differential settlements, if any. Where suitable fill materials are locally available in large quantities, gabion structures are a cost-effective solution. A gabion is made up of a porous mass of coarse materials of various sizes and shapes enclosed in a grid of metal wires. Its porosity assists in water drainage, lowering the water load behind the structure. It is also possible to design and use stepped spillways of the gabion type.

Additionally, the flow over the steps cascades down the spillway face, causing aeration (Salmasi et al. 2012). Chanson (2002) defined two types of flow over stepped spillways: (a) aerated flow and (b) non-aerated flow. Gabions can reduce water pressure by preserving the permeability and flexibility of the rock fill (Aal et al. 2019). Zhang & Chanson (2016) classified flow over stepped spillways into three hydraulic regimes: skimming flow, nappe flow, and transition flow. Wuthrich & Chanson (2015) concluded that a stepped weir gives good results compared to a flat one. Wormleaton & Soufiani (1998) studied the oxygen aeration potential of triangular and rectangular labyrinth weirs. In the Sahel region of Africa, gabion spillways are the most common type of spillway (Peyras et al. 1992). Kells (1994) addressed how discharge over the crest and critical depth relate to energy dissipation over gabion-stepped weirs. Chinnarasri et al. (2008) performed experimental research on the sensitivity of hydraulic performance and the properties of the material contained in gabions. The gabion spillway also produces turbulence, enhancing aeration and enabling the aerobic decomposition of organic materials. It also helps aquatic life to move and migrate more easily (Luxmi et al. 2022). Soft computing data-driven techniques are applied to simulate the effectiveness of oxygen aeration at barrages, spillways, gabion spillways, Parshall flumes, Montana flumes, etc. Soft computing approaches are widely utilized since they are available, have built-in intelligence, and are dependable. In addition, they lessen the scale effect and avoid the need to build physical models. In the fields of hydraulic engineering and water resources, a wide variety of soft computing approaches have been used. Researchers have recently expressed interest in applying machine learning (ML)/soft computing approaches to forecast the aeration effectiveness of gabion weirs with and without steps (Luxmi et al. 2022; Srinivas & Tiwari 2022; Verma et al. 2022). Baylar et al. (2011) also studied the prediction of oxygen transfer efficiency in aeration stepped cascades using gene expression programming (GEP) modeling. Aeration efficiency at labyrinth weirs was investigated experimentally and through modeling (Singh et al. 2021). The work of Kumar et al. (2022) involves simulating the oxygen transfer capability of plunging jets using fundamental flow patterns or flow characteristics and forecasting the volumetric oxygen transfer coefficient using modeling approaches such as artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), multivariate adaptive regression splines (MARS), MNLR, and generalised regression neural network (GRNN). Kumar et al. (2022) aimed to investigate the impact of jet velocity, jet length, water depth, and jet thickness of plunging hollow jets in oxygenating the water in the aeration tank. Empirical correlations were proposed for the determination of the volumetric oxygen transfer coefficient, KLa, based on jet velocities and jet kinetic powers.

Theoretical background

Oxygen transfer process

A small amount of water that goes through a hydraulic device (fluidic device) changes its oxygen intensity over time as it travels through the structure (Lewis & Whitman 1924).
$$\frac{dI}{dt} = K_L \frac{A}{V}\,(I_S - I) \tag{1}$$
where KL is the coefficient of bulk liquid film for oxygen, IS is the saturation intensity of DO in water, I is the intensity of DO, A is the area of surface associated with the volume V, over which transfer occurs, and t is the time.
By considering IS as constant, the oxygen aeration efficiency is
$$\mathrm{OAPE} = \frac{I_d - I_u}{I_S - I_u} \tag{2}$$

OAPE is the oxygen aeration performance efficiency, Id is the intensity of DO downstream of the fluidic device, Iu is the intensity of DO upstream of the fluidic device, and IS is the saturation intensity of DO. When the downstream water is supersaturated (IS < Id), OAPE > 1. OAPE = 1 corresponds to full oxygen transfer up to the saturation value, and OAPE = 0 represents no oxygen transfer.

The oxygen transfer efficiency is usually normalized to 20 °C standards for comparing results consistently. Gulliver et al. (1990) presented an equation to illustrate the impact of temperature as
$$1 - \mathrm{OAPE}_{20} = (1 - \mathrm{OAPE})^{1/f} \tag{3}$$
The OAPE represents the OAPE at actual water temperature, the OAPE20 represents the OAPE at 20 °C, and f represents the exponent of the gain described subsequently.
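$$f = 1.0 + 0.02103\,(T - 20) + 8.261 \times 10^{-5}\,(T - 20)^2 \tag{4}$$
where T is the water temperature in °C.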
For hydraulic construction, the overall oxygen transfer may be calibrated by the deficit ratio, ‘r’, which Markofsky & Kobus (1978) described as
$$r = \frac{I_S - I_u}{I_S - I_d} \tag{5}$$
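The oxygen transfer quantities of this section can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard forms of Equations (2) to (5) as commonly quoted from Gulliver et al. (1990) and Markofsky & Kobus (1978); the function names and the sample DO values in the usage note are hypothetical.

```python
def temperature_exponent(temp_c: float) -> float:
    """Exponent f of Equation (4); coefficients assumed from the form
    usually attributed to Gulliver et al. (1990)."""
    dt = temp_c - 20.0
    return 1.0 + 0.02103 * dt + 8.261e-5 * dt ** 2


def oape(i_up: float, i_down: float, i_sat: float) -> float:
    """Aeration efficiency at the actual water temperature, Equation (2)."""
    return (i_down - i_up) / (i_sat - i_up)


def oape20(oape_t: float, temp_c: float) -> float:
    """Efficiency normalized to 20 degrees C, Equation (3)."""
    return 1.0 - (1.0 - oape_t) ** (1.0 / temperature_exponent(temp_c))


def deficit_ratio(i_up: float, i_down: float, i_sat: float) -> float:
    """Deficit ratio r, Equation (5)."""
    return (i_sat - i_up) / (i_sat - i_down)
```

For example, with hypothetical DO values of 2 mg/L upstream, 5 mg/L downstream, and 8 mg/L at saturation, `oape(2.0, 5.0, 8.0)` gives 0.5 and `deficit_ratio(2.0, 5.0, 8.0)` gives 2.0, consistent with OAPE = 1 − 1/r.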

Importance, aim, objectives, and novelty

The present study aims to forecast and examine the usefulness of gabion spillways by estimating the oxygen aeration performance efficiency (OAPE20) at the air-water interface for flow over the spillway and seepage through the pervious spillway body. The novelty of the present work has several dimensions. First, the OAPE20 was evaluated through experimental tests with varying dimensionless parameters, namely the Reynolds number (Re), the Froude number (Fr), the ratio of the mean gabion particle size to the spillway length (d50/L), and the porosity (n) of the gabion materials. Second, the OAPE20 was estimated and compared using ML or soft computing models, namely the deep neural network (DNN), neural network (NN), and gradient boosting machine (GBM), utilizing the observed datasets. Third, these estimated values of the OAPE20 were further compared with those of the developed multivariable linear regression (MLR) and multivariable nonlinear regression (MNLR) and of previously proposed relations. Finally, a sensitivity study was carried out to determine the relative impact of the input parameters on the OAPE20.

Proposed modeling techniques

The present study used experimental data to estimate the aeration performance of gabion spillways using traditional approaches (MLR and MNLR), existing empirical relations, and ML techniques (NN, GBM, and DNN).

Traditional models

Two regression models were employed in the current investigation to estimate the oxygen aeration performance efficiency (OAPE20). One version of the regression equation uses MLR and the other uses MNLR.

The relationship between a secondary variable (X) and primary variables (r1, r2, r3, …) was established using an MLR model or equation. The MLR model's attributes were organized as follows:
(6)

in which p1, p2, p3, … are proportionality constants.

The relation using MLR was found to be as follows:
(7)
The relation between a secondary variable (X) and primary variables (Y1, Y2, Y3, …) was developed by using an MNLR, which handles multiple predictor variables. The layout of the MNLR model was
$$X = p\,Y_1^{K_1}\,Y_2^{K_2} \cdots Y_n^{K_n} \tag{8}$$
where X was the secondary variable, considered as the output variable; p was the proportionality constant; Y1, Y2, …, Yn were the primary variables selected as input parameters; and K1, K2, K3, …, Kn were the exponents. The relation was found through MNLR as follows:
(9)
in which OAPE20 is the oxygen aeration performance efficiency at 20 °C, Re is the Reynolds number, Fr is the Froude number, d50 is the mean size of the materials used in the gabion spillway (in cm), and n is the porosity (in %). Table 1 lists the models derived in the present study together with the existing models.
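The two regression forms described above can be fitted with ordinary least squares, as in the following sketch. It is an illustration only, not the XLSTAT procedure used in the study: the intercept term `p0` in the linear model, the log-linearization of the power-law model, and the synthetic demo data are all assumptions introduced here.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares fit of y = p0 + p1*x1 + p2*x2 + ...; returns [p0, p1, ...].
    The intercept p0 is an assumption of this sketch."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def fit_mnlr(X, y):
    """Fit y = p * x1**K1 * x2**K2 * ... by linear least squares on logarithms.
    Requires strictly positive inputs and outputs."""
    A = np.column_stack([np.ones(len(X)), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
    return np.exp(coef[0]), coef[1:]


# Synthetic, noise-free demo data (hypothetical, not the experimental dataset).
rng = np.random.default_rng(0)
X_demo = rng.uniform(1.0, 5.0, size=(50, 2))
y_lin = 0.5 + 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1]
y_pow = 3.0 * X_demo[:, 0] ** 0.5 * X_demo[:, 1] ** -0.25

mlr_coef = fit_mlr(X_demo, y_lin)       # recovers the linear coefficients
p_hat, k_hat = fit_mnlr(X_demo, y_pow)  # recovers p and the exponents
```

Fitting the power-law model in log space minimizes relative rather than absolute error, so the constants it returns can differ from those of a direct nonlinear fit on noisy data.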
Table 1

Proposed conventional models of various hydraulic devices used for the OAPE20

| Sr. No. | Model origin | Model | Comments |
|---|---|---|---|
| 1 | Tiwari (2021) | | Oxygen aeration efficiency for hydraulic jump |
| 2 | Luxmi et al. (2022) | | Oxygen aeration efficiency for gabion weir |
| 3 | MLR | Equation (7) | Present study |
| 4 | MNLR | Equation (9) | Present study |

Gradient boosting machine

Freund & Schapire (1997) state that the ML community first developed boosting algorithms. In GBM learning, weak learners are combined to create a strong one. Each time a weak model is added, a new model is fitted to give a more precise estimate of the response variable. The new weak learners are constructed to be maximally correlated with the negative gradient of the loss function associated with the entire ensemble. The GBM's primary goal is to produce a better prediction model by combining several relatively weak models. The structure of GBM is shown in Figure 1(a). The model predicts values of the form $\hat{y} = F(x)$ by minimizing the mean squared error (MSE) given by Equation (10):
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2 \tag{10}$$
where $\hat{y}_i$ is the estimated value, $y_i$ is the observed value, i indexes the data, and n is the number of data in y.
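The idea of adding weak learners that follow the negative gradient of the squared-error loss can be sketched with one-split regression stumps. This is a didactic sketch under stated assumptions, not the H2O implementation used in the study; the learning rate, the number of trees, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_stump(x, r):
    """Best single-split regression stump on 1-D input x for residuals r."""
    best = None
    for t in np.unique(x)[:-1]:  # skip the largest value so both sides are non-empty
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, t, left_value, right_value = best
    return t, left_value, right_value


def predict_stump(stump, x):
    t, left_value, right_value = stump
    return np.where(x <= t, left_value, right_value)


def fit_gbm(x, y, n_trees=50, learn_rate=0.1):
    """Boosting loop: each stump is fitted to the current residuals, which are
    the negative gradient of 0.5 * (y - prediction)**2."""
    f0 = y.mean()
    pred = np.full(y.shape, f0, dtype=float)
    stumps, mse_path = [], []
    for _ in range(n_trees):
        residual = y - pred
        stump = fit_stump(x, residual)
        pred = pred + learn_rate * predict_stump(stump, x)
        stumps.append(stump)
        mse_path.append(float(np.mean((pred - y) ** 2)))
    return f0, stumps, mse_path


# Synthetic demo data (hypothetical).
rng = np.random.default_rng(1)
x_demo = rng.uniform(0.0, 1.0, 80)
y_demo = np.sin(2.0 * np.pi * x_demo)
f0, stumps, mse_path = fit_gbm(x_demo, y_demo)
```

The training MSE recorded in `mse_path` never increases from one tree to the next, which is the behavior a scoring curve plotted against the number of trees illustrates.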
Figure 1

Structures of the (a) GBM, (b) NN and (c) DNN.


NN and DNN

Machine learning includes the NN and the DNN. Both are made up of neurons that resemble those in the nervous system, and each neuron is assigned weights and a bias. These neural networks are built to function similarly to neurons in the human brain (Fischer 1998). A brain neuron works by taking in information and then generating an output utilized by another cell. NN training works on a similar pattern: the network simulates behavior by learning from the collected data and predicting outcomes (Nigrin 1993).

The NN employs a massive number of highly interconnected nodes (neurons) that work to solve specific problems, such as forecasting and pattern classification (Bishop 1995). The NN is widely used to solve water resource problems and is a very popular soft computing technique. Both the NN and the DNN include an input layer, hidden layers, and an output layer. The basic difference is that the DNN has several hidden layers and relatively more nodes than the NN. A conventional NN has only one hidden layer, whereas the DNN has more than one, which is why it is called a deep neural network. In the current study, the NN model was developed using MATLAB software, while the DNN model was developed using H2O software. The structures of the NN and the DNN are shown in Figure 1(b) and 1(c), respectively.
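The structural difference between the two networks can be illustrated with a plain forward pass. The sketch below is hypothetical: it assumes four inputs (Re, Fr, d50/L, and n), a tanh activation, random untrained weights, and a hidden width of 8 for the NN; only the three hidden layers of 100 nodes for the DNN follow Table 7.

```python
import numpy as np


def init_layers(sizes, seed=0):
    """Random weights and zero biases for consecutive layer sizes."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def forward(layers, x):
    """Hidden layers use tanh; the output layer is linear (regression)."""
    a = x
    for w, b in layers[:-1]:
        a = np.tanh(a @ w + b)
    w, b = layers[-1]
    return a @ w + b


nn_layers = init_layers([4, 8, 1])                # one hidden layer
dnn_layers = init_layers([4, 100, 100, 100, 1])   # three hidden layers

x_demo = np.ones((5, 4))  # five hypothetical samples with four inputs each
nn_out = forward(nn_layers, x_demo)
dnn_out = forward(dnn_layers, x_demo)
```

Both networks map the same inputs to one output per sample; only the depth and width of the hidden part change.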

Experimental program

The experiments were carried out in the hydraulic laboratory of the civil engineering department at the National Institute of Technology, Kurukshetra, Haryana, India. A rectangular rigid steel flume with a width of 25 cm, a height of 30 cm, and a span of 4 m was used. Transparent acrylic sheets of 1.8 m length were fitted in the middle of the flume on both side walls. A schematized view of the experimental setup is shown in Figure 2. A 2-HP motor pump connected to the channel had a maximum flow rate of 5.2 L/s. The flume was designed as a re-circulating closed system that continuously replenishes the channel by drawing water from a rectangular storage cum aeration tank 87 cm long, 87 cm wide, and 90 cm deep. The flow rate was measured with a digital flow meter. The model was installed at the downstream end of the channel. The water depth in the channel was measured with a pointer gauge. Cobalt chloride and sodium sulfite were combined in calibrated amounts to keep the DO content of the water between 1 and 2 mg/L (Emiroglu et al. 2003; Tiwari & Sihag 2020; Tiwari 2021). From the chemical mixing tank, water entered the channel through the headbox, which contained a screen that dampened any eddies at the entry. The water inflow was controlled by a regulator fitted in the pipe, as shown in Figure 2.
Figure 2

A schematic view of the test setup.


The water temperature was measured manually using a thermometer with an accuracy of 0.5 °C in the range of −10 to 50 °C, and the DO was measured using the azide-modification method (Winkler method; Raikar & Kamatagi 2015). The configuration and dimensions of the gabion spillway models are shown in Table 2. In each experiment, a gabion spillway model with the appropriate fixing arrangement was installed at the downstream end of the channel. Each gabion spillway was tested under flow rates ranging from 0.5 to 5.2 L/s. The porosity (n), Reynolds number (Re), Froude number (Fr), and the ratio of the gabion mean size to the length of the gabion spillway (d50/L) were calculated from the experimental datasets.

Methodology

Water was mixed with sodium sulfite (Na2SO3) and cobalt chloride (CoCl2) in the proper amounts to lower the oxygen level to between 1 and 2 mg/L, and the DO level was then determined in each run. Water samples were analyzed using the azide-modification method after the flume had run for 85 s (Kumar et al. 2021). Care was also taken to prevent the water from coming into contact with air, except at the models, where air mixing happens due to gravity.

The performance of the proposed models was analyzed using statistical measures, i.e., the correlation coefficient (CC), root mean square error (RMSE), MSE, and mean absolute error (MAE).

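CC: The CC can be expressed as:
$$\mathrm{CC} = \frac{N \sum_{i=1}^{N} O_i E_i - \sum_{i=1}^{N} O_i \sum_{i=1}^{N} E_i}{\sqrt{N \sum_{i=1}^{N} O_i^2 - \left(\sum_{i=1}^{N} O_i\right)^2}\;\sqrt{N \sum_{i=1}^{N} E_i^2 - \left(\sum_{i=1}^{N} E_i\right)^2}} \tag{11}$$
RMSE: The RMSE can be expressed as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (O_i - E_i)^2} \tag{12}$$
MSE: The MSE can be expressed as:
$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (O_i - E_i)^2 \tag{13}$$
MAE: The MAE can be expressed as:
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| O_i - E_i \right| \tag{14}$$
where $O_i$ is the observed oxygen aeration performance efficiency at 20 °C, $E_i$ is the estimated oxygen aeration performance efficiency at 20 °C, and N is the total number of observations.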
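The four appraisal metrics can be computed together, as in the sketch below. It assumes the usual definitions of the Pearson correlation coefficient, RMSE, MSE, and MAE; the function name and the demo values are hypothetical and are not taken from the experimental dataset.

```python
import numpy as np


def evaluate(observed, estimated):
    """Return CC, RMSE, MSE, and MAE for paired observed/estimated values."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(estimated, dtype=float)
    n = o.size
    cc_num = n * np.sum(o * e) - o.sum() * e.sum()
    cc_den = (np.sqrt(n * np.sum(o ** 2) - o.sum() ** 2)
              * np.sqrt(n * np.sum(e ** 2) - e.sum() ** 2))
    mse = float(np.mean((o - e) ** 2))
    return {
        "CC": float(cc_num / cc_den),
        "RMSE": float(np.sqrt(mse)),
        "MSE": mse,
        "MAE": float(np.mean(np.abs(o - e))),
    }


# Hypothetical demo values.
demo = evaluate([0.1, 0.2, 0.3, 0.4], [0.1, 0.25, 0.3, 0.35])
```

By these definitions the MSE is always the square of the RMSE, which gives a quick consistency check on any reported pair of values.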
Table 2

Description of models

| Parameters | Notation | Value | Range (from) | Range (to) | Units |
|---|---|---|---|---|---|
| Gabion spillway height | P | 20 | 20 | 20 | cm |
| Gabion spillway width | B | 40 | 40 | 40 | cm |
| Gabion spillway length | L | 25 | 25 | 25 | cm |
| Gabion mean size | d50 | 26.4, 29.4, and 49.20 | 26.4 | 49.20 | mm |
| Porosity | n | 24, 44, and 57 | 24 | 57 | % |

Datasets

A total of 161 experimental readings were utilized for model development. The input data comprise the Reynolds number (Re), mean size of the gabion materials (d50), Froude number (Fr), the ratio of the mean size of the gabion materials to the length of the gabion spillway (d50/L), and porosity (n), and the output was the oxygen aeration performance efficiency (OAPE20). For training, 75% of the total data were taken, and the remaining 25% were utilized for testing. The statistical summary of the training and testing data is shown in Table 3.
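A 75/25 partition and the summary statistics reported for each variable can be sketched as follows. The random shuffling, the seed, and the population (moment-based) formulas for skewness and excess kurtosis are assumptions of this sketch; spreadsheet packages often use sample-adjusted formulas that give slightly different values.

```python
import numpy as np


def split_75_25(n_rows, seed=0):
    """Shuffle row indices and return (train_idx, test_idx) in a 75/25 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_rows)
    n_train = int(round(0.75 * n_rows))
    return idx[:n_train], idx[n_train:]


def summarize(values):
    """Min, max, mean, sample standard deviation, excess kurtosis, skewness."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    z = (v - mean) / v.std(ddof=0)
    return {
        "min": float(v.min()),
        "max": float(v.max()),
        "mean": float(mean),
        "std": float(v.std(ddof=1)),
        "kurtosis": float(np.mean(z ** 4) - 3.0),
        "skewness": float(np.mean(z ** 3)),
    }


train_idx, test_idx = split_75_25(161)
stats = summarize([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical values
```

With 161 readings, rounding 75% gives 121 training rows and 40 testing rows.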

Table 3

Statistical summary for training and test data

| Dataset | Variable | Min | Max | Mean | Std. | Kurtosis | Skewness |
|---|---|---|---|---|---|---|---|
| Train | Re | 2000 | 20,400 | 12836.67 | 5869.848 | −1.1799 | −0.4612 |
| Train | Fr | 7.504 | 22.686 | 11.112 | 3.470 | 2.5780 | 1.6542 |
| Train | d50/L | 0.04 | 0.314 | 0.138 | 0.0725 | 0.1778 | 0.7393 |
| Train | n | 24 | 57 | 41.368 | 8.084 | 0.1666 | 0.2725 |
| Train | OAPE20 | 0.013 | 0.559 | 0.341 | 0.116 | 0.0513 | −0.6163 |
| Test | Re | 3200 | 20,400 | 13,380 | 6209.96 | −1.3793 | −0.4666 |
| Test | Fr | 7.595 | 3200 | 10.583 | 2.514 | −0.5271 | 0.7879 |
| Test | d50/L | 0.04 | 0.314 | 0.1266 | 0.0684 | 1.0998 | 1.0918 |
| Test | n | 24 | 57 | 39.821 | 7.269 | 1.0839 | 0.1998 |
| Test | OAPE20 | 0.039 | 0.584 | 0.321 | 0.154 | −0.9897 | −0.2886 |

The dataset was obtained from experimental observations. The proposed traditional methods (MLR and MNLR), existing predictive equations, and soft computing or ML techniques (GBM, NN, and DNN) were used as modeling techniques.

The GBM model

The GBM model was developed using the free auto-ML H2O software. Many models were developed by changing the split between training (calibrating) data and testing (validating) data. A total of 161 data points were used for modeling. Finally, a split of 75% training data (124) and 25% testing data (37) was found suitable for the best model. A new model was fitted each time a weak learner was added, to give a more precise estimate of the response variable. The new weak learners were constructed to be maximally correlated with the negative gradient of the loss function associated with the entire ensemble. The GBM's primary goal was to produce a better prediction model by combining several relatively weak prediction models. These steps were essential in estimating accurate values at minimum computational cost. The training (calibrating) dataset was split into five folds, each comprising 25 data points, whereas the testing (validating) data were likewise split into five folds, each comprising seven data points. The principal parameters were tuned by trial and error, as shown in Table 4.
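The five-fold partition described above can be illustrated with a deterministic assignment. The sketch assumes that the "Modulo" fold assignment means the row index modulo the number of folds; the function names are hypothetical and the code is plain Python, not a call into H2O.

```python
def modulo_folds(n_rows, n_folds=5):
    """Assign row i to fold i % n_folds."""
    return [i % n_folds for i in range(n_rows)]


def fold_sizes(assignment, n_folds=5):
    """Number of rows that landed in each fold."""
    return [assignment.count(k) for k in range(n_folds)]


def cv_splits(n_rows, n_folds=5):
    """Yield (train_rows, validation_rows) index lists, one pair per fold."""
    folds = modulo_folds(n_rows, n_folds)
    for k in range(n_folds):
        train = [i for i, f in enumerate(folds) if f != k]
        valid = [i for i, f in enumerate(folds) if f == k]
        yield train, valid


sizes = fold_sizes(modulo_folds(124))
splits = list(cv_splits(124))
```

For 124 training rows this gives four folds of 25 rows and one of 24, so each model in the cross-validation is fitted on roughly 99 rows and validated on the remaining fold.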

Table 4

Values/type of optimized model parameters used in the GBM

| Sr. No. | GBM parameter | Value/type |
|---|---|---|
| 1 | N folds | 5 |
| 2 | Fold assignment | Modulo |
| 3 | ntrees | 98 |
| 4 | Max depth | |
| 5 | Distribution | Gaussian |
| 6 | Categorical encoding | Enum |
| 7 | Column sample rate | 0.4 |
| 8 | Row sample per tree | 0.6 |

Figure 3(a) shows the variation in the deviance for the training and testing datasets versus the number of trees (ntrees). As ntrees approaches 98, the deviance becomes asymptotic to the abscissa (x-axis). Figure 3(b) shows the scatter plot of observed versus predicted OAPE20 by the GBM for the training and testing datasets. From Figure 3(b), it is clear that the estimated values, in both testing and training, lie along the line of perfect agreement, which implies that the GBM model performed well. Table 5 further supports this: the GBM had a high CC and small error values and could therefore be used to estimate the OAPE20 of the gabion spillway.
Table 5

Performance evaluation with proposed techniques

| Dataset | Proposed approach | CC | RMSE | MSE | MAE |
|---|---|---|---|---|---|
| Train | Tiwari (2021) | 0.7599 | 0.9157 | 0.8386 | 0.9139 |
| Train | Luxmi et al. (2022) | 0.5147 | 0.7241 | 0.5243 | 0.7065 |
| Train | MLR | 0.8616 | 0.2228 | 0.0496 | 0.2093 |
| Train | MNLR | 0.8370 | 0.4649 | 0.2162 | 0.0588 |
| Train | NN | 0.9155 | 0.1753 | 0.0307 | 0.1559 |
| Train | GBM | 0.9833 | 0.1383 | 0.0191 | 0.1258 |
| Train | DNN | 0.9744 | 0.1430 | 0.02040 | 0.1297 |
| Test | Tiwari (2021) | 0.9054 | 0.9270 | 0.86111 | 0.1460 |
| Test | Luxmi et al. (2022) | 0.4477 | 0.7160 | 0.5127 | 0.6962 |
| Test | MLR | 0.9253 | 0.1197 | 0.01433 | 0.0189 |
| Test | MNLR | 0.9045 | 0.3106 | 0.06953 | 0.2624 |
| Test | NN | 0.9368 | 0.1867 | 0.0348 | 0.1639 |
| Test | GBM | 0.93167 | 0.1850 | 0.0342 | 0.1665 |
| Test | DNN | 0.9713 | 0.1684 | 0.0283 | 0.1532 |
Figure 3

(a) Scoring deviance and (b) performance of the GBM.


The NN model

In the present study, the NN approach was used to estimate the oxygen aeration performance efficiency (OAPE20). The NN consists of multiple layers, and every layer has nodes (neurons). The layers are joined by weighted connections (weight coefficients). Usually, three categories of layer are formed in an artificial NN: the input layer, which receives the input signals; the hidden (middle) layer, which evaluates the input weights; and the output layer, which is the final layer. The NN was established in three stages: in the first stage, the training data were prepared; the second stage involved arranging and assembling effective network architectures; and the final stage was testing. The number of neurons and hidden layers were selected by trial and error to give the desired results, and the optimal principal parameters of the NN established in this way are shown in Table 6. Figure 4 shows the scatter plot of the observed OAPE20 against the values estimated by the NN for the training and testing datasets. Barring some estimated points for the testing dataset, all estimated points lie near the line of perfect agreement. Table 5 provides further evidence that the NN performed well and could be used in estimating the OAPE20 of the gabion spillway, as its CC was high and its error values were low.
Table 6

Types of the model parameter used in the NN

| Sr. No. | NN parameter | Value/type |
|---|---|---|
| 1 | No. of nodes | |
| 2 | Epochs | 245 |
| 3 | Hidden layer | |
| 4 | Model type | Bayesian regularization (trainbr) |
Figure 4

Performance of the NN.


The DNN model

The DNN model was also developed using the free auto-ML H2O software. Several models were created by changing the split between training and testing data. For modeling, a total of 161 data points were employed. Finally, a split of 25% testing data (40) and 75% training data (121) was considered appropriate for the best model. In a DNN, the initial stage was to find the number of epochs, which was essential in forecasting accurate values at minimum computational cost. The training dataset was split into five folds, each comprising 25 data points, whereas the testing data were split into five folds, each comprising eight data points. The key parameters were tuned by trial and error, as shown in Table 7.
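The "Rectifier with dropout" activation listed among the DNN parameters can be illustrated as a rectified linear unit followed by random masking of hidden units during training. The sketch below uses the common inverted-dropout variant, which is an assumption and may differ from how H2O scales activations internally; the drop ratio of 0.5 is hypothetical.

```python
import numpy as np


def rectifier_with_dropout(z, drop_ratio=0.5, training=True, rng=None):
    """ReLU activation; during training, zero a random share of units and
    rescale the survivors so the expected activation is unchanged."""
    a = np.maximum(z, 0.0)
    if not training or drop_ratio == 0.0:
        return a
    if rng is None:
        rng = np.random.default_rng()
    keep = rng.random(a.shape) >= drop_ratio
    return a * keep / (1.0 - drop_ratio)
```

At prediction time (`training=False`) the function reduces to a plain rectifier, so dropout only affects how the weights are learned.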

Table 7

Values/types of model parameters used in the DNN

| Sr. No. | DNN parameter | Value/type |
|---|---|---|
| 1 | N folds | 5 |
| 2 | Fold assignment | Modulo |
| 3 | Response column (output variable) | OAPE20 |
| 4 | Activation | Rectifier with dropout |
| 5 | Hidden layers | 100,100,100 |
| 6 | Epochs | 4,000 |
| 7 | Distribution | Gaussian |
| 8 | Categorical encoding | One hot internal |

Figure 5(a) shows the variation in the deviance of the OAPE20 for both training and testing datasets versus the number of epochs. As the number of epochs approaches 4,000, the deviance becomes asymptotic to the x-axis. Figure 5(b) shows the scatter plot of observed versus estimated OAPE20 by the DNN for the training and testing datasets. From Figure 5(b), it is clear that the estimated values, in both testing and training, lie along the line of perfect agreement, implying that the DNN was the best-performing model. This was further substantiated by Table 5, where its CC was the highest and its error values were the lowest among all proposed models.
Figure 5

(a) Scoring deviance and (b) performance of the DNN.


The MLR, MNLR, and published traditional models

The multivariable linear regression (MLR) and multivariable nonlinear regression (MNLR) models were developed using XLSTAT software and are given in Equations (7) and (9). Figure 6 shows an agreement diagram between the results estimated by MLR, MNLR, Luxmi et al. (2022), and Tiwari (2021) and the observed OAPE20 for the training and testing datasets. From Figure 6, MLR and MNLR gave better results than Luxmi et al. (2022) and Tiwari (2021): the points estimated by MLR and MNLR lie near the line of perfect agreement, while the Luxmi et al. (2022) model underestimated, with its data points below the line, and the Tiwari (2021) model overestimated, with its data points above the line, for both training and testing conditions. Besides, Table 5 shows that both the MLR and MNLR models had smaller errors than the Luxmi et al. (2022) and Tiwari (2021) models. For the testing data, the MLR model, with a higher CC (0.9253) and lower RMSE (0.1197), MSE (0.01433), and MAE (0.0189), performed better than the MNLR model (CC = 0.9045, RMSE = 0.3106, MSE = 0.06953, and MAE = 0.2624).
Figure 6

Performance of MLR, MNLR, and other traditional models.


The comparison

The models developed using the gabion spillway datasets were compared using the appraisal parameters shown in Table 5. All the proposed models, namely MLR, MNLR, GBM, NN, and DNN, efficiently predicted the OAPE20 of the gabion spillways. However, the DNN model outperformed all the other proposed models, with the highest CC value and the lowest error values, as shown in Table 5. All proposed soft computing models performed better than the traditional models. Nevertheless, both the MLR and MNLR models performed well and could be utilized in estimating the OAPE20.

Furthermore, Figure 7 shows an agreement diagram between the observed and estimated OAPE20 of the gabion spillways using the soft computing models GBM, NN, and DNN. By and large, the majority of the estimated results fall around the line of perfect agreement. Four error lines, at ±10% and ±25%, were also drawn between the estimated and observed values of the OAPE20. Figure 7 shows that most of the values estimated by the NN and DNN lie well within the ±10% error lines in both training and testing cases, but some values of the GBM model lie beyond the ±25% lines. So, barring some estimated points, all values estimated by the soft computing algorithms lie within the ±25% error lines for the training and testing datasets. It can further be inferred from Figure 7 that, for dimensionless datasets, the DNN and NN perform well, as their estimated points lie within the ±10% error band for both training and testing datasets, whereas the other considered ML model (GBM) gives values within the ±25% error band.
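The share of estimates falling inside an error band can be quantified directly rather than read from the plot. The sketch below assumes the band is defined relative to the observed value, that is, an estimate is inside the ±10% band when it differs from the observation by at most 10% of the observation; the function name and the demo values are hypothetical.

```python
import numpy as np


def fraction_within_band(observed, estimated, band):
    """Fraction of estimates within +/- band (e.g. 0.10) of the observed values."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(e - o) <= band * np.abs(o)))


# Hypothetical demo values, not taken from the experimental dataset.
obs = [0.10, 0.20, 0.30, 0.40]
est = [0.105, 0.23, 0.30, 0.52]
within_10 = fraction_within_band(obs, est, 0.10)
within_25 = fraction_within_band(obs, est, 0.25)
```

In this demo the relative errors are 5%, 15%, 0%, and 30%, so half of the points lie inside the ±10% band and three quarters inside the ±25% band.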
Figure 7

Performance of all proposed machine learning models.


The above contention is further corroborated by Figure 7, where the values predicted by the DNN model lie nearest the observed values, followed by the NN and GBM models. Table 5 also substantiates that the correlation coefficient (CC = 0.9713) was the highest and the errors (RMSE = 0.1684, MSE = 0.0283, and MAE = 0.1532) were the lowest for the DNN model, making it the best-performing model, followed by the NN model. The GBM model performed on par with the other proposed ML-based models. Among the traditional models, both the MLR and MNLR models gave comparable performance, but the previously existing models (Tiwari 2021; Luxmi et al. 2022) performed very poorly for the training and testing datasets. The summary statistics of the results predicted by all proposed models are presented in Table 8 for the training and testing datasets.

Table 8

Summary details of estimated values of the OAPE20

| Models | Min | Max | Mean | Std. | Kurtosis | Skewness |
|---|---|---|---|---|---|---|
| Train data | | | | | | |
| Tiwari (2021) | 0.7911 | 1.4801 | 1.180 | 0.1573 | −0.06329 | −0.7445 |
| Luxmi et al. (2022) | 0.3544 | 1.3216 | 0.8660 | 0.2517 | −0.9375 | −0.1427 |
| MLR | 0.1251 | 0.5265 | 0.3420 | 0.1004 | −1.0333 | −0.30653 |
| MNLR | 0.1696 | 0.4674 | 0.3474 | 0.0672 | −0.1500 | −0.81250 |
| NN | 0.0224 | 0.5163 | 0.3447 | 0.1047 | 0.2540 | −0.7472 |
| GBM | 0.0130 | 0.5840 | 0.3302 | 0.1294 | −0.2996 | −0.5611 |
| DNN | 0.0310 | 0.5284 | 0.3363 | 0.1119 | 0.05137 | −0.61634 |
| Test data | | | | | | |
| Tiwari (2021) | 0.8270 | 1.4706 | 1.1825 | 0.1782 | −0.4004 | −0.6294 |
| Luxmi et al. (2022) | 0.4340 | 1.3249 | 0.8341 | 0.2227 | −0.2332 | 0.3886 |
| MLR | 0.1338 | 0.5737 | 0.3357 | 0.1239 | −1.1078 | −0.09865 |
| MNLR | 0.2077 | 0.4867 | 0.3448 | 0.0739 | −0.7030 | −0.3374 |
| NN | 0.0887 | 0.5394 | 0.3234 | 0.1364 | −1.2097 | −0.2527 |
| GBM | 0.0617 | 0.599 | 0.3521 | 0.1293 | −0.4030 | −0.7237 |
| DNN | 0.0668 | 0.5426 | 0.3195 | 0.1446 | −1.1685 | −0.2935 |
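
The columns of Table 8 can be obtained from a vector of estimated values as sketched below. The estimators are an assumption: population (divide-by-n) moments and excess kurtosis are used here, which is consistent with the negative kurtosis values in the table but is not confirmed by the text. The function name and sample values are hypothetical.

```python
import math


def summarize(values):
    """Min, max, mean, std, excess kurtosis, and skewness of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with a divide-by-n (population) convention (assumption).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "Min": min(values),
        "Max": max(values),
        "Mean": mean,
        "Std": math.sqrt(m2),
        "Kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis: 0 for a normal distribution
        "Skewness": m3 / m2 ** 1.5,
    }


# Hypothetical values for illustration only.
summary = summarize([1.0, 2.0, 3.0, 4.0, 5.0])
print(summary)
```

Statistical packages differ in whether they apply small-sample corrections, so values computed this way may differ slightly from those produced by other software.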

Sensitivity study

The relative importance of the independent input parameters, obtained using the best-performing DNN model, is presented in Figure 8. The Reynolds number (Re) was the most sensitive parameter, with the maximum scaled importance of 1, whereas the ratio of the mean size of the gabion materials to the length of the gabion spillway (d50/L) was the least sensitive, with the minimum scaled importance of 0.78.
Figure 8

Sensitivity study of input parameters.
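
The text does not state how the scaled importance was computed. As one possible illustration, the sketch below uses permutation importance (a swapped-in technique, not necessarily the authors' method) and scales the result so that the most important input reads 1, as in Figure 8. The toy model, data, and function names are hypothetical.

```python
import random


def scaled_permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Permutation importance per input column, scaled so the largest equals 1."""
    rng = random.Random(seed)

    def mse(candidate_rows):
        return sum(
            (predict(r) - t) ** 2 for r, t in zip(candidate_rows, targets)
        ) / len(targets)

    baseline = mse(rows)
    raw = []
    for j in range(len(rows[0])):
        total_increase = 0.0
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between input j and the target
            shuffled = [
                list(r[:j]) + [v] + list(r[j + 1:]) for r, v in zip(rows, column)
            ]
            total_increase += mse(shuffled) - baseline
        # Average error increase over the repeats; negative values are clipped to 0.
        raw.append(max(total_increase / n_repeats, 0.0))

    top = max(raw)
    return [value / top if top > 0 else 0.0 for value in raw]


# Hypothetical stand-in for a trained model: the first input dominates the output.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(50)]
y = [toy_model(row) for row in X]

importance = scaled_permutation_importance(toy_model, X, y)
print(importance)
```

With a real model, `predict` would wrap the trained estimator and the columns would correspond to the dimensionless inputs such as Re and d50/L.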


In the present study, the OAPE of gabion spillways (OAPE20) was modeled using traditional methods (MLR and MNLR), existing empirical relations, and the proposed ML techniques (GBM, NN, and DNN) on an experimental dataset. The following key conclusions were drawn.

  • The three ML techniques were evaluated using statistical indices, namely the CC, RMSE, MSE, and MAE, on experimental datasets to estimate the OAPE20 of the gabion spillways. Of the three, the DNN model performed best in both training and testing, with the highest CC and the lowest errors in training (CC = 0.9744, RMSE = 0.1430, MSE = 0.02040, and MAE = 0.1297) and in testing (CC = 0.9713, RMSE = 0.1684, MSE = 0.0283, and MAE = 0.1532).

  • The NN model (CC = 0.9368, RMSE = 0.1867, MSE = 0.0348, and MAE = 0.1639 in testing) showed sufficient potential for estimating the OAPE20 of the gabion spillways and was the second-best-performing model after the DNN. For the NN, the number of neurons in the hidden layer was found to be the more sensitive hyperparameter, with an optimum value of eight.

  • The GBM model (CC = 0.93167, RMSE = 0.1850, MSE = 0.0342, and MAE = 0.1665 in testing) could be used to estimate the OAPE20 of the gabion spillways but was the lowest-performing of the proposed ML-based models.

  • Both the MLR and MNLR models also performed well, but the MLR (CC = 0.9253 in testing) performed better than the MNLR (CC = 0.9045) in estimating the oxygen aeration performance efficiency (OAPE20) of the gabion spillways. However, the previously existing models performed poorly on these datasets, with high errors and low correlation values.

  • The sensitivity study suggested that the Reynolds number (Re) was the most sensitive parameter, whereas the ratio of the mean size of the gabion material to the length of the gabion spillway (d50/L) was the least sensitive.
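
As a quick consistency check on the reported error metrics, and assuming the standard relation MSE = RMSE², squaring the reported RMSE values gives:

```latex
\begin{align*}
\text{DNN (training):}\quad & 0.1430^{2} = 0.020449 \approx 0.0204 \\
\text{DNN (testing):}\quad  & 0.1684^{2} = 0.02835856 \approx 0.0283 \\
\text{NN (testing):}\quad   & 0.1867^{2} = 0.03485689 \approx 0.0348 \\
\text{GBM (testing):}\quad  & 0.1850^{2} = 0.034225 \approx 0.0342
\end{align*}
```

These agree with the reported MSE values to within the rounding of the published figures.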

R.S. is thankful to the Ministry of Human Resources, Government of India, and the Director, National Institute of Technology Kurukshetra (Haryana), for the financial support of the present work through the Master's degree (MTech) scholarship (32012514).

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Aal G. M. A., Fahmy M. R., Elznikhely E. A. & El-Tohamy E. 2019 Energy dissipation and discharge coefficient over stepped gabion and buttress gabion spillway. Technology 10 (4), 260–267.
Baylar A. & Bagatur T. 2000 Aeration performance of weirs. Water SA 26 (4), 521–526.
Baylar A., Unsal M. & Ozkan F. 2011 GEP modeling of oxygen transfer efficiency prediction in aeration cascades. KSCE Journal of Civil Engineering 15 (5), 799–804.
Bishop C. M. 1995 Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK.
Chanson H. 2002 Hydraulics of Stepped Chutes and Spillways. CRC Press, Lisse, The Netherlands.
Chinnarasri C., Donjadee S. & Israngkura U. 2008 Hydraulic characteristics of gabion-stepped weirs. Journal of Hydraulic Engineering 134 (8), 1147–1152.
Emiroglu M. E. & Baylar A. 2003 An investigation of effect of stepped chutes with end sill on aeration performance. Water Quality Research Journal 38 (3), 527–539.
Fischer M. M. 1998 Computational neural networks: a new paradigm for spatial analysis. Environment and Planning A 30 (10), 1873–1891.
Freund Y. & Schapire R. E. 1997 The strength of weak learnability. Journal of Computer and System Sciences 55, 119–139.
Gulliver J. S., Thene J. R. & Rindels A. J. 1990 Indexing gas transfer in self-aerated flows. Journal of Environmental Engineering 116 (3), 503–523.
Kells J. A. 1994 Energy dissipation at a gabion weir with through flow and overflow. In: Ann. Conference Can. Soc. Civ. Engrg. pp. 26–35.
Kumar M., Tiwari N. K. & Ranjan S. 2021 Experimental study on oxygen mass transfer characteristics by plunging hollow jets. Arabian Journal for Science and Engineering 46 (5), 4521–4532.
Kumar M., Tiwari N. K. & Ranjan S. 2022 Soft computing based predictive modelling of oxygen transfer performance of plunging hollow jets. ISH Journal of Hydraulic Engineering 28 (sup1), 223–233.
Lewis W. K. & Whitman W. G. 1924 Principles of gas absorption. Industrial & Engineering Chemistry 16 (12), 1215–1220.
Luxmi K. M., Tiwari N. K. & Ranjan S. 2022 Application of soft computing approaches to predict gabion weir oxygen aeration efficiency. ISH Journal of Hydraulic Engineering 1–15. https://doi.org/10.1080/09715010.2022.2050311.
Markofsky M. & Kobus H. 1978 Unified presentation of weir-aeration data. Journal of the Hydraulics Division 104 (4), 562–568.
Nigrin A. 1993 Neural Networks for Pattern Recognition. MIT Press. https://doi.org/10.7551/mitpress/4923.001.0001.
Peyras L. A., Royet P. & Degoutte G. 1992 Flow and energy dissipation over stepped gabion weirs. Journal of Hydraulic Engineering 118 (5), 707–717.
Raikar R. V. & Kamatagi P. B. 2015 Use of hydraulic phenomena in enhancement of dissolved oxygen concentration. International Journal of Research in Engineering and Technology 4 (2), 568–574.
Salmasi F., Chamani M. R. & Farsadi Z. D. 2012 Experimental study of energy dissipation over stepped gabion spillways with low heights.
Singh A., Singh B. & Sihag P. 2021 Experimental investigation and modeling of aeration efficiency at labyrinth weirs. Journal of Soft Computing in Civil Engineering 5 (3), 15–31.
Srinivas R. & Tiwari N. K. 2022 Oxygen aeration efficiency of gabion spillway by soft computing models. Water Quality Research Journal 57 (3), 215–232.
Tiwari N. K. & Sihag P. 2020 Prediction of oxygen transfer at modified Parshall flumes using regression models. ISH Journal of Hydraulic Engineering 26 (2), 209–220.
Verma A., Ranjan S., Ghanekar U. & Tiwari N. K. 2022 Soft computing techniques for predicting aeration efficiency of gabion stepped weir. In: Proceedings of the International Conference on Industrial and Manufacturing Systems (CIMS-2020). Springer, Cham, pp. 117–122.
Wormleaton P. R. & Soufiani E. 1998 Aeration performance of triangular planform labyrinth weirs. Journal of Environmental Engineering 124 (8), 709–719.
Wuthrich D. & Chanson H. 2015 Aeration performances of a gabion stepped weir with and without capping. Environmental Fluid Mechanics 15 (4), 711–730. doi:10.1007/s10652-014-9377-9.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).