## Abstract

This paper discusses the application and comparison of machine learning algorithms, namely the gradient boosting machine (GBM), neural network (NN), and deep neural network (DNN), in estimating the oxygen aeration performance efficiency (OAPE_{20}) of gabion spillways. In addition, traditional equations, namely the developed multivariable linear regression (MLR) and multivariable nonlinear regression (MNLR), along with previously published models, were employed in estimating the OAPE_{20} of gabion spillways. Results in the testing phase showed that the DNN, with the highest correlation coefficient (CC = 0.9713) and the lowest errors (root mean square error (RMSE) = 0.1684, mean squared error (MSE) = 0.0283, and mean absolute error (MAE) = 0.1532), gave the best results in estimating the OAPE_{20} of gabion spillways. The other applied models (GBM, NN, MLR, and MNLR) gave comparable results on the statistical appraisal metrics, whereas the previously published models performed poorly, with the lowest correlation and the highest errors. The datasets employed here were collected by conducting experiments. From the relative significance of the input parameters, the Reynolds number (Re) was observed to be the most influential parameter, while the ratio of the mean size of the gabion material to the length of the gabion spillway (*d*_{50}/*L*) had the least impact on the OAPE_{20} of the gabion spillways.

## HIGHLIGHTS

The test for the aeration performance efficiency of gabion spillways was studied.

Machine learning techniques were used for estimating the gabion spillway aeration efficiency.

The estimating potential of DNN, GBM, NN, etc., was compared.

The DNN model outperformed the other proposed models.

A sensitivity test was conducted to know the relative impact of the input variable on the output results.

## INTRODUCTION

Dissolved oxygen (DO) concentration is essential for determining the health of water bodies. The oxygen content of water is depleted by naturally occurring biological and chemical processes, which puts more stress on aquatic life in bodies of water and reduces DO levels (Baylar & Emiroglu 2004). Reaeration increases the oxygen in the water by drawing air from the surrounding atmosphere. However, for the drop structures like weirs, aeration is connected to form, roughness, plunging velocity, and geometry (Luxmi *et al.* 2022). By altering the flow using fluidic structures such as hydraulic jumps and hydraulic drops, it is possible to improve aeration.

Hydraulic structures enhance the amount of DO in a stream by self-aeration, even though the water is in contact with the structure for only a short time (Baylar *et al.* 2011). The primary cause of this faster oxygen transfer is the entrainment of air into the flow in the form of an enormous number of tiny bubbles. These micro air bubbles increase the surface area available for oxygen transfer, facilitating greater oxygen exchange (Baylar & Bagatur 2000). Gabion spillways are widely employed in earthen dams, soil conservation works, retaining structures, river-training works at bends, and other projects. The gabion is stable, flexible, and simple to construct without losing structural integrity.

Furthermore, gabions can withstand considerable differential settlements, if any. Where suitable fill materials are locally available in large quantities, gabion structures are a cost-effective solution. A gabion consists of a wire-mesh basket filled with coarse materials of various sizes and shapes, forming a porous body. Its porosity assists in water drainage, lowering the water load behind the structure. Stepped spillways can also be designed and built as gabions.

Additionally, the flow over the steps cascades down the spillway face, causing aeration (Salmasi *et al.* 2012). Chanson (2002) defined two types of flow over stepped spillways: (a) aerated flow and (b) non-aerated flow. Gabions can reduce water pressure by preserving the permeability and flexibility of the rock fill (Aal *et al.* 2019). Zhang & Chanson (2016) classified flow over stepped spillways into three hydraulic regimes: skimming flow, nappe flow, and transition flow. Wuthrich & Chanson (2015) concluded that a stepped weir gives better results than a flat one. Wormleaton & Soufiani (1998) studied the oxygen aeration potential of triangular and rectangular labyrinth weirs. In the Sahel region of Africa, gabion spillways are the most common type of spillway (Peyras *et al.* 1992). Kells (1994) addressed how the discharge over the crest and the critical depth relate to energy dissipation over gabion-stepped weirs. Chinnarasri *et al.* (2008) performed experimental research on the sensitivity of hydraulic performance to the properties of the material contained in gabions. The gabion spillway also produces turbulence, enhancing aeration and enabling the aerobic decomposition of organic materials. It also helps aquatic life to move and migrate more easily (Luxmi *et al.* 2022). Soft computing data-driven techniques are applied to simulate the effectiveness of oxygen aeration at barrages, spillways, gabion spillways, Parshall flumes, Montana flumes, etc. Soft computing approaches are widely utilized since they are readily available, have built-in intelligence, and are dependable. In addition, they lessen the scale effect and reduce the need to build physical models. In the fields of hydraulic engineering and water resources, a wide variety of soft computing approaches have been used.

Researchers have recently expressed interest in applying machine learning (ML)/soft computing approaches to forecast the aeration effectiveness of gabion weirs with and without steps (Luxmi *et al.* 2022; Srinivas & Tiwari 2022; Verma *et al.* 2022). Baylar *et al.* (2011) studied the prediction of oxygen transfer efficiency in stepped aeration cascades using gene expression programming (GEP) modeling. Aeration efficiency at labyrinth weirs was investigated experimentally and through modeling (Singh *et al.* 2021). The work of Kumar *et al.* (2022) involves simulating the oxygen transfer of plunging jets using fundamental flow patterns or flow characteristics and forecasting the volumetric oxygen transfer coefficient using modeling approaches such as the artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), multivariate adaptive regression splines (MARS), MNLR, and generalised regression neural network (GRNN). Kumar *et al.* (2022) investigated the impact of jet velocity, jet length, water depth, and jet thickness of plunging hollow jets on oxygenating the water in the aeration tank. Empirical correlations are proposed for the determination of the volumetric oxygen transfer coefficient, *K*_{L}*A*, based on jet velocities and jet kinetic powers.

### Theoretical background

#### Oxygen transfer process

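The rate of oxygen transfer across the air-water interface is described by

$$\frac{dI}{dt} = K_{L}\,\frac{A}{V}\,\left(I_{S} - I\right)$$

where *K*_{L} is the bulk liquid film coefficient for oxygen, *I*_{S} is the saturation intensity of DO in water, *I* is the intensity of DO, *A* is the surface area associated with the volume *V* over which transfer occurs, and *t* is the time.

The efficiency of the oxygen transfer at a fluidic device is expressed as

$$\mathrm{OAPE} = \frac{I_{d} - I_{u}}{I_{S} - I_{u}}$$

where OAPE is the oxygen aeration performance efficiency, *I*_{d} is the intensity of DO downstream of the fluidic device, *I*_{u} is the intensity of DO upstream of the fluidic device, and *I*_{S} is the saturation intensity of DO. When the downstream water is supersaturated (*I*_{S} < *I*_{d}), OAPE > 1. Similarly, when the oxygen transfer reaches the saturation value, OAPE = 1, and OAPE = 0 represents no oxygen transfer.

Gulliver *et al.* (1990) presented an equation to illustrate the impact of temperature as

$$1 - \mathrm{OAPE}_{20} = \left(1 - \mathrm{OAPE}\right)^{1/f}, \qquad f = 1.0 + 0.02103\,(T - 20) + 8.261 \times 10^{-5}\,(T - 20)^{2}$$

where OAPE_{20} is the efficiency at 20 °C and *T* is the water temperature in °C.

The transfer can also be expressed through the oxygen deficit ratio '*r*', which Markofsky & Kobus (1978) described as

$$r = \frac{I_{S} - I_{u}}{I_{S} - I_{d}}$$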
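As a rough illustration of how an efficiency value and its 20 °C equivalent might be computed from DO readings, the following Python sketch assumes the efficiency is the ratio of the DO gain across the device to the upstream DO deficit, and uses the commonly cited exponent form of the temperature correction. The function names and the sample readings are hypothetical and are not taken from the experiments.

```python
def oape(i_up, i_down, i_sat):
    """Efficiency from upstream, downstream, and saturation DO (same units).

    Assumed form: (I_d - I_u) / (I_S - I_u).
    """
    if i_sat == i_up:
        raise ValueError("upstream DO equals saturation; efficiency is undefined")
    return (i_down - i_up) / (i_sat - i_up)


def oape_at_20(efficiency, temp_c):
    """Normalise an efficiency measured at temp_c (deg C) to 20 deg C.

    Assumed form: 1 - E20 = (1 - E) ** (1 / f), with
    f = 1.0 + 0.02103 (T - 20) + 8.261e-5 (T - 20)^2.
    Only meaningful for efficiencies that do not exceed 1.
    """
    if efficiency > 1.0:
        raise ValueError("correction is not defined for supersaturated flow")
    dt = temp_c - 20.0
    f = 1.0 + 0.02103 * dt + 8.261e-5 * dt ** 2
    return 1.0 - (1.0 - efficiency) ** (1.0 / f)


# Hypothetical readings in mg/L, for illustration only
e = oape(i_up=1.5, i_down=4.0, i_sat=9.0)
print(round(e, 4), round(oape_at_20(e, temp_c=25.0), 4))
```

At exactly 20 °C the exponent `f` equals 1, so the corrected value coincides with the measured one; for warmer water the corrected value is lower than the measured one.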

#### Importance, aim, objectives, and novelty

The present study aimed to forecast and examine the usefulness of gabion spillways by estimating the oxygen aeration performance efficiency (OAPE_{20}) arising at the air-water interface of the overflow and from seepage through the pervious spillway body. The novelty of the present work has several dimensions. First, the OAPE_{20} was evaluated by conducting experimental tests in which dimensionless parameters were varied, namely the Reynolds number (Re), the Froude number (Fr), the ratio of the mean particle size of the gabion material to the spillway length (*d*_{50}/*L*), and the porosity (*n*) of the gabion materials. Second, the OAPE_{20} was estimated and compared using ML or soft computing models, namely the deep neural network (DNN), neural network (NN), and gradient boosting machine (GBM), built on the observed datasets. Third, these estimated values of the OAPE_{20} were further compared with those of the developed multivariable linear regression (MLR) and multivariable nonlinear regression (MNLR) models and of previously proposed relations. Finally, a sensitivity study was carried out to determine the relative impact of the input parameters on the OAPE_{20}.

## MATERIALS AND METHODS

### Proposed modeling techniques

The present study used experimental data to estimate the aeration performance of gabion spillways with traditional approaches (MLR and MNLR), existing empirical relations, and ML techniques (NN, GBM, and DNN).

### Traditional models

Two regression models were employed in the current investigation to estimate the oxygen aeration performance efficiency (OAPE_{20}): one based on MLR and the other on MNLR.

The MLR model expresses the output variable as a linear combination of the input parameters, in which *p*_{1}, *p*_{2}, *p*_{3}, … are proportionality constants.

The MNLR model has the power-law form

$$X = p\,Y_{1}^{K_{1}}\,Y_{2}^{K_{2}}\,Y_{3}^{K_{3}} \cdots Y_{n}^{K_{n}}$$

where *X* is the secondary variable, considered as the output variable; *p* is the proportionality constant; *Y*_{1}, *Y*_{2}, … *Y*_{n} are the primary variables selected as input parameters; and *K*_{1}, *K*_{2}, *K*_{3}, … *K*_{n} are the exponents.

In the relations fitted through MLR and MNLR, OAPE_{20} is the oxygen aeration performance efficiency at 20 °C, Re is the Reynolds number, Fr is the Froude number, *d*_{50} is the mean size of the materials used in the gabion spillway in cm, and *n* is the porosity in %. Table 1 lists the models derived in the present study and the existing models.

| Sr. No. | Model origin | Model | Comments |
|---|---|---|---|
| 1 | Tiwari (2021) | | Oxygen aeration efficiency for hydraulic jump |
| 2 | Luxmi et al. (2022) | | Oxygen aeration efficiency for gabion weir |
| 3 | MLR | | Present study |
| 4 | MNLR | | Present study |
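The two regression forms can be fitted with ordinary least squares. The sketch below is a minimal illustration in plain Python, not the authors' fitting procedure: the MLR form is assumed to include an intercept, and the power-law MNLR form is fitted by regressing the logarithm of the output on the logarithms of the inputs, which requires strictly positive data. All function names and the sample numbers are hypothetical.

```python
import math


def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares(rows, targets):
    """Least squares with an intercept, solved through the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    ata = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    atb = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(ata, atb)


def fit_mlr(rows, targets):
    """Return [intercept, p1, p2, ...] of the linear model (intercept is assumed)."""
    return least_squares(rows, targets)


def fit_mnlr(rows, targets):
    """Return (p, [K1, K2, ...]) of X = p * Y1^K1 * Y2^K2 * ... via a log-log fit."""
    log_rows = [[math.log(v) for v in r] for r in rows]
    coef = least_squares(log_rows, [math.log(t) for t in targets])
    return math.exp(coef[0]), coef[1:]


# Hypothetical inputs (two predictors) and outputs, for illustration only
inputs = [(1.0, 1.0), (4.0, 2.0), (9.0, 1.0), (16.0, 4.0), (25.0, 3.0)]
outputs = [0.30, 0.35, 0.52, 0.41, 0.55]
print(fit_mlr(inputs, outputs))
print(fit_mnlr(inputs, outputs))
```

The log-log fit minimises the error in the logarithm of the output rather than in the output itself, so its coefficients can differ from those of a direct nonlinear least-squares fit.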

### Gradient boosting machine

Here, *i* iterates over the test data of size *n*, and *n* is the number of data points in *y*.

### NN and DNN

Machine learning includes the NN and the DNN. Both are made up of neurons that resemble those in the nervous system, and each neuron is assigned weights and a bias. These networks are built to function similarly to the neurons in the human brain (Fischer 1998). A brain neuron works by taking in information and generating an output that is used by another cell; NN training works on a similar pattern. The networks simulate behavior by learning from the collected data and predicting outcomes (Nigrin 1993).

The NN employs a large number of highly interconnected nodes (neurons) that work together to solve specific problems, such as forecasting and pattern classification (Bishop 1995). The NN is a very popular soft computing technique and is widely used to solve water resources problems. Both the NN and the DNN comprise an input layer, hidden layers, and an output layer. The basic difference is that a conventional NN has only one hidden layer, whereas the DNN has more than one hidden layer and generally more nodes; that is why it is called a deep neural network. In the current study, the NN model was developed using MATLAB software, while the DNN model was developed using H_{2}O software. The structures of the NN and the DNN are shown in Figure 1(b) and 1(c), respectively.
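The structural difference between a single-hidden-layer NN and a DNN can be seen in a small forward-pass sketch. The code below is illustrative only: it assumes four inputs, uses a rectifier activation in every hidden layer and a linear output, and fills the weights with random placeholders rather than trained values. The layer sizes (one hidden layer of 8 nodes, and three hidden layers of 100 nodes) mirror the settings reported for the two models; the helper names are hypothetical.

```python
import random


def make_network(sizes, seed=0):
    """Build a list of (weights, biases) pairs, one per layer after the input."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(x, layers):
    """Propagate one input vector; rectifier on hidden layers, linear output."""
    activations = list(x)
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        is_output = idx == len(layers) - 1
        activations = z if is_output else [max(0.0, v) for v in z]
    return activations


shallow = make_network([4, 8, 1])            # NN: one hidden layer
deep = make_network([4, 100, 100, 100, 1])   # DNN: three hidden layers

sample = [0.5, 0.2, 0.1, 0.4]  # hypothetical scaled inputs
print(len(shallow) - 1, forward(sample, shallow))
print(len(deep) - 1, forward(sample, deep))
```

The only thing that changes between the two networks is the `sizes` list; the propagation rule is the same, which is why depth and width are treated as tunable settings.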

### Experimental program

*et al.*2003; Tiwari & Sihag 2020; Tiwari 2021). From the chemical mixing tank, water entered the channel through the headbox containing a screen, which dampened the eddies at the entry of the head box, if any. The water inflow was controlled by a regulator fitted in the pipe, as shown in Figure 2.

The water temperature was measured manually using a thermometer with an accuracy of 0.5 °C in the range of −10 to 50 °C, and the DO was measured using the azide-modification method (Winkler method; Raikar & Kamatagi 2015). The configuration and dimensions of the gabion spillway models are shown in Table 2. In each experiment, a gabion spillway model with the appropriate fixing arrangement was installed at the downstream end of the channel. Flow rates ranging from 0.5 to 5.2 L/s were used to test each gabion spillway. The porosity (*n*), Reynolds number (Re), Froude number (Fr), and the ratio of the gabion mean size to the length of the gabion spillway (*d*_{50}/*L*) were calculated from the experimental datasets.

### Methodology

Water was mixed with sodium sulfite (Na_{2}SO_{3}) and cobalt chloride (CoCl_{2}) in the proper amounts to lower the DO level to between 1 and 2 mg/L, and the DO level was computed in each run. Water samples were analyzed using the azide-modification method after the flume had run for 85 s (Kumar *et al.* 2021). Care was also taken to prevent the water from coming into contact with air, except at the models, where air mixing happens due to gravity.

## MODEL PERFORMANCE METRICS

The performance of the proposed models was analyzed using statistical measures, i.e., the correlation coefficient (CC), root mean square error (RMSE), mean squared error (MSE), and mean absolute error (MAE).

*MAE:* The MAE is given by the following relation:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\,\mathrm{OAPE}_{20,i}^{\mathrm{obs}} - \mathrm{OAPE}_{20,i}^{\mathrm{est}}\,\right|$$

where $\mathrm{OAPE}_{20,i}^{\mathrm{obs}}$ is the observed oxygen aeration performance efficiency at 20 °C, $\mathrm{OAPE}_{20,i}^{\mathrm{est}}$ is the estimated oxygen aeration performance efficiency at 20 °C, and *N* is the total number of observations.
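The four appraisal metrics can be computed together from paired observed and estimated values. The sketch below uses the standard textbook definitions, with CC taken as the Pearson correlation coefficient; this is an assumption, since the formulas for CC, RMSE, and MSE are not reproduced here. The function name and the sample values are hypothetical.

```python
import math


def appraisal_metrics(observed, estimated):
    """Return CC (Pearson), RMSE, MSE, and MAE for paired samples."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    mse = sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n
    mae = sum(abs(o - e) for o, e in zip(observed, estimated)) / n
    return {
        "CC": cov / math.sqrt(var_o * var_e),
        "RMSE": math.sqrt(mse),
        "MSE": mse,
        "MAE": mae,
    }


# Hypothetical observed and estimated efficiencies, for illustration only
obs = [0.10, 0.20, 0.30, 0.40]
est = [0.12, 0.18, 0.33, 0.41]
print(appraisal_metrics(obs, est))
```

Because RMSE is the square root of MSE under these definitions, the two always rank models identically; CC measures linear association only and can be high even when estimates are systematically biased.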

| Parameters | Notations | Value | Range (from) | Range (to) | Units |
|---|---|---|---|---|---|
| Gabion spillway height | *P* | 20 | 20 | 20 | cm |
| Gabion spillway width | *B* | 40 | 40 | 40 | cm |
| Gabion spillway length | *L* | 25 | 25 | 25 | cm |
| Gabion mean size | *d*_{50} | 26.4, 29.4, and 49.20 | 26.4 | 49.20 | mm |
| Porosity | *n* | 24, 44, and 57 | 24 | 57 | % |

### Datasets

A total of 161 experimental readings were used to build the models. The input data comprise the Reynolds number (Re), the mean size of the gabion material (*d*_{50}), the Froude number (Fr), the ratio of the mean size of the gabion material to the length of the gabion spillway (*d*_{50}/*L*), and the porosity (*n*); the output is the oxygen aeration performance efficiency (OAPE_{20}). For training, 75% of the total data were taken, and the remaining 25% were used for testing. The statistical summary of the training and testing datasets is shown in Table 3.

| Variables | Min | Max | Mean | Std. | Kurtosis | Skewness |
|---|---|---|---|---|---|---|
| **Train** | | | | | | |
| Re | 2000 | 20,400 | 12836.67 | 5869.848 | −1.1799 | −0.4612 |
| Fr | 7.504 | 22.686 | 11.112 | 3.470 | 2.5780 | 1.6542 |
| *d*_{50}/*L* | 0.04 | 0.314 | 0.138 | 0.0725 | 0.1778 | 0.7393 |
| *n* | 24 | 57 | 41.368 | 8.084 | 0.1666 | 0.2725 |
| OAPE_{20} | 0.013 | 0.559 | 0.341 | 0.116 | 0.0513 | −0.6163 |
| **Test** | | | | | | |
| Re | 3200 | 20,400 | 13,380 | 6209.96 | −1.3793 | −0.4666 |
| Fr | 7.595 | 3200 | 10.583 | 2.514 | −0.5271 | 0.7879 |
| *d*_{50}/*L* | 0.04 | 0.314 | 0.1266 | 0.0684 | 1.0998 | 1.0918 |
| *n* | 24 | 57 | 39.821 | 7.269 | 1.0839 | 0.1998 |
| OAPE_{20} | 0.039 | 0.584 | 0.321 | 0.154 | −0.9897 | −0.2886 |
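A 75/25 split and the summary statistics of each subset could be produced along the following lines. This is a sketch under stated assumptions: the split is a seeded random shuffle (the actual assignment of readings to subsets is not described), the standard deviation is the sample estimate, and skewness and kurtosis use the simple moment-based estimators with excess kurtosis, which may differ slightly from spreadsheet formulas. The function names are hypothetical.

```python
import math
import random


def split(records, train_fraction=0.75, seed=42):
    """Shuffle a copy of the records and cut it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def summary(values):
    """Min, max, mean, sample std, excess kurtosis, and skewness of a sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "std": math.sqrt(m2 * n / (n - 1)),
        "kurtosis": m4 / m2 ** 2 - 3.0,
        "skewness": m3 / m2 ** 1.5,
    }


# 161 placeholder record indices standing in for the experimental readings
train, test = split(range(161))
print(len(train), len(test))
print(summary([0.1, 0.25, 0.3, 0.35, 0.5]))
```

With 161 records, rounding 75% gives 121 training and 40 testing records.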

## RESULTS AND DISCUSSION

The dataset was obtained from experimental observations. The proposed traditional methods (MLR and MNLR), existing predictive equations, and soft computing or ML techniques (GBM, NN, and DNN) were used as modeling techniques.

### The GBM model

The GBM model was developed using the free auto-ML H_{2}O software. Many models were developed by changing the split between training (calibrating) data and testing (validating) data. A total of 161 data points were used for modeling. Finally, 75% training data (124) and 25% testing data (37) were found suitable for obtaining the best model. The GBM's primary goal is to produce a better prediction model by combining several relatively weak prediction models. Each weak learner was added by fitting a new model to give a more precise estimate of the response variable, with the new weak learner being maximally correlated with the negative gradient of the loss function of the entire ensemble. These steps were essential in estimating accurate values at minimum computational cost. The training dataset was split into five folds, each comprising 25 data points, and the testing dataset was likewise split into five folds, each comprising seven data points. The principal parameters were tuned by trial and error, as shown in Table 4.

| Sr. No. | GBM parameter name | Value/type |
|---|---|---|
| 1 | N folds | 5 |
| 2 | Fold assignment | Modulo |
| 3 | ntree | 98 |
| 4 | Max depth | 3 |
| 5 | Distribution | Gaussian |
| 6 | Categorical encoding | Enum |
| 7 | Column sample rate | 0.4 |
| 8 | Row sample per tree | 0.6 |
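The idea of building a strong predictor from many weak learners, each fitted to the negative gradient of the loss, can be sketched in a few lines. The example below is a simplified illustration and not the H_{2}O implementation: it handles a single input feature with at least two distinct values, uses depth-one trees (stumps) instead of depth-three trees, omits row and column sampling, and assumes a learning rate of 0.1, which is not among the reported settings. For squared-error (Gaussian) loss, the negative gradient is simply the residual.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values[:-1], values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(xs, ys, n_trees=98, learning_rate=0.1):
    """Fit an additive ensemble of stumps to the residuals, one stump per round."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # Residuals are the negative gradient of the squared-error loss
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the scaled contribution of every stump."""
    base, stumps, learning_rate = model
    return base + learning_rate * sum(l if x <= t else r for t, l, r in stumps)


# Hypothetical single-feature data, for illustration only
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.50, 0.55]
model = fit_gbm(xs, ys)
print([round(predict(model, x), 3) for x in xs])
```

Each round reduces the training error a little; the number of trees and the learning rate together control how closely the ensemble follows the training data.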

Figure 3(b) shows the scatter plot between the observed and predicted OAPE_{20} by the GBM of the gabion spillways for the training and testing datasets. From Figure 3(b), it was clear that all estimated values, in both testing and training, lay along the perfect line, which implied that the GBM model was performing well. Table 5 further supports this, showing that the GBM could be used to estimate the OAPE_{20} of the gabion spillway, as its CC value was high and its error values were small.

| Proposed approaches | CC | RMSE | MSE | MAE |
|---|---|---|---|---|
| **Train data** | | | | |
| Tiwari (2021) | 0.7599 | 0.9157 | 0.8386 | 0.9139 |
| Luxmi et al. (2022) | 0.5147 | 0.7241 | 0.5243 | 0.7065 |
| MLR | 0.8616 | 0.2228 | 0.0496 | 0.2093 |
| MNLR | 0.8370 | 0.4649 | 0.2162 | 0.0588 |
| NN | 0.9155 | 0.1753 | 0.0307 | 0.1559 |
| GBM | 0.9833 | 0.1383 | 0.0191 | 0.1258 |
| DNN | 0.9744 | 0.1430 | 0.02040 | 0.1297 |
| **Test data** | | | | |
| Tiwari (2021) | 0.9054 | 0.9270 | 0.86111 | 0.1460 |
| Luxmi et al. (2022) | 0.4477 | 0.7160 | 0.5127 | 0.6962 |
| MLR | 0.9253 | 0.1197 | 0.01433 | 0.0189 |
| MNLR | 0.9045 | 0.3106 | 0.06953 | 0.2624 |
| NN | 0.9368 | 0.1867 | 0.0348 | 0.1639 |
| GBM | 0.93167 | 0.1850 | 0.0342 | 0.1665 |
| DNN | 0.9713 | 0.1684 | 0.0283 | 0.1532 |

### The NN model

To estimate the oxygen aeration performance efficiency (OAPE_{20}), the NN approach was used for modeling. The NN consists of multiple layers, and every layer has nodes (neurons). The layers are joined by weighted connections (weight coefficients). Usually, three categories of layers are formed in the artificial NN: the first layer receives the input signal, the hidden (middle) layer evaluates the input weights, and the output layer is the final layer. The NN was established in three stages: in the initial stage, the training data were prepared; the second stage involved arranging and assembling effective network architectures; and the final stage was testing. The number of neurons and hidden layers was selected by trial and error to obtain the desired results. The optimal principal parameters of the NN established in this way are shown in Table 6. Figure 4 shows the NN-based scatter plot between the observed OAPE_{20} and its estimated values for the training and testing datasets. Barring some estimated points for the testing dataset, all estimated points lay near the perfect line. Table 5 provides further evidence that the NN was performing well and could be used in estimating the OAPE_{20} of the gabion spillway, as its CC value was high and its error values were small.

| Sr. No. | NN parameter name | Value/type |
|---|---|---|
| 1 | No. of nodes | 8 |
| 2 | Epochs | 245 |
| 3 | Hidden layer | 1 |
| 4 | Model type | Bayesian regularization (trainbr) |

### The DNN model

The DNN model was also developed using the free auto-ML H_{2}O software. Several models were created by changing the split between training and testing datasets. For modeling, a total of 161 data points were employed. Finally, 25% testing data (40) and 75% training data (121) were considered appropriate for obtaining the best model. In a DNN, the initial stage was to find the number of epochs, which was essential in forecasting accurate values at minimum computational cost. The training dataset was split into five folds, each comprising 25 data points, whereas the testing dataset was split into five folds, each comprising eight data points. The key parameters were tuned by trial and error, as shown in Table 7.

| Sr. No. | DNN parameter name | Value/type |
|---|---|---|
| 1 | N folds | 5 |
| 2 | Fold assignment | Modulo |
| 3 | Response column (output variable) | OAPE_{20} |
| 4 | Activation | Rectifier with dropout |
| 5 | Hidden layers | 100, 100, 100 |
| 6 | Epochs | 4,000 |
| 7 | Distribution | Gaussian |
| 8 | Categorical encoding | One hot internal |
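The modulo fold assignment used for five-fold cross-validation is deterministic: each row goes to the fold given by its row index modulo the number of folds. The sketch below illustrates that rule in plain Python; the function names are hypothetical and the code is not the H_{2}O implementation.

```python
def modulo_folds(n_rows, n_folds=5):
    """Assign row indices to folds by taking the index modulo the fold count."""
    folds = [[] for _ in range(n_folds)]
    for row in range(n_rows):
        folds[row % n_folds].append(row)
    return folds


def cv_splits(n_rows, n_folds=5):
    """Yield (training rows, held-out rows) pairs, one pair per fold."""
    folds = modulo_folds(n_rows, n_folds)
    for k, holdout in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != k for r in fold]
        yield train, holdout


# Fold sizes for a training set of 121 rows
print([len(fold) for fold in modulo_folds(121)])
```

Because the assignment depends only on row order, repeated runs give identical folds, which makes cross-validated comparisons between parameter settings reproducible.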

Figure 5(b) shows the scatter plot between the observed and estimated OAPE_{20} by the DNN of the gabion spillways for the training and testing datasets. From Figure 5(b), it is clear that all estimated values, in both testing and training, lie along the perfect line, implying that the DNN was the best-performing model. This contention was further substantiated by Table 5, where the CC value was the highest and the error values were the lowest among all proposed models.

### The MLR, MNLR, and published traditional models

Figure 6 compares the values estimated by MLR, MNLR, Luxmi *et al.* (2022), and Tiwari (2021) with the observed results of the OAPE_{20} for the training and testing datasets. From Figure 6, it was observed that MLR and MNLR gave better results than Luxmi *et al.* (2022) and Tiwari (2021), as the points estimated by MLR and MNLR lie near the perfect line. The Luxmi *et al.* (2022) model underestimated, with its estimated data points lying below the perfect line, and the Tiwari (2021) model overestimated, with its estimated data points lying above the perfect line, for both training and testing conditions. Table 5 likewise makes clear that both the MLR and MNLR models had smaller errors than the Luxmi *et al.* (2022) and Tiwari (2021) models. However, the MLR model, with a higher CC = 0.9253 and lower RMSE = 0.1197, MSE = 0.01433, and MAE = 0.0189, performed better than the MNLR, with a lower CC = 0.9045 and higher RMSE = 0.3106, MSE = 0.06953, and MAE = 0.2624.

### The comparison

The models developed using the gabion spillway datasets were compared using the appraisal parameters shown in Table 5. All the proposed models, namely MLR, MNLR, GBM, NN, and DNN, predicted the OAPE_{20} of the gabion spillways efficiently. The DNN model outperformed the other proposed models, with the highest CC value and the lowest error values, as shown in Table 5. All proposed soft computing models performed better than the traditional models; nevertheless, both the MLR and MNLR models performed well and could be used to estimate the OAPE_{20}.

The estimated values of the OAPE_{20} fall around the perfect line. Four more error lines, at ±25% and ±10%, were also drawn between the estimated and observed values of the OAPE_{20} of the gabion spillways. Figure 7 showed that most of the values of the OAPE_{20} estimated by the NN and DNN lay well within the ±10% error lines from the perfect agreement line in both training and testing, but some values of the GBM model lay beyond the ±25% lines. So, barring some estimated points, all values estimated by the soft computing algorithms lie within the ±25% error lines for the training and testing datasets. It can further be inferred from Figure 7 that, for dimensionless datasets, the DNN and NN perform well, as their estimated points lie within the ±10% error band for both training and testing datasets, whereas the other ML models considered give values that lie within the ±25% error band.
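The share of estimates falling inside a given error band around the perfect agreement line can be checked numerically as well as visually. The sketch below assumes the band is defined relative to the observed value, so an estimate is inside the ±10% band when it differs from the observation by no more than 10% of that observation. The function name and the sample values are hypothetical.

```python
def fraction_within_band(observed, estimated, band):
    """Fraction of estimates within +/- band (e.g. 0.10) of the observed values."""
    inside = sum(1 for o, e in zip(observed, estimated)
                 if abs(e - o) <= band * abs(o))
    return inside / len(observed)


# Hypothetical observed and estimated efficiencies, for illustration only
obs = [0.20, 0.40, 0.50, 0.30]
est = [0.21, 0.46, 0.50, 0.40]
print(fraction_within_band(obs, est, 0.10))
print(fraction_within_band(obs, est, 0.25))
```

Reporting these fractions alongside the agreement plot turns a visual statement such as "most points lie within ±10%" into a number that can be compared across models.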

The above contention was further corroborated by Figure 7, where it is clear that the values predicted by the DNN model lay nearest the observed values, followed by those of the NN and GBM models. Table 5 further substantiates that the correlation coefficient (CC = 0.9713) was the highest and the errors (RMSE = 0.1684, MSE = 0.0283, and MAE = 0.1532) were the lowest for the DNN model, making it the best-performing model, followed by the NN model. Nevertheless, the GBM model performed on par with the other proposed ML-based models. Among the traditional models, the MLR and MNLR models gave comparable performance, but the previously existing models (Tiwari 2021; Luxmi *et al.* 2022) performed very poorly for the training and testing datasets. The summary statistics of the results predicted by all proposed models are presented in Table 8 for the training and testing datasets.

| Models | Min | Max | Mean | Std. | Kurtosis | Skewness |
|---|---|---|---|---|---|---|
| **Train data** | | | | | | |
| Tiwari (2021) | 0.7911 | 1.4801 | 1.180 | 0.1573 | −0.06329 | −0.7445 |
| Luxmi et al. (2022) | 0.3544 | 1.3216 | 0.8660 | 0.2517 | −0.9375 | −0.1427 |
| MLR | 0.1251 | 0.5265 | 0.3420 | 0.1004 | −1.0333 | −0.30653 |
| MNLR | 0.1696 | 0.4674 | 0.3474 | 0.0672 | −0.1500 | −0.81250 |
| NN | 0.0224 | 0.5163 | 0.3447 | 0.1047 | 0.2540 | −0.7472 |
| GBM | 0.0130 | 0.5840 | 0.3302 | 0.1294 | −0.2996 | −0.5611 |
| DNN | 0.0310 | 0.5284 | 0.3363 | 0.1119 | 0.05137 | −0.61634 |
| **Test data** | | | | | | |
| Tiwari (2021) | 0.8270 | 1.4706 | 1.1825 | 0.1782 | −0.4004 | −0.6294 |
| Luxmi et al. (2022) | 0.4340 | 1.3249 | 0.8341 | 0.2227 | −0.2332 | 0.3886 |
| MLR | 0.1338 | 0.5737 | 0.3357 | 0.1239 | −1.1078 | −0.09865 |
| MNLR | 0.2077 | 0.4867 | 0.3448 | 0.0739 | −0.7030 | −0.3374 |
| NN | 0.0887 | 0.5394 | 0.3234 | 0.1364 | −1.2097 | −0.2527 |
| GBM | 0.0617 | 0.599 | 0.3521 | 0.1293 | −0.4030 | −0.7237 |
| DNN | 0.0668 | 0.5426 | 0.3195 | 0.1446 | −1.1685 | −0.2935 |

### Sensitivity study

The ratio of the mean size of the gabion material to the length of the gabion spillway (*d*_{50}/*L*) proved to be the least sensitive parameter, since it showed the minimum scaled importance of 0.78.

## CONCLUSIONS

In the present study, the OAPE of gabion spillways (OAPE_{20}) was modeled using an experimental dataset with traditional methods (MLR and MNLR), existing empirical relations, and ML techniques (GBM, NN, and DNN). The following key conclusions were drawn.

The performance of the three ML techniques was evaluated using statistical indices, namely the CC, RMSE, MSE, and MAE. All three ML models used experimental datasets to estimate the OAPE_{20} of the gabion spillways. Of these, the DNN model was observed to be the best performing, with CC = 0.9744, RMSE = 0.1430, MSE = 0.02040, and MAE = 0.1297 in training, and the highest CC = 0.9713 together with RMSE = 0.1684, MSE = 0.0283, and MAE = 0.1532 in testing, in comparison to the other proposed models.

This study further showed that the NN model (CC = 0.9368, RMSE = 0.1867, MSE = 0.0348, and MAE = 0.1639 in testing) had sufficient potential for estimating the OAPE_{20} of gabion spillways. It was the second-best-performing model after the DNN; the number of neurons in the hidden layer was found to be the more sensitive setting, and its optimum value was eight.

The GBM model, with CC = 0.93167, RMSE = 0.1850, MSE = 0.0342, and MAE = 0.1665 in testing, could be used to estimate the OAPE_{20} of the gabion spillways but was found to be the least-performing model compared to the other proposed ML-based models.

Both the MLR and MNLR models also performed well, but the MLR, with CC = 0.9253 in testing, performed better than the MNLR, with CC = 0.9045, in estimating the oxygen aeration performance efficiency (OAPE_{20}) of gabion spillways. However, the previously proposed models performed poorly on these datasets, with high errors and low correlation values.

The sensitivity study suggests that the Reynolds number (Re) was the most sensitive parameter, while the ratio of the mean size of the gabion material to the length of the gabion spillway (*d*_{50}/*L*) was the least sensitive parameter.

## ACKNOWLEDGEMENTS

R.S. is thankful to the Ministry of Human Resources, Government of India, and the Director, National Institute of Technology Kurukshetra (Haryana) for the monetary support of the present work for the Master Degree (MTech) scholarship (32012514).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.