Abstract
Accurate prediction of the water surface profile in an open channel is key to solving numerous critical engineering problems. The goal of the current research is to predict the water surface profile of a compound channel with converging floodplains using machine learning approaches, including gene expression programming (GEP), artificial neural networks (ANNs), and support vector machines (SVMs), in terms of both geometric and flow variables, since past studies focused mainly on geometric variables. A novel equation is also proposed using GEP to predict the water surface profile. Statistical indices are used to evaluate the performance of the developed models against the experimental data. The findings demonstrate that the proposed ANN model predicted the water surface profile most accurately, with a coefficient of determination (R2) of 0.999, a root mean square error (RMSE) of 0.003, and a mean absolute percentage error (MAPE) of 0.107%, when compared with GEP, SVM, and previously developed methods. The study confirms the applicability of machine learning approaches in river hydraulics; its distinguishing contribution is a novel GEP-derived equation for forecasting the water surface profile of nonprismatic compound channels.
HIGHLIGHTS
The present study predicted the water surface profile in nonprismatic compound channels with the help of various machine learning approaches.
The water surface profile is found to be affected by many nondimensional geometric and flow variables.
The findings depict that the water surface profile predicted using ANN is in good agreement with the observed water surface profile.
Graphical Abstract
NOTATION
The following symbols are used in this paper:
- B: total width of compound channel;
- b: width of the main channel;
- h: height of the main channel;
- H: flow depth;
- α: width ratio (B/b);
- β: relative flow depth [(H – h)/H];
- δ: aspect ratio (b/h);
- θ: converging angle;
- Xr: relative distance (x/L);
- L: converging length;
- x: distance between two consecutive sections;
- So: longitudinal bed slope;
- Se: energy slope;
- Qr: discharge ratio (Q/Qb);
- Q: discharge at any depth;
- Qb: bankfull discharge;
- Fr: Froude number;
- Re: Reynolds number;
- Ψ: nondimensional water surface profile (H/h).
INTRODUCTION
Because of global population growth, increased human settlement, construction, and activity along river floodplains have led to severe consequences during river floods. River floods cause massive human casualties as well as economic damage. Flood catastrophes account for a third of all natural disaster damage worldwide and for half of all fatalities, and trend analysis reveals that these percentages have grown dramatically (Berz 2000). Flooding occurs when the water running through a channel exceeds the channel's capacity, so flood protection requires precise prediction of the conveyance capacity of natural streams. Consequently, the need for precise prediction of flow parameters during flood conditions, to limit damage and save lives and property, has attracted the interest of academics and engineers in recent years. Various methodologies and procedures have been used to support precise measurement and forecasting of river discharge, velocity distribution, shear stress distribution, and water surface level during overbank flows.
Compound channels are the most common river feature during overbank flow. Along the course of a river, the geometry of the floodplain changes, resulting in a compound channel that is either converging or diverging. Flow in a nonprismatic compound channel is more challenging to reproduce because more momentum is transferred from the main channel to the floodplains. Sellin (1964), Myers & Elsawy (1975), Knight et al. (2010), and Khatua et al. (2012) explored flow models of straight and meandering prismatic two-stage channels, but little is known about nonprismatic compound channels. A converging channel shape causes the flow on floodplains to rise, while the flow on diverging floodplains is reduced (James & Brown 1977). Bousmar & Zech (2002), Bousmar et al. (2004), Rezaei (2006), and Rezaei & Knight (2009) studied compound channels with symmetrically narrowing floodplains and found additional head loss and transfer of momentum from the main channel to the floodplains. Asymmetric geometry with a greater convergence rate was examined by Proust et al. (2006). A greater convergence angle (22°) results in increased mass transfer and head loss. Chlebek et al. (2010) studied the flow behavior of skewed, two-stage converging, and diverging channels. They observed increased head losses due to mass and momentum transfer, homogenization of the velocity on contracting floodplains, and an increased velocity gradient on expanding floodplains. Due to changes in the flow force from one subsection to another between the main channel and floodplains, there are noticeable variations in the flow distribution. In compound channels with nonprismatic floodplains, Rezaei & Knight (2011) investigated depth-averaged velocity, local velocity distributions, and boundary shear stress distributions at various convergence angles. The depth-averaged velocities show how contractions affect velocity distributions: velocity increases near the main channel walls, most notably in the second half of the convergence reach, and for high relative depths the lateral flow entering the main channel has a visible effect. Yonesi et al. (2013) investigated the impact of floodplain roughness on overbank flow in compound channels with nonprismatic floodplains. The velocity gradient between the main channel and the floodplain in the middle and at the end of the divergence reach is lowered by raising the depth ratio or lowering the roughness ratio. The velocity gradient increased as the angle of divergence increased. The shear stress gradient rises as the surface roughness of the floodplain increases.
Naik & Khatua (2016) developed a multivariate regression model to predict the water surface profile for different compound channels with nonprismatic floodplains using nondimensional geometric factors. The model agrees closely with both the experimental data and the findings of other researchers. In nonprismatic compound channels, the effect of flow parameters on the water surface profile and the variation of flow characteristics at higher flow rates have received much less attention. Consequently, additional experiments are carried out in the present study at higher flow rates on a nonprismatic compound channel with converging floodplains to simulate the water surface profile.
Evaluating the relationships between dependent and independent variables by developing a water surface profile model with mathematical, analytical, or numerical methods is complicated. Such models become cumbersome and time-consuming as they grow. Machine learning approaches, by contrast, can reduce both the time spent conducting experiments and the time spent on labor-intensive computations. Flow in compound channels is therefore increasingly estimated using support vector machines (SVM), gene expression programming (GEP), artificial neural networks (ANN), adaptive neuro-fuzzy inference systems (ANFIS), and M5 decision tree models (Seckin 2004; Unal et al. 2010; Sahu et al. 2011; Zahiri & Azamathulla 2014; Najafzadeh & Zahiri 2015; Parsaie et al. 2017). The capacity of GEP to generate explicit mathematical correlations distinguishes it from other soft computing approaches such as ANN and SVM (Cousin & Savic 1997; Drecourt 1999; Savic et al. 1999; Whigham & Crapper 1999, 2001; Babovic & Keijzer 2002; Karimi et al. 2015). However, the use of GEP in river engineering has received far less attention (Harris et al. 2003; Giustolisi 2004; Guven & Gunal 2008; Guven & Aytek 2009; Azamathulla et al. 2013; Pradhan & Khatua 2017b). Parsaie et al. (2015) used the SVM technique to predict the discharge in a compound open channel. Their analysis of the error indices showed that the SVM achieved the highest level of accuracy. Khuntia et al. (2018) developed an ANN model for predicting the boundary shear stress distribution in straight compound channels using the most influential parameters, such as width ratio, relative flow depth, aspect ratio, Reynolds number, and Froude number. Back-propagation neural network (BPNN) models performed well over global ranges of independent parameter values.
Das et al. (2019) developed an equation for estimating discharge in diverging and converging compound channels; their results support the use of GEP. Mohanta & Patra (2021) developed a model equation for calculating discharge in meandering compound channels, validating the use of GEP over the classic channel division technique. For meandering compound channels with relative roughness, Mohanta et al. (2021) used several AI methods, including multivariate adaptive regression splines (MARS), a group method of data handling neural network (GMDH-NN), and GEP, to develop model equations. Compared with GEP and MARS, the GMDH-NN model predicted the values most accurately. To simulate the flow of the Hablehroud River in north-central Iran, Esmaeili-Gisavandani et al. (2021) studied five hydrological models, including the soil and water assessment tool (SWAT), identification of unit hydrograph and component flows from rainfall, evapotranspiration, and streamflow (IHACRES), Hydrologiska Byråns Vattenbalansavdelning (HBV), and the Australian water balance model (AWBM). The calibration-phase results for SWAT, IHACRES, and HBV were good. Only the SWAT model, however, performed well and outperformed the other models throughout the validation phase. The GEP combination approach may combine results from other, less accurate models to give a better indication of river flow. To forecast the snow depth (SD) one day in advance at the North Fork Jocko snow telemetry (SNOTEL) station in Missoula, Montana, United States, Adib et al. (2021a) proposed a novel algorithm that combines various wavelet transform (WT) approaches, including discrete wavelet transform (DWT), maximal overlap discrete wavelet transform (MODWT), and multiresolution-based MODWT (MODWT-MRA). The findings confirmed the superiority of wavelet-based models over standalone ones.
This shows that the wavelet-based model merits further investigation for its potential to deliver useful information across snow-covered areas. Adib et al. (2021b) demonstrated the use of wavelet transforms, such as discrete wavelet transform (DWT), maximal overlap discrete wavelet transform (MODWT), multiresolution-based MODWT (MODWT-MRA), and wavelet packet transform (WP), in combination with artificial intelligence (AI)-based models, such as multilayer perceptrons, radial basis functions, adaptive neuro-fuzzy inference systems (ANFIS), and gene expression programming, to retrieve snow depth (SD) from the National Snow and Ice Data Center. According to the findings, the WP combined with ANFIS (WP-ANFIS) performed better than the other analyzed models in terms of statistical analysis. Mohseni & Naseri (2022) used ANN and SVM to estimate the water surface profile in compound channels with vegetated floodplains. According to the results, the SVM algorithm performed better than the ANN and regression models. Sensitivity analysis showed that the water surface profile depended mainly on relative discharge and relative depth. Naik et al. (2022) proposed a novel equation using GEP to predict the water surface profile in converging compound channels using geometric variables.
In past studies, the water surface profile of compound channels with converging floodplains was predicted in terms of geometric variables using the GEP technique only. Therefore, this research has been conducted to predict the water surface profile of compound channels with converging floodplains in terms of both geometric and flow parameters using machine learning techniques such as GEP, ANN, and SVM. Comparisons among these techniques and approaches from other researchers are made with the help of statistical analysis to evaluate the effectiveness of the developed models for predicting the water surface profile of nonprismatic compound channels.
MATERIALS AND METHODS
Data source
Verified Test Channel | Type of Channel | Converging Angle (θ) | Longitudinal Slope (So) | Cross-sectional Geometry | Total Channel Width (B) (m) | Main Channel Width (b) (m) | Main Channel Depth (h) (m) | Converging Length (L) (m) | Aspect Ratio (δ) |
---|---|---|---|---|---|---|---|---|---|
Channel 1 Rezaei (2006) | Converging | 11.31° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | 2 | 7.96 |
Channel 2 Rezaei (2006) | Converging | 3.81° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | 6 | 7.96 |
Channel 3 Rezaei (2006) | Converging | 1.91° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | 6 | 7.96 |
Channel 1 Naik & Khatua (2016) | Converging | 5° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 2.28 | 5 |
Channel 2 Naik & Khatua (2016) | Converging | 9° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 1.26 | 5 |
Channel 3 Naik & Khatua (2016) | Converging | 12.38° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 0.84 | 5 |
Present Channel | Converging | 4° | 0.001 | Rectangular | 1.0 | 0.5 | 0.25 | 3.60 | 2 |
Experimental setup
Theoretical background
This study aims to predict the water surface profile of compound channels with different converging floodplains. The flow is found to be non-uniform throughout the nonprismatic section and remains so until it reaches the prismatic zone. These channels have a uniformly smooth surface on their main channel and floodplains, with a Manning's n of 0.013 for all of these smooth surfaces. The main influencing factors, namely width ratio (α), relative flow depth (β), aspect ratio (δ), converging angle (θ), relative distance (Xr), longitudinal slope (So), energy slope (Se), discharge ratio (Qr), Froude number (Fr), and Reynolds number (Re), are taken into consideration when estimating the water surface profile of nonprismatic compound channels. This range of parameters allows the model equation to be applied to diverse compound channels.
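As an illustration of how the ratio-type inputs follow from the definitions in the NOTATION list, the sketch below computes α, β, δ, Xr, Qr, and Ψ for one cross-section. The geometry (B, b, h, L) is taken from the Present Channel row of the data-source table; the flow depth, distance, and discharges are hypothetical values chosen only for the example, and the names `Section` and `nondimensional_inputs` are not from the paper. Fr, Re, So, and Se are omitted because their computation is not spelled out in this section.

```python
from dataclasses import dataclass


@dataclass
class Section:
    B: float   # total width of compound channel (m)
    b: float   # width of the main channel (m)
    h: float   # height of the main channel (m)
    H: float   # flow depth (m), hypothetical value in the example below
    x: float   # distance x used in Xr = x/L (m), hypothetical value below
    L: float   # converging length (m)
    Q: float   # discharge (m^3/s), hypothetical value below
    Qb: float  # bankfull discharge (m^3/s), hypothetical value below


def nondimensional_inputs(s: Section) -> dict:
    """Ratio-type parameters exactly as defined in the NOTATION list."""
    if s.H <= s.h:
        raise ValueError("overbank flow requires H > h")
    return {
        "alpha": s.B / s.b,        # width ratio B/b
        "beta": (s.H - s.h) / s.H, # relative flow depth (H - h)/H
        "delta": s.b / s.h,        # aspect ratio b/h
        "Xr": s.x / s.L,           # relative distance x/L
        "Qr": s.Q / s.Qb,          # discharge ratio Q/Qb
        "psi": s.H / s.h,          # nondimensional water surface profile H/h
    }


# Geometry from the Present Channel row; H, x, Q, Qb are illustrative only.
present = Section(B=1.0, b=0.5, h=0.25, H=0.3125, x=1.8, L=3.6, Q=0.06, Qb=0.04)
params = nondimensional_inputs(present)
print(params)
```

The aspect ratio computed this way reproduces the δ = 2 listed for the Present Channel, which is a quick consistency check on the table.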
GEP model
Ferreira first proposed the GEP approach in 1999. It combines genetic programming (GP) and genetic algorithms (GA): simple linear chromosomes of fixed length are expressed as branching structures of varying size and shape (similar to the parse trees of genetic programming). Genotype and phenotype can be distinguished in this methodology because all branch structures, despite their varying sizes and shapes, are encoded in linear chromosomes of a predetermined length. GEP supports multigenic chromosomes, which allows increasingly sophisticated equations to be evolved by dividing them into several sub-equations. The process involves successive generations of genes, fitness-based selection, and the introduction of genetic diversity through one or more genetic operators. In this study, an equation for the nondimensional water surface profile parameter (Ψ) of a nonprismatic compound channel is developed using GEP with the assistance of GeneXproTools 5.0 (2014). Models are developed and selected according to how well they fit the training and testing datasets. The selected models are reproduced in GEP using one or more genetic operators, such as mutation or recombination. A concise conceptual outline of GEP is given in previous studies (Mallick et al. 2020).
The relationship in Equation (23) expresses the water surface profile of nonprismatic compound channels as a function of the geometric and hydraulic factors. The modeling procedure uses Ψ as the target value and the ten independent factors of Equation (23) (α, β, δ, θ, Xr, So, Se, Qr, Fr, Re) as input variables. The structure of the model is built with the four basic arithmetic operators (+, −, ×, /). In all, 396 datasets are used, distributed at random between the two stages of the modeling process: 70% of the data are used for training and the remaining 30% for testing. The root mean squared error (RMSE) served as the fitness function (Ei), and the fitness (fi) was determined using Equation (24), which defines the total sum of all errors relative to the target value. The initial model was developed using only one gene and two head sizes as its starting point. The number of genes and the head size were then gradually increased in each run, and the results for the training and testing datasets were recorded after each iteration. No significant improvement in either training or testing performance was observed for head sizes greater than 12 or for more than six genes.
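The random 70/30 partition and the RMSE error measure described above can be sketched as follows. This is a minimal illustration rather than the procedure implemented in GeneXproTools: the seed, the rounding of the training count, and the function names `split_indices` and `rmse` are assumptions made for the example, and the mapping from RMSE to fitness (Equation (24)) is not reproduced here.

```python
import math
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# 396 datasets, as stated in the text.
train_idx, test_idx = split_indices(396)
print(len(train_idx), len(test_idx))
```

With 396 datasets and this rounding, the split gives 277 training and 119 testing samples; the exact counts used in the study may differ slightly depending on how the software rounds.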
Description of Parameter | Parameter Setting |
---|---|
Function Set | +, −, ×, / |
Number of Chromosomes | 30 |
Head Size | 12 |
Number of Genes | 6 |
Gene Size | 38 |
Linking Function | Addition |
Fitness Function | RMSE |
Program Size | 80 |
Literals | 37 |
Number of Generations | 15,000 |
Constants per Gene | 10 |
Data Type | Floating-point |
Mutation | 0.00138 |
Inversion | 0.00546 |
Gene Recombination Rate | 0.00277 |
Insertion Sequence (IS) Transposition Rate | 0.00546 |
Root Insertion Sequence (RIS) Transposition Rate | 0.00546 |
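The Gene Size entry in the table can be cross-checked against the Head Size. The sketch below assumes the standard GEP relation for the tail length, t = h(n − 1) + 1, where h is the head size and n is the maximum arity of the function set (2 for +, −, ×, /), and further assumes that each gene carries a constants domain as long as its tail, as in the random-numerical-constants variant of GEP. The function name `gep_gene_size` is hypothetical.

```python
def gep_gene_size(head_size, max_arity, uses_constants=True):
    """Gene length = head + tail (+ constants domain, assumed equal to the tail)."""
    tail = head_size * (max_arity - 1) + 1   # standard GEP tail length
    constants_domain = tail if uses_constants else 0
    return head_size + tail + constants_domain


# Head size 12 and binary arithmetic operators, as in the parameter table.
gene_size = gep_gene_size(head_size=12, max_arity=2)
chromosome_length = 6 * gene_size  # six genes per chromosome
print(gene_size, chromosome_length)
```

Under these assumptions the gene size comes out as 12 + 13 + 13 = 38, in agreement with the table.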
ANN model
The neural network was structured with ten neurons in the input layer, ten neurons in the hidden layer, and one neuron in the output layer. Figure 8 shows the neural network parameters, such as gradient, mean, and validation checks, together with their changes during the training stage. As the number of generations increases, the gradient and mean decrease and the number of validation checks increases, leading to convergence of the model. Figure 9 depicts the error monitored in the training, testing, and validation phases; the training error fell quickly at the beginning and then decreased progressively more slowly. The network reached convergence after 43 generations, at which point the error was at its minimum.
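The 10-10-1 layout described above can be illustrated with a small feed-forward sketch. Only the layer sizes come from the text; the tanh hidden activation, the linear output, the weight initialization range, and the function names are assumptions made for this example, and no training algorithm is shown.

```python
import math
import random


def init_network(n_in=10, n_hidden=10, n_out=1, seed=0):
    """Create weights and biases for a single-hidden-layer network."""
    rng = random.Random(seed)

    def layer(n_from, n_to):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_from)]
                   for _ in range(n_to)]
        return {"W": weights, "b": [0.0] * n_to}

    return [layer(n_in, n_hidden), layer(n_hidden, n_out)]


def forward(network, inputs):
    """Forward pass: tanh hidden layer (assumed), linear output layer (assumed)."""
    hidden_layer, output_layer = network
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(hidden_layer["W"], hidden_layer["b"])]
    return [sum(w * a for w, a in zip(row, hidden)) + b
            for row, b in zip(output_layer["W"], output_layer["b"])]


def count_parameters(network):
    """Total number of trainable weights and biases."""
    return sum(len(l["W"]) * len(l["W"][0]) + len(l["b"]) for l in network)


net = init_network()
# Ten inputs in the order (alpha, beta, delta, theta, Xr, So, Se, Qr, Fr, Re);
# the values here are placeholders, not measured data.
psi_estimate = forward(net, [0.1] * 10)[0]
print(count_parameters(net), psi_estimate)
```

A network of this size has 10 × 10 + 10 weights and biases in the hidden layer and 10 + 1 in the output layer, 121 trainable parameters in total, which is small relative to the 396 datasets available.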
SVM model
Statistical measures
RESULTS AND DISCUSSION
Statistical parameters | ANN Model (Training) | ANN Model (Testing) | SVM Model (Training) | SVM Model (Testing) | GEP Model | Naik et al. Method (2022) | Naik & Khatua Method (2016) |
---|---|---|---|---|---|---|---|
R2 | 0.999 | 0.990 | 0.998 | 0.996 | 0.99 | 0.99 | 0.896 |
MSE | 0.0001 | 0.0003 | 0.0002 | 0.0004 | 0.0008 | 0.0007 | 0.0019 |
RMSE | 0.003 | 0.017 | 0.013 | 0.018 | 0.028 | 0.027 | 0.043 |
MAE | 0.002 | 0.015 | 0.008 | 0.010 | 0.022 | 0.022 | 0.002 |
MAPE (%) | 0.107 | 1.074 | 0.595 | 0.598 | 1.543 | 1.546 | 2.429 |
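For readers who wish to reproduce indices of the kind reported in the table, the sketch below computes R2, MSE, RMSE, MAE, and MAPE from paired observed and predicted values. It assumes the commonly used definitions (in particular, R2 as one minus the ratio of residual to total sum of squares); the paper's own formulas are not reproduced in this section and may differ, for example by defining R2 as a squared correlation coefficient. The function name and sample values are illustrative only.

```python
import math


def regression_metrics(observed, predicted):
    """Common goodness-of-fit indices for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "R2": 1.0 - sum(e * e for e in errors) / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE in percent; observed values must be nonzero
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
    }


# Illustrative values only, not data from the study.
metrics = regression_metrics([1.0, 2.0, 4.0], [1.1, 1.9, 4.2])
print(metrics)
```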
CONCLUSIONS
This study demonstrates the application of machine learning strategies, specifically artificial neural networks (ANN), gene expression programming (GEP), and support vector machines (SVM), to determine the water surface profile of a nonprismatic compound channel with converging floodplains. The proposed models are developed from 396 high-quality laboratory datasets with dimensionless geometric and flow parameters for nonprismatic compound channels with different converging angles (θ = 1.91° to 12.38°) and relative depths (β = 0.10 to 0.60). The main findings and inferences of this study are as follows.
The proposed model appears to be influenced by many parameters such as width ratio, relative flow depth, aspect ratio, converging angle, relative distance, longitudinal slope, energy slope, discharge ratio, Froude number, and Reynolds number. Flow depth rises as discharge increases up to bankfull depth, but beyond bankfull depth, a modest decrease in depth is seen at all converging angles owing to interaction and momentum transfer between the main channel and floodplains. Due to the convergence of the channel geometry, the flow depth decreases with the length of the channel, and the same tendency has been seen for greater relative depths and varied floodplain convergence angles. The nondimensional water surface profile is found to be increasing with width ratio, relative depth, discharge ratio and shows a decreasing trend with aspect ratio, energy slope, relative distance, Froude number, and Reynolds number. For all the converging angles, the same trend of variation is observed for the water surface profile in nonprismatic compound channels. The link between the nondimensional water surface profile and the nondimensional geometric and hydraulic variables of a converging compound channel is examined. It is found that there is a nonlinear relationship between all of the parameters.
Compared with previous methods, such as Naik & Khatua (2016) and Naik et al. (2022), the developed models show better results in terms of R2, MAE, RMSE, and MAPE for various datasets. The findings demonstrate that, according to the assessment criteria, all techniques (ANN, GEP, and SVM) could reasonably predict the water surface profile in nonprismatic compound channels. The ANN model performed best, with the highest R2 (0.999) and the lowest RMSE (0.003), MAE (0.002), and MAPE (0.107%). In addition, a novel equation was developed using the GEP method for estimating the water surface profile of nonprismatic compound channels, as the GEP method also showed good statistical performance. The models' limitation is that they can only be used to forecast the water surface profile of a compound channel with converging floodplains of uniform roughness. Future studies should focus on estimating the water surface profile of nonprismatic compound channels with rough floodplains and on new techniques.
ACKNOWLEDGEMENTS
The authors acknowledge the support from the Department of Civil Engineering, Delhi Technological University, Delhi, India.
FUNDING
Not applicable.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.