Accurate prediction of the water surface profile in an open channel is key to solving numerous critical engineering problems. The goal of the current research is to predict the water surface profile of a compound channel with converging floodplains using machine learning approaches, including gene expression programming (GEP), artificial neural networks (ANNs), and support vector machines (SVMs), in terms of both geometric and flow variables, as past studies focused mainly on geometric variables. A novel equation is also proposed using gene expression programming to predict the water surface profile. To evaluate the performance and efficacy of these models, statistical indices are used to validate them against experimental data. The findings demonstrate that the suggested ANN model predicted the water surface profile most accurately, with a coefficient of determination (R2) of 0.999, a root mean square error (RMSE) of 0.003, and a mean absolute percentage error (MAPE) of 0.107%, when compared with GEP, SVM, and previously developed methods. The study confirms the applicability of machine learning approaches in river hydraulics; its novelty lies in forecasting the water surface profile of nonprismatic compound channels with a new equation derived by gene expression programming.

  • The present study predicted the water surface profile in nonprismatic compound channels with the help of various machine learning approaches.

  • The water surface profile is found to be affected by many nondimensional geometric and flow variables.

  • The findings depict that the water surface profile predicted using ANN is in good agreement with the observed water surface profile.

Graphical Abstract


The following symbols are used in this paper:

B = total width of compound channel;
b = width of the main channel;
h = height of the main channel;
H = flow depth;
α = width ratio (B/b);
β = relative flow depth [(H – h)/H];
δ = aspect ratio (b/h);
θ = converging angle;
Xr = relative distance (x/L);
L = converging length;
x = distance between two consecutive sections;
So = longitudinal bed slope;
Se = energy slope;
Qr = discharge ratio (Q/Qb);
Q = discharge at any depth;
Qb = bankfull discharge;
Fr = Froude number;
Re = Reynolds number;
Ψ = nondimensional water surface profile (H/h).

Increased human settlement, construction, and activity along river floodplains, driven by global population growth, have made the consequences of natural river floods more severe. River floods cause massive human casualties as well as economic damage. Flood catastrophes account for a third of all natural disaster damages worldwide and for half of all fatalities, and trend analysis reveals that these percentages have grown dramatically (Berz 2000). Flooding occurs when the water running through a channel exceeds the waterway's capacity, so flood protection requires precise prediction of the conveyance capacity of natural streams. Consequently, the need for precise prediction of flow parameters during flood conditions, to limit damage and save lives and property, has attracted the interest of academics and engineers in recent years. Various methodologies and procedures have been used to aid precise measurement and forecasting of river discharge, velocity distribution, shear stress distribution, and water surface level during overbank flows.

Compound channels are the most common river feature during overbank flow. Along the course of a river, the geometry of the floodplain changes, resulting in a compound channel that is either converging or diverging. It is more challenging to replicate flow in a nonprismatic compound channel because more momentum is carried from the main channel to the floodplains. Sellin (1964), Myers & Elsawy (1975), Knight et al. (2010), and Khatua et al. (2012) have explored the flow in straight and meandering prismatic two-stage channels, but little is known about nonprismatic compound channels. A converging channel shape causes the flow on floodplains to rise, while the flow on diverging floodplains is reduced (James & Brown 1977). Bousmar & Zech (2002), Bousmar et al. (2004), Rezaei (2006), and Rezaei & Knight (2009) studied compound channels with symmetrically narrowing floodplains and found additional head loss and transfer of momentum from the main channel to the floodplains. Proust et al. (2006) examined an asymmetric geometry with a greater convergence rate and found that a larger convergence angle (22°) results in increased mass transfer and head loss. Chlebek et al. (2010) studied the flow behavior of skewed, converging, and diverging two-stage channels. They observed increased head losses due to mass and momentum transfer, homogenization of the velocity on contracting floodplains, and an increased velocity gradient on expanding floodplains. Due to changes in the flow force from one subsection to another between the main channel and floodplains, there are noticeable variations in the flow distribution. In compound channels with nonprismatic floodplains, Rezaei & Knight (2011) investigated depth-averaged velocity, local velocity distributions, and boundary shear stress distributions at various convergence angles. The depth-averaged velocities show how contractions affect velocity distributions: velocity increases near the main channel walls, most notably in the second half of the convergence reach, and, at high relative depth, lateral flow enters the main channel. Yonesi et al. (2013) investigated the impact of floodplain roughness on overbank flow in compound channels with nonprismatic floodplains. The velocity gradient between the main channel and the floodplain in the middle and at the end of the divergence reach is lowered by raising the depth ratio or lowering the roughness ratio. The velocity gradient increased as the angle of divergence increased, and the shear stress gradient rises as the surface roughness of the floodplain increases.

Naik & Khatua (2016) developed a multivariate regression model to predict the water surface profile for different compound channels with nonprismatic floodplains using nondimensional geometric factors. The model agrees closely with both the experimental evidence and the findings of other researchers. In nonprismatic compound channels, the effect of flow parameters on the water surface profile and the variation of flow characteristics at higher flow rates have received much less attention. Consequently, additional experiments were carried out at higher flow rates on a nonprismatic compound channel with converging floodplains to simulate the water surface profile.

It is complicated to evaluate the connections between the components that rely on other factors and those that are independent by developing a water surface profile model using mathematical, analytical, or numerical methods. These models end up being quite cumbersome and time-consuming as they progress. Not only is the amount of time spent conducting experiments cut in half as a result, but also the amount of time spent doing labor-intensive computations is reduced. Due to the increasing reliance on machine learning algorithms to estimate flow in compound channels, these channels are increasingly calculated using support vector machines (SVM), gene expression programming (GEP), artificial neural networks (ANN), fuzzy neural networks (ANFIS), and the M5 tree decision models (Seckin 2004; Unal et al. 2010; Sahu et al. 2011; Zahiri & Azamathulla 2014; Najafzadeh & Zahiri 2015; Parsaie et al. 2017). The GEP's capacity to generate mathematical correlations distinguishes it from other soft computing approaches such as ANN and SVM (Cousin & Savic 1997; Drecourt 1999; Savic et al. 1999; Whigham & Crapper 1999, 2001; Babovic & Keijzer 2002; Karimi et al. 2015). However, river engineering using the GEP method has received far less attention (Harris et al. 2003; Giustolisi 2004; Guven & Gunal 2008; Guven & Aytek 2009; Azamathulla et al. 2013; Pradhan & Khatua 2017b). Parsaie et al. (2015) use the support vector machine (SVM) technique to predict the discharge in the compound open channel. The analysis of the error indices demonstrate that the SVM has the highest level of accuracy. Khuntia et al. (2018) developed an artificial neural network (ANN) model for predicting boundary shear stress distribution in straight compound channels using the most influential parameters such as width ratio, relative flow depth, aspect ratio, Reynolds number, and Froude number. Back-propagation neural network (BPNN) models performed well over global ranges of independent parameter values. 
Das et al. (2019) developed an equation for estimating discharge in diverging and converging compound channels, which supports the use of GEP. Mohanta & Patra (2021) developed a model equation for calculating discharge in meandering compound channels, validating the use of GEP over the classic channel division technique. For meandering compound channels with relative roughness, Mohanta et al. (2021) used several AI methods, including multivariate adaptive regression splines (MARS), the group method of data handling neural network (GMDH-NN), and gene expression programming (GEP), to develop model equations. The results show that the suggested GMDH-NN model predicted the values more accurately than GEP and MARS. To simulate the flow of the Hablehroud River in north-central Iran, Esmaeili-Gisavandani et al. (2021) studied five hydrological models, including the soil and water assessment tool (SWAT), identification of unit hydrograph and component flows from rainfall, evapotranspiration, and streamflow (IHACRES), Hydrologiska Byråns Vattenbalansavdelning (HBV), and the Australian water balance model (AWBM). The calibration-phase results for SWAT, IHACRES, and HBV were good; however, only the SWAT model performed well throughout the validation phase, outperforming the other models. The GEP combination approach can combine the results of other, less accurate models to give a better indication of river flow. To forecast the snow depth (SD) one day in advance at the North Fork Jocko snow telemetry (SNOTEL) station in the city of Missoula, Montana, United States, Adib et al. (2021a) proposed a novel algorithm that combines various wavelet transform (WT) approaches, including discrete wavelet transform (DWT), maximal overlap discrete wavelet transform (MODWT), and multiresolution-based MODWT (MODWT-MRA). The findings validated the superiority of wavelet-based models over standalone ones.
This shows that the innovative wavelet-based model merits further investigation for its potential to deliver useful information across snow-covered areas. Adib et al. (2021b) demonstrated the use of wavelet transforms, such as the discrete wavelet transform, maximal overlap discrete wavelet transform (MODWT), multiresolution-based MODWT (MODWT-MRA), and wavelet packet transform (WP), in combination with artificial intelligence (AI)-based models, such as multilayer perceptrons, radial basis functions, adaptive neuro-fuzzy inference systems (ANFIS), and gene expression programming, to retrieve snow depth (SD) from the National Snow and Ice Data Center. According to the findings, WP combined with ANFIS (WP-ANFIS) performed better than the other analyzed models in terms of statistical analysis. Mohseni & Naseri (2022) used ANN and SVM to estimate the water surface profile in compound channels with vegetated floodplains. According to the results, the SVM algorithm performed better than the ANN and regression models, and sensitivity analysis showed that the water surface profile depended mainly on relative discharge and relative depth. Naik et al. (2022) proposed a novel equation using GEP to predict the water surface profile in converging compound channels using geometric variables.

In past studies, the water surface profile of compound channels with converging floodplains was predicted in terms of geometric variables using the GEP technique only. Therefore, this research has been conducted to predict the water surface profile of compound channels with converging floodplains in terms of both geometric and flow parameters using machine learning techniques such as GEP, ANN, and SVM. Comparisons among these techniques and approaches from other researchers are made with the help of statistical analysis to evaluate the effectiveness of the developed models for predicting the water surface profile of nonprismatic compound channels.

Data source

Numerous studies have investigated the shear force distribution, momentum transfer, and discharge methods of the flow in nonprismatic compound channels. Rezaei (2006), Naik & Khatua (2016), and the author's experimental data are used in the current work. Table 1 provides information on the experimental channel dimensions of these compound channels with various geometric properties. Figure 1 describes the flow chart of the methodology applied in the current study.
Table 1

Details of experimental channel dimensions used in the present study

| Verified Test Channel | Type of Channel | Angle of Convergence (θ) | Longitudinal Slope (S) | Cross-sectional Geometry | Total Channel Width (B) (m) | Main Channel Width (b) (m) | Main Channel Depth (h) (m) | Converging Length (Xr) (m) | Aspect ratio (δ) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Channel 1, Rezaei (2006) | Converging | 11.31° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | | 7.96 |
| Channel 2, Rezaei (2006) | Converging | 3.81° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | | 7.96 |
| Channel 3, Rezaei (2006) | Converging | 1.91° | 0.002 | Rectangular | 1.2 | 0.398 | 0.05 | | 7.96 |
| Channel 1, Naik & Khatua (2016) | Converging | 5° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 2.28 | |
| Channel 2, Naik & Khatua (2016) | Converging | 9° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 1.26 | |
| Channel 3, Naik & Khatua (2016) | Converging | 12.38° | 0.0011 | Rectangular | 0.9 | 0.5 | 0.1 | 0.84 | |
| Present Channel | Converging | 4° | 0.001 | Rectangular | 1.0 | 0.5 | 0.25 | 3.60 | |
Figure 1

Flow of methodology.


Experimental setup

The experiments were performed at the Hydraulics Laboratory of the Department of Civil Engineering, Delhi Technological University. All experiments were conducted in a masonry flume 12 m long, 1.0 m wide, and 0.8 m deep. In this flume, a compound cross-section was constructed in brick masonry with a main channel 0.5 m wide and 0.25 m deep; its geometric characteristics are shown in Figure 2. The converging segment of the channel was also built in brick masonry, with a converging angle of θ = 4°. The compound channel has a prismatic section 6 m long, followed by a nonprismatic section 3.6 m long, and the remainder is the downstream portion. The flume was run at six different flow rates. For each discharge, flow characteristics such as the stage-discharge relationship, water surface profile, velocity distribution, and shear stress distribution were measured in the prismatic section (PS) and at several nonprismatic sections (NPS), as shown in Figure 3. NPS1 and NPS5 are the sections at the start and end of the converging portion, and NPS3 is at its middle. NPS2 lies midway between NPS1 and NPS3, and NPS4 lies midway between NPS3 and NPS5.
Figure 2

Cross-section of a two-stage channel.

Figure 3

Experimental setup.

The subcritical flow regime was attained under several conditions of the two-stage channel with a longitudinal bed slope of 0.001. Manning's n was estimated from data collected for in-bank and overbank flows in the main channel and floodplains. The water supply is drawn from an underground sump to an overhead tank that feeds the experimental channel. Water leaving the channel is collected in a volumetric tank fitted with a V-notch, which was calibrated to measure the discharge from the experimental channel, and then flows back to the underground sump. Figure 3 shows the experimental setup and the equipment used in the research. Figure 4 depicts a plan view of the nonprismatic sections of Rezaei (2006), Naik & Khatua (2016), and the present channel. At the downstream end of the flume, a tailgate was installed to control the water surface profile and impose a specific flow depth in the flume. A point gauge with 0.1 mm precision was used to measure the water surface profile at intervals of 1.0 m and 0.3 m in the prismatic and nonprismatic portions, respectively. An acoustic Doppler velocimeter (ADV) was used to measure the average velocity of the cross-section and the three-dimensional velocity distributions along the wetted perimeter at vertical and horizontal intervals of 2.5 cm and 10 cm, respectively, as shown in grid form in Figure 2. The ADV data were filtered using the Horizon ADV software. The lateral distributions of boundary shear stress were measured using a Preston tube of 5 mm outer diameter at the same sections where the velocity distributions were measured. A digital manometer was used to measure the pressure difference, and Patel's (1965) calibration equations were then used to calculate the shear stress values.
Figure 4

Nonprismatic sections of (a) Rezaei (2006) (b) Naik & Khatua (2016) (c) Present channel.


Theoretical background

River engineers need to make an accurate forecast of the water surface profile in overbank flows in order to successfully build drainage canals, flood defense systems, river training works, floodplain management, and other similar projects. It may be challenging to collect field data that is both sufficiently exact and comprehensive in natural rivers, especially when flood flow conditions are variable. Experiments conducted in a laboratory are essential in order to improve one's comprehension related to the water surface profile of compound channels that include both prismatic and nonprismatic floodplains. For the purpose of predicting the water surface profile of a converging compound channel, Naik & Khatua (2016) used multivariate analysis to suggest Equations (1)–(21), while Naik et al. (2022) used the GEP technique to offer Equation (22).
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
(13)
(14)
(15)
(16)
(17)
(18)
(19)
(20)
(21)
(22)

An attempt was made to predict the water surface profile of compound channels with different converging floodplains. The flow is found to be non-uniform in the nonprismatic section until it reaches the prismatic zone. These channels have a uniformly smooth surface on their main channel and floodplains, and Manning's n for all of these smooth surfaces is found to be 0.013. The main influencing factors, namely width ratio (α), relative flow depth (β), aspect ratio (δ), converging angle (θ), relative distance (Xr), longitudinal bed slope (So), energy slope (Se), discharge ratio (Qr), Froude number (Fr), and Reynolds number (Re), are taken into consideration while estimating the water surface profile of nonprismatic compound channels. Using this range of parameters enables the model equation to be applied to diverse compound channels.

The necessary dimensionless equation may be expressed as follows:
$$\Psi = f\left(\alpha, \beta, \delta, \theta, X_r, S_o, S_e, Q_r, F_r, R_e\right) \qquad (23)$$
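As an illustration of how some of the dimensionless inputs could be assembled from measured quantities, the following minimal Python sketch applies the ratio definitions from the symbol list (α = B/b, β = (H – h)/H, δ = b/h, Xr = x/L, Qr = Q/Qb, Ψ = H/h). The class, function, and key names are hypothetical, the sample flow values are made up, and the remaining inputs (θ, So, Se, Fr, Re) are assumed to be supplied separately.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Raw measurements at one channel section (hypothetical container, SI units)."""
    B: float   # total width of compound channel (m)
    b: float   # width of the main channel (m)
    h: float   # height of the main channel (m)
    H: float   # flow depth (m)
    x: float   # distance x used in Xr = x/L (m)
    L: float   # converging length (m)
    Q: float   # discharge (m^3/s)
    Qb: float  # bankfull discharge (m^3/s)


def dimensionless_ratios(s: Section) -> dict:
    """Return the geometric and flow ratios defined in the symbol list."""
    if s.H <= s.h:
        raise ValueError("overbank flow requires H > h")
    return {
        "alpha": s.B / s.b,          # width ratio
        "beta": (s.H - s.h) / s.H,   # relative flow depth
        "delta": s.b / s.h,          # aspect ratio
        "Xr": s.x / s.L,             # relative distance
        "Qr": s.Q / s.Qb,            # discharge ratio
        "Psi": s.H / s.h,            # nondimensional water surface profile
    }


# Geometry of the present channel (Table 1); H, x, Q and Qb are made-up values.
example = Section(B=1.0, b=0.5, h=0.25, H=0.30, x=1.8, L=3.6, Q=0.06, Qb=0.04)
ratios = dimensionless_ratios(example)
```

Keeping the ratios in one function makes it easy to apply the same definitions to channels of different size, which is the reason the model is expressed in dimensionless form.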

GEP model

Ferreira first proposed the GEP approach in 1999. It brings together genetic programming (GP) and genetic algorithms (GA): simple, linear chromosomes of fixed length are combined with branched structures of varying size and shape (similar to the parse trees of genetic programming). Because all branched structures, despite their varying sizes and shapes, are encoded in linear chromosomes of fixed length, genotype and phenotype can be distinguished in this methodology. GEP is a multigenic coding scheme that allows increasingly complex equations to be evolved by dividing them into several sub-equations. The process involves generating populations of genes, selecting genes according to fitness, and introducing genetic diversity through one or more genetic operators. In this study, an equation for the nondimensional water surface profile parameter (Ψ) of a nonprismatic compound channel is developed using GEP with the assistance of GeneXproTools 5.0 (2014). Model development is guided by the suitability of the datasets used for training and testing, and the selected models are reproduced using one or more genetic operators, such as mutation or recombination. A concise conceptual outline of GEP is given in previous studies (Mallick et al. 2020). Equation (23) expresses how the water surface profile of nonprismatic compound channels varies as a function of the geometric and hydraulic factors. In this study, the modeling procedure uses Ψ as the target value and the ten independent factors of Equation (23) (α, β, δ, θ, Xr, So, Se, Qr, Fr, Re) as input variables. The model structure is built with the four basic arithmetic operators (+, −, ×, /). In all, 396 data sets are used, divided at random between the two stages of the modeling process.
In the present investigation, 70% of the data are used for training and the remaining 30% for testing. The root mean squared error (RMSE) served as the fitness function (Ei), and the fitness (fi) was determined using Equation (24), which defines the total sum of all errors relative to the target value. The initial model was developed using only one gene and two head sizes as its starting point. The number of genes and the head size were then gradually increased during each run, and the results for the training and testing datasets were recorded after each iteration. Performance on the training and testing data did not improve significantly for head lengths greater than 12 or for more than six genes.

Consequently, a head length of 12 was selected for the GEP model, and six genes were assigned to each chromosome. Addition was used as the linking function to connect the genes. After 15,000 generations, the fitness function value and the coefficient of determination for the training and testing data no longer changed, suggesting that the evolution had converged. Table 2 summarizes the main parameters that affect the success of GEP modeling and that were used to construct the model for predicting the water surface profile of compound channels with converging floodplains. Figure 5 represents the GEP model for the water surface profile as an expression tree (ET). In this representation, the input variables are denoted by d0 to d9, and the constant for gene two is denoted by G2c0. Decoding this expression tree yields an algebraic equation (Equation (24)) that links the output variable to the input variables.
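The random 70/30 partition of the 396 data sets and the RMSE error measure mentioned in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the records are stand-in indices rather than the experimental data, the seed and function names are arbitrary, and the sketch does not reproduce the partition actually used in the study.

```python
import math
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and split it into training and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


# 396 stand-in record indices (placeholders for the experimental data sets).
train, test = split_train_test(range(396))
```

With 396 records and integer truncation of 0.7 × 396, this gives 277 training records and 119 testing records.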
Table 2

The GEP model's selection criteria and parameters

| Description of Parameter | Parameter Setting |
| --- | --- |
| Function Set | +, −, ×, / |
| Number of Chromosomes | 30 |
| Head Size | 12 |
| Number of Genes | 6 |
| Gene Size | 38 |
| Linking Function | Addition |
| Fitness Function | RMSE |
| Program Size | 80 |
| Literals | 37 |
| Number of Generations | 15,000 |
| Constants per Gene | 10 |
| Data Type | Floating-point |
| Mutation | 0.00138 |
| Inversion | 0.00546 |
| Gene Recombination Rate | 0.00277 |
| Insertion Sequence (IS) Transposition Rate | 0.00546 |
| Root Insertion Sequence (RIS) Transposition Rate | 0.00546 |
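As a consistency check on the gene size listed in Table 2, the gene length can be related to the head size. The first relation below is the usual GEP tail-length rule; the presence of a random-constant domain (Dc) with the same length as the tail is an assumption made here, since the text does not state it. The symbols h_head, t, and t_Dc are introduced only for this illustration, with head length h_head = 12 and maximum function arity n_max = 2 (all four arithmetic operators take two arguments):

```latex
t = h_{\text{head}}\,(n_{\max} - 1) + 1 = 12\,(2 - 1) + 1 = 13,
\qquad
\text{gene size} = h_{\text{head}} + t + t_{Dc} = 12 + 13 + 13 = 38 .
```

Under that assumption, the result agrees with the gene size of 38 reported in Table 2.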
Figure 5

GEP Expression Tree.

In terms of the analytical form, the GEP is modeled according to the following:
(24)

ANN model

An artificial neural network (ANN) is a powerful computing approach that is developing rapidly. An ANN is a network of simple processing elements, known as neurons or nodes, arranged in layers. Figure 6 represents the architecture of an ANN, here a feedforward back-propagation (BP) neural network with three layers consisting of I input neurons, m hidden neurons, and n output neurons. The data enter the network via the input layer. The hidden layer receives the data from the input layer and is responsible for all the data processing, and the output layer receives the processed data and passes the output to external receptors. In an ANN, adjacent layers are connected by interconnections, each of which carries a weight. To minimize a specified cost function, the weights of the interconnections are adjusted (Khuntia et al. 2018).
Figure 6

ANN architecture.

In this study, an ANN method was used to predict the water surface profile of a compound channel with converging floodplains. For prediction purposes, a feedforward network is built on MATLAB R (2019) using a backpropagation training technique. Figure 7 depicts the simulation process that the ANN goes through when it is operating inside the network. The network must first be trained before it can be used for any problem. During this phase, the weights and biases of each output neuron are adjusted in accordance with the training procedure so that the goal output of each output neuron is constrained. Training in ANNs is composed of three different components: weights between neurons, which characterize the relative significance of the sources of input, a sigmoid transfer function, which controls the stage of the output from a neuron and an arrangement of learning laws, which depicts how the weights are changed throughout the training process. During training, a nonlinear function, which is most often a sigmoid function, is used (Govindaraju 2000a).
$$f(a) = \frac{1}{1 + e^{-a}} \qquad (25)$$
where a = sum of weighted input value plus bias. The result is sent to the subsequent layer of nodes to be processed. There are four stages in feedforward back-propagation neural network (BPNN) approaches. These processes are as follows:
  • 1.
    Sum the weighted input
    $$Nod_z = \sum_{x=1}^{n} W_{xz}\,k_x + \epsilon_z \qquad (26)$$
    where Nodz = sum for the zth hidden node, n = total number of input nodes, Wxz = connection weight between the xth input and the zth hidden node, kx = normalized input at the xth input node, and εz = bias value at the zth hidden node.
  • 2.
    Transform the weighted input
    $$Out_z = f(Nod_z) = \frac{1}{1 + e^{-Nod_z}} \qquad (27)$$
    where Outz = output from the zth hidden node.
  • 3.
    Sum the hidden node output
    $$Nod_y = \sum_{z=1}^{m} W_{zy}\,Out_z + \epsilon_y \qquad (28)$$
    where Nody = sum of yth output node, m = total number of hidden nodes, Wzy = connection weights between the zth hidden node and the yth output node, and εy = bias value at the yth output node.
  • 4.
    Transform the weighted sum
    $$Out_y = f(Nod_y) = \frac{1}{1 + e^{-Nod_y}} \qquad (29)$$
    where Outy = output at the yth output node.
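The four steps above can be sketched as a single forward pass in plain Python. This is a minimal illustration, not the MATLAB network used in the study: the logistic sigmoid is assumed as the transfer function in both layers, and the weights, biases, and function names are placeholders.

```python
import math


def sigmoid(a):
    """Logistic transfer function applied to a weighted sum a."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(inputs, w_hidden, b_hidden, w_output, b_output):
    """One forward pass through a single-hidden-layer feedforward network.

    w_hidden[x][z] is the weight from input x to hidden node z;
    w_output[z][y] is the weight from hidden node z to output node y.
    """
    n_hidden = len(b_hidden)
    n_output = len(b_output)
    # Steps 1-2: weighted sum at each hidden node, then transform.
    hidden_out = []
    for z in range(n_hidden):
        nod_z = sum(w_hidden[x][z] * k for x, k in enumerate(inputs)) + b_hidden[z]
        hidden_out.append(sigmoid(nod_z))
    # Steps 3-4: weighted sum at each output node, then transform.
    outputs = []
    for y in range(n_output):
        nod_y = sum(w_output[z][y] * hidden_out[z] for z in range(n_hidden)) + b_output[y]
        outputs.append(sigmoid(nod_y))
    return outputs


# 10-10-1 layout as described in the text, with all-zero placeholder weights.
n_in, n_hid, n_out = 10, 10, 1
w_h = [[0.0] * n_hid for _ in range(n_in)]
w_o = [[0.0] * n_out for _ in range(n_hid)]
result = forward([0.5] * n_in, w_h, [0.0] * n_hid, w_o, [0.0] * n_out)
```

Training would then adjust the weights and biases by back-propagating the output error, which is outside the scope of this sketch.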

To construct the neural network, ten neurons were used in the input layer, ten neurons in the hidden layer, and one neuron in the output layer. Figure 8 shows neural network parameters such as the gradient, mean, and validation checks, along with how they change during the training stage. As the number of epochs increases, the gradient and mean decrease and the number of validation checks increases, leading to convergence of the model. Figure 9 depicts the error monitored in the training, testing, and validation phases; the training error fell quickly at the beginning and then decreased progressively more slowly. The network converged after 43 epochs, where the error reaches its minimum value.

Figure 7

Simulation process of ANN model.

Figure 8

Training parameters during ANN modeling.

Figure 9

Convergence curve for the training of BPNN.


SVM model

Support vector machines (SVMs) are a group of related supervised learning algorithms that can be used for classification and regression. A non-linear classifier offers superior accuracy in many different types of applications. In support vector regression (SVR), the input x is first mapped into an m-dimensional feature space using a fixed (nonlinear) mapping, and a linear model is then constructed in this feature space (Parsaie et al. 2015). The naive way of making a non-linear classifier out of a linear classifier is to map the data from the input space X to a feature space F using a non-linear function φ : X → F. In the space F, the discriminant function is:
(30)
The linear model, denoted by the mathematical notation f (x, w), may be represented in the feature space as follows:
(31)
(32)
(33)
In the feature space, F, this expression takes the form:
(34)
(35)
(36)
Because a large number of kernel functions are available for SVM, choosing an effective kernel function is a research challenge in itself. Nevertheless, several kernel functions are in general use.
where C, γ, r, and d are kernel parameters. It is common that the SVM generalization performance (estimation accuracy) is directly proportional to the quality of the settings for the meta-parameters C, γ, and r, as well as the kernel parameters. The complexity of the prediction (regression) model is controlled by the values chosen for C, γ, and r. The issue of optimum parameter selection is even more difficult because the complexity of the SVM model (and, therefore, its generalization performance) is dependent on all three parameters. In order to carry out the classification, kernel functions are called upon to modify the dimensionality of the input space. Data sets are used to develop the SVM model, which is comparable to the development of other neural network models. To accomplish this goal, a total of 396 data sets about the water surface profile of converging compound channels were used. In order to construct the SVM model, the acquired data is split into two categories in MATLAB R (2019) software: the training data and the testing data. Equation (23) demonstrates the primary factors that affect the profile of the water surface.
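The kernel expressions themselves are not reproduced in this section. As a hedged illustration, the Python sketch below implements four kernels that are commonly used with SVMs (linear, polynomial, radial basis function, and sigmoid), written in terms of γ, r, and d. Whether these are exactly the kernels considered in the study is an assumption, and the function names and default parameter values are placeholders. C does not appear in these expressions; in the usual SVM formulation it is the regularization constant of the optimization problem.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=3):
    """K(u, v) = (gamma * u . v + r) ** d"""
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    """K(u, v) = exp(-gamma * ||u - v||^2)"""
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(u, v, gamma=1.0, r=0.0):
    """K(u, v) = tanh(gamma * u . v + r)"""
    return math.tanh(gamma * dot(u, v) + r)


# Made-up two-dimensional points used only to exercise the functions.
u, v = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_poly = polynomial_kernel(u, v, gamma=1.0, r=1.0, d=2)
k_rbf_same = rbf_kernel(u, u, gamma=0.5)
```

Each kernel replaces an explicit mapping φ with an inner product computed directly in the input space, which is why the choice of kernel and its parameters controls the complexity of the fitted model.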

Statistical measures

To further test the accuracy of the models developed by the GEP, ANN, and SVM approaches, several error measures, namely the coefficient of determination (R2), mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean squared error (RMSE), are computed using the following equations (Naik et al. 2022).
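$$R^2 = \left[\frac{\sum_{i=1}^{N}(a_i-\bar{a})(p_i-\bar{p})}{\sqrt{\sum_{i=1}^{N}(a_i-\bar{a})^2\,\sum_{i=1}^{N}(p_i-\bar{p})^2}}\right]^2 \qquad (37)$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|a_i-p_i\right| \qquad (38)$$
$$\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{a_i-p_i}{a_i}\right| \qquad (39)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(a_i-p_i)^2} \qquad (40)$$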
where a and p are the actual and predicted values, respectively, ā and p̄ are the means of the actual and predicted values, respectively, and N is the number of datasets.
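The four indices can be sketched in code as follows, assuming their standard definitions. In particular, R2 is taken here as the squared correlation between actual and predicted values, which is an assumption suggested by the use of both means (ā and p̄) in the definitions; the function name and the sample values are hypothetical.

```python
import math


def error_metrics(actual, predicted):
    """Return R2, MAE, MAPE (%) and RMSE for paired actual/predicted values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    a_mean = sum(actual) / n
    p_mean = sum(predicted) / n

    # Squared correlation form of R2 (assumed definition).
    cov = sum((a - a_mean) * (p - p_mean) for a, p in zip(actual, predicted))
    ss_a = sum((a - a_mean) ** 2 for a in actual)
    ss_p = sum((p - p_mean) ** 2 for p in predicted)
    r2 = cov ** 2 / (ss_a * ss_p)

    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    # MAPE is undefined when an actual value is zero.
    mape = 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

    return {"R2": r2, "MAE": mae, "MAPE": mape, "RMSE": rmse}


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.1, 1.9, 3.2, 3.8]
scores = error_metrics(observed, estimated)
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE for the same data, so the two together give a rough sense of how unevenly the errors are distributed.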
Figure 10 shows the stage-discharge relationship for the prismatic and nonprismatic sections of the nonprismatic compound channel with a converging angle of θ = 4°. The flow depth rises as the discharge increases. However, the rate of increase falls slightly beyond bankfull depth because of interaction and additional momentum transfer between the main channel and floodplains. Owing to the convergence of the channel geometry, the flow depth for a given discharge reduces through the converging portion from section 1 to section 5. For in-bank and overbank flow in both the prismatic and nonprismatic parts, the best-fitted trend lines are polynomial functions with high R2 values. Figure 11 depicts the water surface profile of the nonprismatic compound channel along the longitudinal distance for different relative flow depths. In the prismatic part of the flume, the water surface profile stays the same; in the converging portion, the water level decreases because of flow acceleration (especially in the second half of the transition); and in the downstream part, the flow is nearly uniform with some fluctuations. Figure 12 shows the variation of the nondimensional water surface profile (Ψ) with nondimensional geometric and flow parameters, namely width ratio, relative depth, aspect ratio, relative distance, energy slope, discharge ratio, Froude number, and Reynolds number, for a converging angle of θ = 4° and relative depths ranging from β = 0.10 to 0.60. The water surface profile increases as the width ratio increases due to a rise in stage for a particular width ratio. For different converging angles, the water surface profile follows the same pattern of change with respect to the width ratio. The water surface profile increases non-linearly as the relative flow depth rises.
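The width ratio and relative flow depth referred to here follow the definitions in the notation list, α = B/b and β = (H − h)/H. A minimal sketch of how they might be computed is given below; the function names and the numerical values are hypothetical and do not come from the experiments.

```python
def width_ratio(total_width, main_channel_width):
    """alpha = B / b, with B the total width and b the main channel width."""
    if main_channel_width <= 0 or total_width < main_channel_width:
        raise ValueError("expected 0 < b <= B")
    return total_width / main_channel_width


def relative_depth(flow_depth, main_channel_height):
    """beta = (H - h) / H, defined only for overbank flow (H > h)."""
    if flow_depth <= main_channel_height:
        raise ValueError("relative depth is defined for overbank flow only (H > h)")
    return (flow_depth - main_channel_height) / flow_depth


# Hypothetical geometry in metres, for illustration only.
alpha = width_ratio(total_width=1.0, main_channel_width=0.5)
beta = relative_depth(flow_depth=0.125, main_channel_height=0.1)
```

With these illustrative values, α is 2 and β is about 0.2, which falls inside the β = 0.10 to 0.60 range considered in the study.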
Across the sections of the nonprismatic compound channel, the water surface profile shows a decreasing pattern of variation with aspect ratio from section to section. As the figure shows, the nondimensional water surface profile decreases as the relative distance increases. This indicates that a converging transition rapidly increases the velocity head while simultaneously lowering the potential head. The drop is steeper at the higher converging angles of the floodplain. The water surface profile decreases as the energy slope increases because of the loss of energy along the flow length. It increases linearly with the discharge ratio because the flow depth rises with the flow rate. The same pattern of variation is observed for all the relative depths; however, the decline is steeper for higher converging angles because of greater flow resistance from the channel convergence. The water surface elevation decreases as the Froude number and Reynolds number increase, with the same trend of variation for the different relative depths. The rise in velocity due to the convergence of the channel geometry leads to flow acceleration in the nonprismatic sections, causing the water surface profile to decline.
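The trade-off between velocity head and potential head in a converging reach can be illustrated with a deliberately simplified sketch. It assumes a single rectangular cross-section rather than a compound one, a constant discharge, and hypothetical dimensions, so it shows only the direction of the effect and is not the method used in this study.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def mean_velocity(discharge, width, depth):
    """Mean velocity V = Q / (width * depth) for a rectangular section."""
    if width <= 0 or depth <= 0:
        raise ValueError("width and depth must be positive")
    return discharge / (width * depth)


def velocity_head(discharge, width, depth):
    """Velocity head V^2 / (2g), in metres."""
    return mean_velocity(discharge, width, depth) ** 2 / (2.0 * G)


def froude_number(discharge, width, depth):
    """Froude number V / sqrt(g * depth) for a rectangular section."""
    return mean_velocity(discharge, width, depth) / math.sqrt(G * depth)


# Hypothetical values: the same discharge and depth, with the width halved.
q, depth = 0.05, 0.2
head_wide = velocity_head(q, width=1.0, depth=depth)
head_narrow = velocity_head(q, width=0.5, depth=depth)
```

Halving the width at the same depth doubles the mean velocity and quadruples the velocity head; with the total energy roughly conserved over a short transition, that gain has to come from the potential head, which is consistent with the falling water surface described above.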
Figure 10

Stage-discharge relationship for the nonprismatic compound channel (a) prismatic section (b) nonprismatic sections.

Figure 11

Water surface profile for nonprismatic compound channel for different relative flow depths.

Figure 12

Variation of nondimensional water surface profile with various parameters (a) width ratio (b) relative depth (c) aspect ratio (d) relative distance (e) energy slope (f) discharge ratio (g) Froude number (h) Reynolds number.

The scatter plots comparing the predicted and observed Ψ values for each of the ML techniques are shown in Figure 13. The closeness of the points to the line of good agreement is a strong indicator of the capacity of the generated GEP, ANN, and SVM models to make accurate predictions. Among the three models, the ANN-predicted values lie closest to the best-fitted line, whereas the GEP and SVM values are more scattered along it. The ANN model therefore agrees best with the experimental data, with high R2 values. Figure 14 compares the developed models for predicting the water surface profile with the previous methods of Naik et al. (2022) and Naik & Khatua (2016). The developed models are closer to the best-fitted line than the previous methods and have a strong potential for generalization. They do not exhibit any signs of overtraining. The effectiveness of the GEP, ANN, and SVM models was judged on several statistical parameters, including R2, MSE, RMSE, MAE, and MAPE, which are shown in Table 3. From the table, it can be observed that ANN (R2 = 0.999 and RMSE = 0.003) gives the best predictions of the water surface profile of nonprismatic compound channels compared with the other models, namely GEP (R2 = 0.998 and RMSE = 0.013 in training, R2 = 0.996 and RMSE = 0.018 in testing) and SVM (R2 = 0.990 and RMSE = 0.017), and with previously developed methodologies. The ANN model, with a MAPE of 0.107%, proves to be the most suitable of the methods for predicting the water surface profile in nonprismatic compound channels. Understanding the flow mechanism in prismatic and nonprismatic reaches of a river, where additional momentum transfer should also be accounted for in flow modeling, is important in designing flood control and diversion structures to reduce flood losses.
The models developed in the study can have a practical application to nonprismatic rivers such as the River Main in Northern Ireland, the Brahmaputra River in India, and other similar rivers. The findings of the study will be useful in the design of such structures and will thereby help reduce economic as well as human losses.
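The comparison reported in the text can be expressed as a small ranking sketch. The R2 and RMSE values are those quoted above (GEP uses its testing values); the ranking rule, lowest RMSE first with R2 as a tie-breaker, is an assumption made for illustration and is not stated in the study.

```python
# R2 and RMSE as quoted in the text; GEP entries are its testing values.
reported = {
    "ANN": {"R2": 0.999, "RMSE": 0.003},
    "SVM": {"R2": 0.990, "RMSE": 0.017},
    "GEP": {"R2": 0.996, "RMSE": 0.018},
}


def rank_models(scores):
    """Order model names by ascending RMSE, breaking ties by descending R2."""
    return sorted(scores, key=lambda name: (scores[name]["RMSE"], -scores[name]["R2"]))


ranking = rank_models(reported)
best_model = ranking[0]
```

Under this rule ANN ranks first. SVM and GEP swap places depending on whether RMSE or R2 is given priority, which is why several indices are reported together rather than a single one.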
Table 3

Error analysis of predicted Ψ by various approaches

| Statistical parameters | ANN Model | SVM Model | GEP Model (Training) | GEP Model (Testing) | Naik et al. Method (2022) (Training) | Naik et al. Method (2022) (Testing) | Naik & Khatua Method (2016) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| R2 | 0.999 | 0.990 | 0.998 | 0.996 | 0.99 | 0.99 | 0.896 |
| MSE | 0.0001 | 0.0003 | 0.0002 | 0.0004 | 0.0008 | 0.0007 | 0.0019 |
| RMSE | 0.003 | 0.017 | 0.013 | 0.018 | 0.028 | 0.027 | 0.043 |
| MAE | 0.002 | 0.015 | 0.008 | 0.010 | 0.022 | 0.022 | 0.002 |
| MAPE (%) | 0.107 | 1.074 | 0.595 | 0.598 | 1.543 | 1.546 | 2.429 |
Figure 13

Scatter plots of observed and predicted Ψ for various ML models (a) GEP (b) ANN (c) SVM.

Figure 14

Comparison of predicted value of Ψ for different models.


This study demonstrates the application of machine learning strategies, specifically artificial neural networks (ANN), gene expression programming (GEP), and support vector machines (SVM), to determine the water surface profile of a nonprismatic compound channel with converging floodplains. The proposed models are developed from 396 high-quality laboratory datasets with dimensionless geometric and flow parameters for nonprismatic compound channels with different converging angles (θ = 1.91° to 12.38°) and relative depths (β = 0.10 to 0.60). The main findings and inferences drawn from this study are as follows.

The proposed model is influenced by many parameters, such as width ratio, relative flow depth, aspect ratio, converging angle, relative distance, longitudinal slope, energy slope, discharge ratio, Froude number, and Reynolds number. Flow depth rises as discharge increases up to bankfull depth, but beyond bankfull depth, a modest decrease in the rate of rise is seen at all converging angles owing to interaction and momentum transfer between the main channel and floodplains. Due to the convergence of the channel geometry, the flow depth decreases along the length of the channel, and the same tendency is seen for greater relative depths and varied floodplain convergence angles. The nondimensional water surface profile increases with width ratio, relative depth, and discharge ratio and decreases with aspect ratio, energy slope, relative distance, Froude number, and Reynolds number. The same trend of variation in the water surface profile is observed for all the converging angles in nonprismatic compound channels. The link between the nondimensional water surface profile and the nondimensional geometric and hydraulic variables of a converging compound channel is examined. It is found that there is a nonlinear relationship between all of the parameters.

Compared with previous methods, such as Naik & Khatua (2016) and Naik et al. (2022), the developed models show better results in terms of R2, MAE, RMSE, and MAPE for various datasets. The findings demonstrate that, according to the assessment criteria, all techniques (ANN, GEP, and SVM) could reasonably predict the water surface profile in nonprismatic compound channels. The ANN model showed the best performance, with the highest R2 (0.999) and the lowest RMSE (0.003), MAE (0.002), and MAPE (0.107%). In addition, a novel equation was developed using the GEP method for estimating the water surface profile of nonprismatic compound channels, as the GEP method also showed good statistical performance. A limitation of the models is that they can only be used to forecast the water surface profile of a compound channel with a converging floodplain and uniform roughness. Future studies should focus on estimating the water surface profile of nonprismatic compound channels with rough floodplains and on new techniques.

The authors acknowledge the support from the Department of Civil Engineering, Delhi Technological University, Delhi, India.

Not applicable.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Adib A., Zaerpour A., Kisi O. & Lotfirad M. 2021b A rigorous wavelet-packet transform to retrieve snow depth from SSMIS data and evaluation of its reliability by uncertainty parameters. Water Resour. Manage. 35 (9), 2723–2740. https://doi.org/10.1007/s11269-021-02863-x.
Babovic V. & Keijzer M. 2002 Rainfall runoff modelling based on genetic programming. Hydrol. Res. 33 (5), 331–346.
Berz G. 2000 Flood disasters: lessons from the past - worries for the future. Proc. Inst. Civ. Eng. Water Marit. Energy 142 (1), 3–8. https://doi.org/10.1680/wame.2000.142.1.3.
Bousmar D. & Zech Y. 2002 Periodical turbulent structures in compound channels. In: River Flow International Conference on Fluvial Hydraulics, Louvain-la-Neuve, Belgium, pp. 177–185.
Bousmar D., Wilkin N., Jacquemart J. H. & Zech Y. 2004 Overbank flow in symmetrically narrowing floodplains. J. Hydraul. Eng. ASCE 130 (4), 305–312.
Chlebek J., Bousmar D., Knight D. W. & Sterling M. A. 2010 Comparison of overbank flow conditions in skewed and converging/diverging channels. In: River Flows International Conference, pp. 503–511.
Cousin N. & Savic D. A. 1997 A Rainfall-Runoff Model Using Genetic Programming. Centre for Systems and Control Engineering, Rep. No. 97, 3.
Das B. S., Devi K. & Khatua K. K. 2019 Prediction of discharge in converging and diverging compound channel by gene expression programming. J. Hydraul. Eng. doi:10.1080/09715010.2018.1558116.
Drecourt J. P. 1999 Application of neural networks and genetic programming to rainfall runoff modeling. Water Resour. Manage. 13 (3), 219–231.
Esmaeili-Gisavandani H., Lotfirad M., Sofla M. S. D. & Ashrafzadeh A. 2021 Improving the performance of rainfall-runoff models using the gene expression programming approach. J. Water Clim. Change 12 (7), 3308–3329. https://doi.org/10.2166/wcc.2021.064.
Gepsoft, GeneXproTools 5.0 2014 Data Modeling & Analysis Software. (n.d.). https://www.gepsoft.com/.
Govindaraju R. S. 2000a Artificial neural networks in hydrology. I: preliminary concepts. J. Hydrol. Eng. 5 (2), 115–123.
Harris E. L., Babovic V. & Falconer R. A. 2003 Velocity predictions in compound channels with vegetated floodplains using genetic programming. Int. J. River Basin Manage. 1 (2), 117–123.
James M. & Brown R. J. 1977 Geometric parameters that influence floodplain flow. In: U.S. Army Engineer Waterways Experimental Station, June, Vicksburg Miss. Research report H-77.
Karimi S., Shiri J., Kisi O. & Shiri A. A. 2015 Short-term and long-term streamflow prediction by using ‘wavelet-gene expression’ programming approach. ISH J. Hydraul. Eng. 22 (2), 148–162.
Khatua K. K., Patra K. C. & Mohanty P. K. 2012 Stage-discharge prediction for straight and smooth compound channels with wide floodplains. J. Hydraul. Eng. ASCE 138 (1), 93–99.
Khuntia J. R., Devi K. & Khatua K. K. 2018 Boundary shear stress distribution in straight compound channel flow using artificial neural network. J. Hydrol. Eng. 23 (5), 04018014. doi:10.1061/(asce)he.1943-5584.0001651.
Knight D. W., Tang X., Sterling M., Shiono K. & McGahey C. 2010 Solving open channel flow problems with a simple lateral distribution model. River Flow 1, 41–48.
MATLAB R 2019 [Computer Software]. MathWorks, Natick, MA.
Mohanta A. & Patra K. C. 2021 Gene-expression programming for calculating discharge in meandering compound channels. Sustainable Water Resour. Manage. 7, 33. https://doi.org/10.1007/s40899-021-00504-0.
Mohanta A., Pradhan A., Mallick M. & Patra K. C. 2021 Assessment of shear stress distribution in meandering compound channels with differential roughness through various artificial intelligence approach. Water Resour. Manage. 35 (13), 4535–4559. https://doi.org/10.1007/s11269-021-02966-5.
Mohseni M. & Naseri A. 2022 Water surface profile prediction in compound channels with vegetated floodplains. Proceedings of the Institution of Civil Engineers - Water Management, 1–12. https://doi.org/10.1680/jwama.21.00005.
Myers W. R. C. & Elsawy E. M. 1975 Boundary shears in channel with flood plain. J. Hydraul. Div. ASCE 101 (7), 933–946.
Naik B. & Khatua K. K. 2016 Water surface profile computation for compound channels with narrow flood plains. Arabian J. Sci. Eng. 42 (3), 941–955. doi:10.1007/s13369-016-2236-x.
Naik B., Kaushik V. & Kumar M. 2022 Water surface profile in converging compound channel using gene expression programming. Water Supply 22 (5), 5221–5236. https://doi.org/10.2166/ws.2022.172.
Parsaie A., Yonesi H. A. & Najafian S. 2015 Predictive modeling of discharge in compound open channel by support vector machine technique. Model. Earth Syst. Environ. 1 (1–2). doi:10.1007/s40808-015-0002-9.
Patel V. C. 1965 Calibration of the Preston tube and limitations on its use in pressure gradients. J. Fluid Mech. 231, 85–208.
Pradhan A. & Khatua K. K. 2017b Gene expression programming to predict Manning's n in meandering flows. Can. J. Civ. Eng. 45 (4), 304–313.
Proust S., Rivière N., Bousmar D., Paquier A. & Zech Y. 2006 Flow in the compound channel with abrupt floodplain contraction. J. Hydraul. Eng. 132 (9), 958–970.
Rezaei B. 2006 Overbank Flow in Compound Channels with Prismatic and non-Prismatic Floodplains. Ph.D. Thesis, University of Birmingham, Birmingham, UK.
Sahu M., Khatua K. K. & Mahapatra S. S. 2011 A neural network approach for prediction of discharge in straight compound open channel flow. Flow Meas. Instrum. 22 (5), 438–446.
Savic D. A., Walters G. A. & Davidson J. W. 1999 A genetic programming approach to rainfall-runoff modelling. Water Resour. Manage. 13 (3), 219–231.
Whigham P. A. & Crapper P. F. 1999 Time series modelling using genetic programming: an application to rainfall-runoff models. Adv. Genet. Program 3, 89–104.
Whigham P. A. & Crapper P. F. 2001 Modelling rainfall-runoff using genetic programming. Math. Comput. Modell. 33 (6–7), 707–721.
Yonesi H. A., Omid M. H. & Ayyoubzadeh S. A. 2013 The hydraulics of flow in nonprismatic compound channels. J. Civ. Eng. Urbanism 3 (6), 342–356. https://doi.org/10.1061/(ASCE)0733-9429(2000)126:4(299).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).