River discharge estimation is vital for effective flood management and infrastructure planning. River systems consist of a main channel and floodplains, collectively forming a compound channel, which poses challenges in discharge calculation, particularly when floodplains converge or diverge. In the present study, ML algorithms such as XGBoost, CatBoost and LightGBM were developed to predict discharge in a compound channel. The PSO algorithm is applied to optimise the hyperparameters of the gradient boosting models, and the resulting hybrids are denoted PSO-XGBoost, PSO-LightGBM and PSO-CatBoost. ML model discharge predictions were validated against existing empirical models, and feature importance was explored using SHAP and sensitivity analysis. Results show that all three gradient-boosting algorithms effectively predict discharge in compound channels and are further enhanced by application of the PSO algorithm. The R2 values for XGBoost, PSO-XGBoost, CatBoost and PSO-CatBoost exceed 0.95, whereas they are above 0.85 for LightGBM and PSO-LightGBM. PSO-CatBoost outperforms the other models based on the statistical performance parameters, uncertainty analysis, reliability index and resilience index for prediction of discharge in a compound channel with converging and diverging floodplains. The findings of this study validate the suitability of the proposed models, and the PSO-optimised variants in particular are recommended for predicting discharge in a compound channel.

  • A diverse range of datasets is used for discharge prediction in non-prismatic compound channels.

  • SHAP and sensitivity analyses are employed to quantify the effect of the input parameters that influence discharge prediction.

  • Six ML models have been proposed for the prediction of discharge.

  • Different models are recommended for range-wise variation of the width ratio and relative flow depth.

Qfb  rate of flow of floodplain

Q  total discharge in the compound channel

Qmc  main channel discharge

So  bed slope of the channel

n  Manning's coefficient

h  bank full depth of the main channel

B  width of the main channel

Fmc  main channel friction factor

Ffp  floodplain friction factor

Ff  relative friction factor

H  total flow depth over the main channel

AR  area ratio

Rr  relative hydraulic radius

XR  relative longitudinal distance

α  width ratio

β  relative flow depth

δ*  flow aspect ratio of the main channel

θ  angle of convergence and divergence

R2  coefficient of determination

L  length of the main channel

ANFIS  adaptive neuro-fuzzy inference systems

ANN  artificial neural networks

CatBoost  categorical boosting

DCM  Divided Channel Method

GBDT  Gradient Boosting Decision Trees

GEP  gene expression programming

LightGBM  light gradient boosting machine

MAPE  mean absolute percentage error

MARS  multivariate adaptive regression splines

ML  machine learning

PSO  particle swarm optimisation

RMSE  root mean square error

SHAP  SHapley Additive exPlanations

XGBoost  eXtreme Gradient Boosting

Throughout history, civilisations have established settlements close to rivers because of the advantageous combination of fertile arable land and convenient access to water for a range of human activities. In nature, most rivers take the form of compound channels as a result of natural geomorphological processes and human settlement. As settlements develop along riverbanks, the width of the river changes, narrowing in some reaches and widening in others. The result is a floodplain of varying shape, known as a non-prismatic floodplain; together with the main channel, it forms a non-prismatic compound channel. During a flood, the water level rises, overflows the riverbanks and inundates the adjacent floodplains; the prediction of river discharge is therefore crucial for flood defence management, water resource management, environmental protection, hydropower generation, infrastructure design and research.

During floods, estimating discharge becomes highly challenging, especially in the case of compound channel flow. Numerous researchers have endeavoured to develop empirical models for predicting discharge in prismatic compound channels. Among these approaches, the Single Channel Method (SCM) simplifies the entire section as uniform with a single roughness factor, often resulting in an overestimation of discharge. To address this limitation, researchers have explored various Divided Channel Methods (DCMs), which consider different interface planes originating from the junction regions between the main channel and floodplain. These methods, such as those proposed by Knight & Demetriou (1983), Myers (1987), Patra & Khatua (2006) and Khatua et al. (2012), aim to divide the section into subareas, thereby accounting for variations in apparent shear values. A novel approach, based on the Divided Channel Method (DCM), was introduced by Lambert & Myers (1998), wherein a correction weighting factor is applied to the computational velocity magnitude based on division subsections along vertical and horizontal lines. In the Weighted Divided Channel Method (WDCM), it is assumed that momentum exchange and apparent shear stress quantities are known on each imaginary boundary. This method calculates discharge by multiplying the mean velocity at each subsection by the corresponding area, determined from the vertical division line.

In quantifying momentum transfer, Khatua et al. (2012) emphasised the importance of including/excluding the interface length, which significantly affected discharge predictions for their channel. While the statistical analysis of flow data is occasionally used for river discharge forecasting, its practical application in real-world scenarios is limited. However, it can be employed to predict intended flow behaviour (Kilinc et al. 2000). Hydrodynamic and numerical models have started to replace conventional methods for modelling river floods, offering simulation capabilities. Nevertheless, these models are complex due to the multitude of factors requiring consideration in selecting the appropriate model for predicting river floods (Shrestha 2005). The Exchange Discharge Method (EDM), a 1D method for estimating discharge in compound channels, was introduced by Bousmar & Zech (1999), requiring two parameters: the turbulence exchange factor and the geometrical exchange correction factor. Abril & Knight (2004) predicted the stage–discharge relationship using a depth-averaged method and finite element analysis, necessitating the calibration of three parameters: mean velocity, transverse eddy viscosity and local flow resistance coefficient. Liao & Knight (2007) used an analytical formula to determine the stage–discharge relationship in straight trapezoidal open channels, finding good agreement between analytical and experimental rating curves, albeit with the primary drawback of algebraic complexity when splitting the cross-section into several panels. Maghrebi (2006) employed a Single Point of Velocity Measurement (SPM) to calculate discharge in a rectangular laboratory flume, validating the model using data from the UK's Severn River, which demonstrated the method's reliability.
Sun & Shiono (2009) estimated rating curves by employing an experimental model in straight compound channels, with and without one-line emergent vegetation along the floodplain boundary, developing novel friction factor formulas based on flow characteristics and vegetation density for both scenarios. Al-Khatib et al. (2013) computed discharge in compound symmetrical channels using data from multiple experiments and regression analysis. Yang et al. (2014) applied the momentum transfer coefficient to forecast discharge distributions of floodplains and the main channel. Fernandes et al. (2015) estimated the flow in the compound open channels using different stage–discharge predictors. Devi et al. (2021) developed an analytical method for predicting depth-averaged velocity in prismatic compound channels, while Das et al. (2022) proposed a multivariable regression model for predicting discharge across different sections of diverging compound channels.

Over the last six decades, many authors have investigated the flow of compound channels (Sellin 1964; Wormleaton et al. 1982; Knight & Demetriou 1983; Knight et al. 1989; Najafzadeh & Zahiri 2015; Devi et al. 2016). Only a few have investigated the non-prismatic compound channel. The first investigation of skewed channels was conducted by James & Brown (1977), who tested diverse skew angles and observed that convergence of the main channel raises the water level on the floodplain, whereas divergence lowers it. Chlebek (2009) analysed the energy slope and flow behaviour in skewed non-prismatic compound channels. Bousmar (2002) was the first to perform an experimental study on a compound channel with converging floodplains, using three different converging angles. Later, the effect of convergence in an asymmetric compound channel with sharp narrowing on the floodplain, at a convergence angle of 22°, was explored by Proust (2005). Bousmar et al. (2006) investigated diverging compound channels with two different diverging angles.

In general, the rate of flow in the main channel is higher than on the floodplain. The interaction between the swift-moving water in the main channel and the slower-moving water on the floodplain creates significant flow resistance, making accurate discharge prediction challenging. These interactions drive an exchange of mass and momentum at the transition and cause momentum losses. Changes in the cross-sectional area of the compound channel can shift the flow from uniform to non-uniform, which adds complexity to the hydraulic analysis. This transition has been addressed in the works of Bousmar & Zech (2004), Rezaei & Knight (2009, 2011), Proust et al. (2010) and Das (2018). Bousmar & Zech (2004) introduced a lateral distribution model (LDM) for uniform flow, which they subsequently adapted to non-uniform flow conditions. Rezaei (2006) experimented with a convergent compound channel and developed an analytical model for determining the water surface profile. Using the first law of thermodynamics, Proust et al. (2010) created a 1-D model to forecast the energy loss in each subsection: the left floodplain, the main channel and the right floodplain. Yonesi et al. (2013) examined the effect of floodplain roughness on overbank flow in non-prismatic compound channels. Naik & Khatua (2016) used non-dimensional geometric parameters to develop a multivariate regression model forecasting the water level profile in compound channels. Das & Khatua (2018) developed two non-linear regression equations for calculating flow resistance in converging and diverging compound channels, and Das & Khatua (2019) proposed a regression model for predicting the water surface profile in diverging compound channels.

Estimating the discharge of a compound channel is complicated by its complex geometry and flow patterns, and existing methods require elaborate calculations yet give poor results for non-prismatic compound channels. Over the last thirty years, a range of machine learning (ML) algorithms has been adopted for estimating channel discharge. Zahiri & Dehgahani (2009) calculated the discharge in a compound channel using an ANN. Later, Azamathulla & Zahiri (2012) used GEP and the M5 tree model to successfully estimate the flow in a compound channel section. Najafzadeh & Zahiri (2015) computed the discharge in a prismatic compound channel using the neuro-fuzzy group method of data handling (NF-GMDH) and found that it had better predictive ability than genetic programming, non-linear regression and the vertical DCM. Zahiri & Najafzadeh (2018) proposed gene expression programming (GEP), model tree (MT) and evolutionary polynomial regression (EPR) models for predicting discharge in compound channel sections using three non-dimensional parameters: relative flow depth, the coherence parameter and the discharge computed by the vertical division method. Khuntia et al. (2018) developed an ANN model for boundary shear stress distribution in prismatic compound channels and found that it produced better results than existing empirical equations. Das et al. (2018, 2020) used the soft computing techniques ANFIS and GEP to estimate the rate of flow in compound channels. Shekhar et al. (2023) used hybrid ANN-PSO and MARS soft computing techniques and compared them with traditional and empirical methods.

Gradient Boosting Decision Tree (GBDT) algorithms (XGBoost, CatBoost, LightGBM) have been used in various hydrologic and hydraulic modelling studies (Yu et al. 2020; Demir & Sahin 2023; Eini et al. 2023). However, the applicability of these ML approaches has not been explored in the domain of compound channels. In the present manuscript, therefore, an attempt has been made to assess the usefulness of six ML models, namely XGBoost, LightGBM, CatBoost, PSO-XGBoost, PSO-LightGBM and PSO-CatBoost, for discharge prediction in compound channels with converging and diverging floodplains.

Discharge in a compound channel with converging and diverging floodplains depends on various flow and geometrical parameters. A diverse dataset of the parameters influencing discharge in a non-prismatic compound channel has been collected from Bousmar (2002), Rezaei (2006), Bousmar et al. (2006), Yonesi et al. (2013), Naik & Khatua (2016), Das et al. (2018) and Mehrabani et al. (2020) and is presented in Table 1. The geometry and a schematic diagram of the compound channel with converging and diverging floodplains from these studies are illustrated in Figure 1.
Table 1

Summary of the dataset used for modelling Q/Qmcb

Authors Fr Ar Rr β S0 δ* α Xr θ Q/Qmcb
Bousmar (2002) Cv 3.81 0.646–0.837 0.93–10.72 1.70–3.70 0.2780–0.5380 0.0010 3.690–5.770 1.340–3.000 0.00–0.833 −3.81 1.745–2.908 
Bousmar (2002) Cv11.31 0.610–0.835 0.94–9.71 1.72–4.40 0.2050–0.5310 0.0010 3.750–6.360 1.500–3.000 0.000–0.250 −11.31 1.454–2.326 
Rezaei (2006) Cv 1.91 0.607–0.824 0.98–9.76 1.79–4.46 0.2020–0.5090 0.0020 3.910–6.360 1.510–3.020 0.000–0.750 −3.81 1.215–2.177 
Rezaei (2006) Cv 3.81 0.602–0.830 0.96–4.05 1.75–4.59 0.1790–0.5220 0.0020 3.800–6.540 2.010–3.020 0.000–1.000 −1.91 1.210–3.195 
Rezaei (2006) Cv 11.31 0.619–0.825 0.98–4.92 1.78–4.22 0.1990–0.5060 0.0020 3.930–6.380 2.010–3.020 0.667–0.833 −11.31 1.136–2.046 
Bousmar et al. (2006) Dv3.81 0.620–0.832 0.95–12.80 1.73–4.20 0.2140–0.5250 0.0010 3.800–6.290 1.330–3.000 0.167–1.000 3.81 1.745–2.908 
Bousmar et al. (2006) Dv5.71 0.638–0.821 1.39–11.38 1.81–3.85 0.2640–0.5390 0.0010 3.690–5.890 1.330–2.330 0.250–1.000 5.71 1.745–2.908 
Yonesi et al. (2013) Dv3.81 0.305–0.806 1.37–20.60 1.91–35.09 0.1450–0.3640 0.0009 1.410–1.900 1.330–3.000 0.167–1.000 3.81 8.213–12.319 
Yonesi et al. (2013) Dv11.31 0.372–0.801 1.39–11.41 1.94–19.43 0.1460–0.3590 0.0009 1.420–1.900 1.600–3.000 0.100–0.333 11.31 8.213–12.319 
Naik & Khatua (2016) Cv5 0.527–0.716 3.85–21.08 2.73–6.85 0.1180–0.3250 0.0011 3.370–4.410 1.400–1.800 0.000–0.500 −5.00 1.426–1.734 
Naik & Khatua (2016) Cv 9 0.573–0.712 3.94–15.59 2.77–5.31 0.1600–0.3190 0.0011 3.400–4.200 1.400–1.800 0.000–0.500 −9.00 1.233–1.580 
Naik & Khatua (2016) Cv 12.3 0.516–0.715 3.86–22.59 2.73–7.26 0.1110–0.3240 0.0011 3.380–4.450 1.400–1.800 0.000–0.595 −13.38 1.194–1.541 
Das et al. (2018) Dv5.93 0.481–2.608 0.404–3.327 1.341–4.386 0.1400–0.5130 0.001 1.466–2.588 2.765–5.824 0.000–1.000 5.93 1.490–5.146 
Das et al. (2018) Dv9.83 0.435–2.363 0.396–3.363 1.322–4.386 0.1314–0.2369 0.001 1.435–2.588 2.765–5.824 0.000–1.000 9.83 1.483–5.151 
Das et al. (2018) Dv14.57 0.395–2.218 0.399–3.375 1.330–4.327 0.1420–0.5190 0.001 1.447–2.582 2.765–5.824 0.000–1.000 14.57 1.472–5.064 
Maherbani et al. (2019) 0.009–0.042 1.222–1.521 2.456–2.816 0.2663–0.3103 0.001 4.138–4.402 3.167–3.333 0.000–0.500 7.25–11.30 2.068–2.291 
Figure 1

Schematic diagram of non-prismatic compound channel: (a) converging floodplain and (b) diverging floodplain.


A total of 290 datasets from experiments on non-prismatic compound channels were collected from various studies. Table 1 summarises the range of datasets used to build the ML models; 75% of the data (218 datasets) were randomly chosen for training, while the remaining data were reserved for validation/testing. Based on insights from previous research by Das et al. (2018, 2020) and Yonesi et al. (2022), a set of critical parameters has been identified as pivotal for predicting the discharge of non-prismatic compound channels. Using Buckingham's π theorem, nine dimensionless parameters were obtained to model discharge, as given in the following equation.

Q/Qmcb = f(AR, RR, δ*, α, β, XR, θ, So, Ff)  (1)
where AR is the area ratio, the ratio of the main channel area to the floodplain area; RR is the relative hydraulic radius, the ratio of the hydraulic radius of the main channel to that of the floodplain; δ* is the flow aspect ratio, comparing the main channel width (B) to the flow depth (H); α is the width ratio, the width of the floodplain to the width of the main channel; β is the relative flow depth, (H − h)/H, where h is the main channel depth; XR is the relative longitudinal distance, the ratio of the distance (l) of the section along the channel to the total length (L) of the non-prismatic channel; θ is the converging or diverging angle of the floodplain; So is the bed slope of the channel; and Ff is the friction factor ratio, the ratio of the main channel friction factor Fmc to the floodplain friction factor Ffp. These nine non-dimensional input parameters are selected to develop the predictive model of Q/Qmcb. The selected dataset has been normalised to bring all data to the same scale and improve the performance of the ML models. Principal component analysis (PCA) of the input datasets for compound channels with converging and diverging floodplains has been performed to identify the influential parameters, based on the spatial likeness among all input parameters, for the prediction of discharge. A detailed summary of the datasets from the different researchers used in the study is presented in Table 1.

Principal Component Analysis

PCA is a multivariate statistical technique commonly employed for statistical analysis, specifically in the context of dimensionality reduction through factor analysis. This method organises complex datasets into uncorrelated variables, which are linear combinations of the original variables, resulting in a significant simplification of the information contained in the original data (Pandey et al. 2023). To assess the appropriateness of the dataset for PCA, the Kaiser–Meyer–Olkin (KMO) and Bartlett's spherical tests were utilised, with criteria set at KMO > 0.5 and a significance level of p < 0.05 in Bartlett's test (Mukherjee & Singh 2022). Depending on the absolute values, the component loadings are classified as strong (>0.75), moderate (0.5–0.75) and weak (0.3–0.5) (Wang et al. 2017).
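As a concrete illustration, the PCA step can be sketched in Python with scikit-learn. The input matrix below is synthetic stand-in data (the actual study uses the 290 experimental datasets of the nine dimensionless parameters), and the KMO/Bartlett screening is assumed to have been passed:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Stand-in for the nine dimensionless inputs (AR, RR, delta*, alpha, beta, XR, theta, So, Ff):
# 290 samples x 9 features of synthetic data, used only to illustrate the workflow.
X = rng.normal(size=(290, 9))

X_std = StandardScaler().fit_transform(X)   # PCA assumes centred, scaled inputs
pca = PCA().fit(X_std)

# Explained-variance ratio per component; loadings relate components back to the inputs
# and can be classified as strong (>0.75), moderate (0.5-0.75) or weak (0.3-0.5).
evr = pca.explained_variance_ratio_
loadings = pca.components_.T * np.sqrt(pca.explained_variance_)
```

When all components are retained, the explained-variance ratios sum to one, and the loading matrix has one row per original input parameter.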

Gradient boosting decision trees

GBDT is an ensemble method introduced by Friedman (2001). GBDTs use a boosting mechanism to create a robust learner through the combination of numerous weak learners of relatively low accuracy. This paper focuses on applying three recently introduced GBDT modifications, namely XGBoost, CatBoost and LightGBM, to develop predictive models for discharge estimation in a compound channel with converging and diverging floodplains. Subsequent sections describe these algorithms and highlight their key features.
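The boosting mechanism can be illustrated with scikit-learn's generic GBDT implementation; the data below are a synthetic stand-in for the nine dimensionless inputs and the Q/Qmcb target, not the study's experimental dataset:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(size=(290, 9))                              # stand-in for the nine inputs
y = X @ rng.uniform(size=9) + 0.05 * rng.normal(size=290)   # synthetic Q/Qmcb stand-in

# 75/25 split, mirroring the train/test partition used in the study.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.75, random_state=1)

# Boosting: each shallow tree fits the residual (negative gradient) of the ensemble so far.
gbdt = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05, max_depth=3)
gbdt.fit(X_tr, y_tr)
r2 = gbdt.score(X_te, y_te)   # coefficient of determination on held-out data
```

The same fit/score workflow carries over to the XGBoost, CatBoost and LightGBM libraries, which differ mainly in how the individual trees are grown and regularised.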

eXtreme Gradient Boosting

Chen & Guestrin (2016) introduced eXtreme Gradient Boosting (XGBoost), a scalable tree-boosting ML technique. The XGBoost algorithm is an advancement over the GBDT algorithm frequently used for regression tasks and encompasses numerous decision trees (DTs) to enhance predictability. In the XGBoost framework, DTs act as base learners and are combined to form a strong learner. The flow chart of the algorithm is depicted in Figure 2. In GBDT, the objective function is primarily the loss function, while in XGBoost it encompasses both the loss function and a regularisation term. The loss function measures the disparity between the actual and predicted values, and the objective function is optimised to minimise this disparity (Wu et al. 2022). The regularisation term is introduced to manage model complexity and extreme cases. XGBoost is extensively utilised across various industries due to its parallel computing capabilities and its ability to significantly enhance algorithm accuracy.
Figure 2

Flowchart describing the XGBoost algorithm.

Given a dataset D = {(xi, yi) | xi ∈ Rm, yi ∈ R, i = 1, 2, …, n}, where n is the number of samples and m the number of features, the predicted value is the cumulative sum of the outputs of all K decision trees for input xi. The XGBoost model employs an additive approach, as described in Equation (2).
ŷi = Σk=1…K fk(xi)  (2)
The objective function is the sum of the loss function and the regularisation term and is expressed as Equation (3).
Obj = Σi=1…n l(yi, ŷi) + Σk=1…K Ω(fk)  (3)
where l(yi, ŷi) signifies the loss function, which measures the disparity between actual and predicted values, and Ω(fk) signifies the regularisation term, which is used to manage overfitting and is stated as shown in Equation (4).
Ω(f) = γT + (1/2)·λ·Σj=1…T wj²  (4)
where T denotes the number of leaf nodes, the parameters γ and λ control the structure of the tree and the distribution of weight among leaf nodes, and wj represents the weight associated with leaf node j. Using Equations (2) and (3), the objective function at boosting iteration t takes the form of Equation (5).
Obj(t) = Σi=1…n l(yi, ŷi(t−1) + ft(xi)) + Ω(ft)  (5)
Expanding the loss function with a second-order Taylor expansion gives the new objective function shown in Equation (6).
Obj(t) ≈ Σi=1…n [gi·ft(xi) + (1/2)·hi·ft²(xi)] + Ω(ft)  (6)
where gi and hi are the first- and second-order derivatives of the loss function with respect to the prediction ŷi(t−1).
The optimal weight of leaf node j is described by Equation (7).
wj* = −Gj / (Hj + λ)  (7)
where Gj is the sum of the accumulated first-order partial derivatives over the samples contained in leaf node j, and Hj is the corresponding sum of the second-order partial derivatives.
Rewriting the objective as a quadratic function of the leaf weights and substituting wj* yields the optimal value of the objective function in Equation (8):
Obj* = −(1/2)·Σj=1…T Gj² / (Hj + λ) + γT  (8)
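The optimal leaf weight and objective value can be checked numerically. The sketch below assumes a squared-error loss l(y, ŷ) = (y − ŷ)²/2, for which gi = ŷi − yi and hi = 1, and uses a hypothetical two-leaf tree and toy labels:

```python
import numpy as np

# Squared-error loss: g_i = yhat_i - y_i, h_i = 1.
y = np.array([3.0, 3.5, 4.2, 5.0])          # toy targets
yhat_prev = np.array([3.2, 3.2, 4.0, 4.6])  # predictions from the previous trees
g = yhat_prev - y
h = np.ones_like(y)

lam, gamma = 1.0, 0.1                 # regularisation parameters lambda and gamma
leaf_index = np.array([0, 0, 1, 1])   # assume the new tree sends samples to two leaves

# G_j and H_j: per-leaf sums of first- and second-order derivatives.
G = np.array([g[leaf_index == j].sum() for j in range(2)])
H = np.array([h[leaf_index == j].sum() for j in range(2)])

w_star = -G / (H + lam)                                   # Equation (7)
obj_star = -0.5 * np.sum(G**2 / (H + lam)) + gamma * 2    # Equation (8), T = 2 leaves
```

Each leaf weight moves the prediction against the accumulated gradient, damped by λ, which is how the regularisation term in Equation (4) shrinks the tree's output.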

Categorical boosting (CatBoost)

CatBoost is a modern gradient boosting algorithm introduced by Prokhorenkova et al. (2018) and Dorogush et al. (2018). It excels in handling categorical features with minimal information loss and incorporates several distinctive techniques that set it apart from other gradient boosting algorithms. First, CatBoost utilises 'ordered boosting,' an efficient variation of gradient boosting that addresses the issue of target leakage. Second, CatBoost is particularly effective on small datasets, making it suitable for situations where data availability is limited. Third, one of the standout features of CatBoost is its handling of categorical features, which usually occurs during preprocessing, where the original categorical variables are transformed into one or more numerical representations. For each example, CatBoost performs a random permutation of the dataset and computes an average label value over the examples with the same category value placed before the given one in the permutation. Let σ = (σ1, …, σn) be the permutation; then
x̂σp,k = (Σj=1…p−1 1[xσj,k = xσp,k]·yσj + a·P) / (Σj=1…p−1 1[xσj,k = xσp,k] + a)  (9)
where P is a prior value and a > 0 is the weight of the prior. For regression tasks, the standard technique for calculating the prior is to take the average label value in the dataset.
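The ordered target statistic can be sketched directly in Python. The categories below ("cv"/"dv", suggestive of converging/diverging) and labels are hypothetical, and the function is a simplified single-permutation version of what the CatBoost library does internally:

```python
import numpy as np

def ordered_target_stat(cats, y, prior, a=1.0, seed=0):
    """Ordered target statistic in the style of Equation (9): for each sample, average
    the labels of earlier samples in a random permutation that share its category,
    smoothed towards a prior with weight a. Avoids target leakage from the sample itself."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(cats))
    encoded = np.empty(len(cats))
    sums = {}    # category -> running label sum over earlier samples in the permutation
    counts = {}  # category -> running count over earlier samples in the permutation
    for idx in perm:
        c = cats[idx]
        encoded[idx] = (sums.get(c, 0.0) + a * prior) / (counts.get(c, 0) + a)
        sums[c] = sums.get(c, 0.0) + y[idx]
        counts[c] = counts.get(c, 0) + 1
    return encoded

cats = np.array(["cv", "dv", "cv", "dv", "cv"])   # hypothetical category labels
y = np.array([1.2, 2.5, 1.4, 2.8, 1.3])
enc = ordered_target_stat(cats, y, prior=float(y.mean()))
```

The first occurrence of each category in the permutation receives exactly the prior (here, the dataset mean), matching the regression convention described above.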

Light Gradient Boosting Machine

Ke et al. (2017) introduced the Light Gradient Boosting Machine (LightGBM) technique, which gained popularity due to its efficiency and accuracy in gradient boosting. It employs various techniques to enhance its computational power and predictive performance. One notable feature is the histogram algorithm, which discretises continuous feature values into integers and constructs histograms to guide the decision tree-building process. Additionally, LightGBM utilises a leaf-wise algorithm that selects the leaf with the highest splitting gain among existing leaves for further division. To mitigate the risk of overfitting, LightGBM incorporates a maximum depth constraint on this leaf-wise approach, ensuring efficiency and robustness in model training. Overall, LightGBM's combination of innovative algorithms and controls makes it a widely recognised and efficient gradient boosting model in the ML community.
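The histogram idea can be illustrated with a few lines of NumPy. This is a simplified sketch of the discretisation step only (quantile-based bin edges are one common choice; the values below are synthetic stand-ins), not LightGBM's internal implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0.1, 0.6, size=1000)   # e.g. synthetic relative flow depth values

# Histogram-style discretisation: replace continuous feature values by bin indices,
# so split finding scans a fixed number of bins instead of every unique value.
n_bins = 16
edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))   # quantile bin boundaries
bins = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
```

Split gains are then accumulated per bin, which is what gives the histogram algorithm its speed and memory advantage over exact, sorted-value split finding.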

Particle swarm optimisation

The Particle Swarm Optimisation (PSO) algorithm is an intelligent search technique inspired by the foraging behaviour of birds introduced by Kennedy & Eberhart (1995). The model procedure starts with the initialisation of the random initial position and velocity of the particles. Each particle maintains two parameters representing its speed and position. During each iteration, particles track their individual optimal positions (Pbest) and the global optimal position for the entire group (gbest) to update themselves. The iteration continues until the global optimal solution is found or a maximum number of iterations is reached. To find an optimal solution, particles adjust their speed and position based on specific Equations (10) and (11).
Vi(t+1) = w·Vi(t) + c1·r1·(Pbesti − Xi(t)) + c2·r2·(gbest − Xi(t))  (10)
Xi(t+1) = Xi(t) + Vi(t+1)  (11)
where Xi(t) and Vi(t) are the current position and associated velocity of particle i, respectively; c1 and c2 are positive acceleration constants selected to optimise the network output; Xi(t+1) and Vi(t+1) are the new position and velocity of the particle; w is the inertia weight; and r1 and r2 are random numbers in the range (0, 1). A flowchart describing the PSO algorithm is shown in Figure 3.
Figure 3

Flow chart illustrating the hybrid PSO algorithm.

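The update rules in Equations (10) and (11) can be condensed into a minimal PSO sketch. The coefficients below (w = 0.7, c1 = c2 = 1.5) are common textbook defaults chosen so this toy run converges; the study's settings (w = 0.8, c1 = 2.7, c2 = 1.3, as listed later in Table 2) can be substituted:

```python
import numpy as np

def pso(f, bounds, n_particles=20, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO following Equations (10) and (11); minimises f within box bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = len(lo)
    X = rng.uniform(lo, hi, size=(n_particles, dim))   # positions
    V = np.zeros_like(X)                               # velocities
    pbest = X.copy()
    pbest_val = np.array([f(x) for x in X])            # personal bests
    gbest = pbest[pbest_val.argmin()].copy()           # global best
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)   # Eq. (10)
        X = np.clip(X + V, lo, hi)                                  # Eq. (11)
        vals = np.array([f(x) for x in X])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = X[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, float(pbest_val.min())

# Sanity check on the sphere function, whose minimum is 0 at the origin.
best_x, best_val = pso(lambda x: float(np.sum(x**2)), (np.full(3, -5.0), np.full(3, 5.0)))
```

Because each particle's personal best can only improve, the returned value decreases monotonically with the iteration count, which is the property exploited when PSO is wrapped around a model's fitness criterion.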

Hyperparameter optimisation

Hyperparameter tuning is of paramount significance in ML algorithms as it governs the behaviour of training algorithms and plays an important role in improving model performance. In this paper, the PSO approach is applied for fine-tuning hyperparameters of XGBoost, LightGBM and CatBoost ML algorithms. The PSO algorithm relies on various controlling parameters, including fitness criteria (such as RMSE, MSE and R2), local coefficient (c1), global coefficient (c2), inertia coefficient (w), maximum iterations and population/swarm size (s). Table 2 provides a detailed overview of the PSO control parameters, the hyperparameters subjected to tuning, explanations of these parameters and the specified value ranges for optimising each ML model.

Table 2

Description of hyperparameters of XgBoost, LightGBM and CatBoost

ML model | Hyperparameter | Default value | Description | Range

PSO parameters (all three models): fitness criterion RMSE; c1 = 2.7; c2 = 1.3; w = 0.8; max_iteration = 100.

XGBoost
n_estimators | 100 | number of boosting rounds (trees) | 100–1,000
learning_rate | 0.6 | step size shrinking the contribution of each tree | 0.01–0.3
max_depth | – | maximum depth of a tree | 1–6
max_delta_step | – | controls the step size when updating leaf values during training | 0–1
min_child_weight | – | minimum sum of weight of all observations | –

LightGBM
num_leaves | 31 | maximum number of leaves in one tree | 20–1,000
learning_rate | 0.05 | learning rate controlling the step size during boosting | 0.01–0.3
feature_fraction | 0.9 | subsample ratio of features during training | 0.5–1.0
bagging_fraction | 0.8 | subsample ratio of data points during training | 0.5–1.0
bagging_freq | – | frequency for bagging | –

CatBoost
iterations | 100 | number of boosting iterations | 100–1,000
depth | – | tree depth | 4–10
learning_rate | 0.1 | boosting learning rate | 0.01–0.3
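The PSO-driven tuning loop can be sketched in a self-contained way. Scikit-learn's GradientBoostingRegressor stands in for XGBoost/LightGBM/CatBoost here, the data are synthetic, and the swarm size, iteration count and n_estimators range are deliberately reduced so the sketch runs quickly; the fitness criterion is held-out RMSE, as in Table 2:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the nine dimensionless inputs and the Q/Qmcb target.
rng = np.random.default_rng(3)
X = rng.uniform(size=(200, 9))
y = X @ rng.uniform(size=9) + 0.05 * rng.normal(size=200)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, train_size=0.75, random_state=3)

def fitness(params):
    """PSO fitness: RMSE of the model on held-out data for a candidate setting."""
    n_est, lr = int(round(params[0])), float(params[1])
    model = GradientBoostingRegressor(n_estimators=n_est, learning_rate=lr, random_state=0)
    model.fit(X_tr, y_tr)
    return mean_squared_error(y_va, model.predict(X_va)) ** 0.5

# Compact PSO over two hyperparameters: n_estimators and learning_rate
# (n_estimators range shrunk from the paper's 100-1,000 to keep this sketch fast).
lo, hi = np.array([50.0, 0.01]), np.array([300.0, 0.3])
pos = rng.uniform(lo, hi, size=(6, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()
for _ in range(5):
    r1, r2 = rng.random((2, 6, 2))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()
best_rmse = float(pbest_val.min())
```

In the study's setup, the same loop would simply swap in the relevant library model and the parameter bounds listed in Table 2.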

Performance evaluation parameters

Various statistical parameters were employed for quantitative comparisons to assess the performance of ML models (Najafzadeh & Oliveto 2020; Najafzadeh & Anvari 2023). The performance evaluation parameters considered in the study are given below.

Scatter Index

The Scatter Index (SI) measures the spread of the predicted values relative to the spread of the true values, calculated using Equation (12). A lower SI indicates better model performance in capturing the variability of the true values.
(12) \( \mathrm{SI} = \frac{1}{\bar{O}} \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (P_i - \bar{P}) - (O_i - \bar{O}) \right]^2 } \)
where \(O_i\) = observed value, \(P_i\) = predicted value, \(\bar{O}\) = mean of the observed values and \(\bar{P}\) = mean of the predicted values.

BIAS

BIAS measures the average difference between predicted and true values, estimated as in Equation (13). A BIAS close to zero indicates good predictions; positive or negative values suggest overestimation or underestimation, respectively.
(13) \( \mathrm{BIAS} = \frac{1}{n} \sum_{i=1}^{n} (P_i - O_i) \)

Discrimination index

The discrimination index (DI) evaluates how well the predicted and true values vary together, as indicated by Equation (14). A DI close to 1 indicates a good match between the variations in the predicted and true values.
(14)

Coefficient of determination (R2)

It represents the proportion of the variance in the true values explained by the predicted values. A higher R2 indicates better explanatory power; it is calculated as Equation (15).
(15) \( R^2 = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \)

Mean squared error

It calculates the average squared difference between predicted and true values and is estimated using the formula given in Equation (16). Lower mean squared error (MSE) values indicate better accuracy of the model.
(16) \( \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (O_i - P_i)^2 \)

RMSE

It measures the average deviation between actual and predicted values. It is calculated as the square root of the MSE, as given in Equation (17).
(17) \( \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (O_i - P_i)^2} \)
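These metrics can be computed directly from paired observed and predicted series. The sketch below (pure Python, with a toy dataset chosen purely for illustration) implements the SI, BIAS, R2, MSE and RMSE definitions above:

```python
import math

def evaluation_metrics(obs, pred):
    """Compute SI, BIAS, R2, MSE and RMSE for paired observed/predicted values."""
    n = len(obs)
    o_bar = sum(obs) / n
    p_bar = sum(pred) / n
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / n
    rmse = math.sqrt(mse)
    bias = sum(p - o for o, p in zip(obs, pred)) / n
    # Scatter Index: spread of the mean-removed errors, normalised by the observed mean
    si = math.sqrt(sum(((p - p_bar) - (o - o_bar)) ** 2
                       for o, p in zip(obs, pred)) / n) / o_bar
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - o_bar) ** 2 for o in obs)
    r2 = 1.0 - ss_res / ss_tot
    return {"SI": si, "BIAS": bias, "R2": r2, "MSE": mse, "RMSE": rmse}

# Toy example: near-perfect predictions give R2 close to 1 and small error indices
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.9, 5.1]
m = evaluation_metrics(obs, pred)
```

Note that RMSE is, by construction, the square root of MSE, so the two always rank models identically; SI additionally normalises the spread by the observed mean.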

Uncertainty, reliability and resilience analysis

The significance of uncertainty, reliability and resilience analyses in evaluating model performance is paramount for ensuring the credibility and utility of predictive models (Saberi-Movahed et al. 2020). Uncertainty analysis aims to define a reliable uncertainty interval, denoted as U95, indicating the range within which the true outcome of an experiment is likely to fall. U95 is estimated based on errors in the experimental measurement process, with the understanding that in around 95 out of 100 trials, the true outcome would lie within this interval. The U95 formula involves the weighted summation of squared differences between observed and predicted values. Reliability analysis assesses a model's overall consistency, expressed as a percentage computed through the relative average error (RAE). The reliability factor is set to 1 if RAE is less than or equal to a threshold (typically 20%) and the reliability of the model is determined as the average of these factors. Resilience analysis, related to reliability, is expressed as a percentage and evaluates a model's ability to recover from inaccurate predictions. Collectively, these analyses contribute to a comprehensive understanding of model behaviour, empowering decision-makers to make more informed choices and enhancing the overall robustness and trustworthiness of predictive models in various domains.
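As an illustration only, the sketch below computes these three indicators for a toy observed/predicted pair. The U95 form (1.96 times the root of the error standard deviation squared plus RMSE squared) and the recovery-based resilience definition are common formulations assumed here; the paper's exact expressions may differ:

```python
import math
import statistics

def u95(obs, pred):
    """95% uncertainty band: a common formulation combining the standard deviation
    of the prediction errors with the RMSE (an assumption; the paper's exact
    weighted summation may differ)."""
    errors = [p - o for o, p in zip(obs, pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    sd = statistics.stdev(errors)
    return 1.96 * math.sqrt(sd ** 2 + rmse ** 2)

def reliability(obs, pred, threshold=0.20):
    """Percentage of samples whose relative average error (RAE) is within the
    threshold (20% by default, as described above)."""
    ok = [1 if abs(p - o) / abs(o) <= threshold else 0 for o, p in zip(obs, pred)]
    return 100.0 * sum(ok) / len(ok)

def resilience(obs, pred, threshold=0.20):
    """Share of failed predictions immediately followed by a successful one
    (a recovery-style definition, used here as an assumption)."""
    ok = [abs(p - o) / abs(o) <= threshold for o, p in zip(obs, pred)]
    failures = sum(1 for v in ok[:-1] if not v)
    recoveries = sum(1 for a, b in zip(ok, ok[1:]) if (not a) and b)
    return 100.0 * recoveries / failures if failures else 100.0

obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.05, 2.6, 3.1, 3.9, 5.2]
```

With these toy values, one of five predictions exceeds the 20% RAE threshold (80% reliability), and that single failure is followed by a success (100% resilience).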

SHapley Additive Explanation

SHAP is a feature attribution method rooted in game theory. It addresses the challenge of the inherent black-box characteristics of certain ML models by introducing a reliable interpretability framework (Lundberg & Lee 2017; Chang et al. 2022). SHAP employs Shapley values as a means to quantitatively express the individual contributions of input features to model output, thus providing transparency and understanding of model behaviour. In this study, SHAP analysis has been performed on the best models (PSO-CatBoost and PSO-XgBoost) to find the contribution of each input feature for predicting discharge in a compound channel having converging and diverging flood plain.
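The Shapley-value idea underlying SHAP can be sketched without the shap library: for a small number of features, the exact Shapley value of each feature is the weighted average of its marginal contributions over all coalitions of the remaining features. The feature names and the value function below are hypothetical stand-ins for the trained booster, purely to show the mechanics:

```python
from itertools import combinations
from math import factorial

FEATURES = ["alpha", "beta", "theta"]  # hypothetical input features

def value(coalition):
    """Hypothetical coalition worth, standing in for the expected model output
    when only the given features are 'present'."""
    base = {"alpha": 4.0, "beta": 2.0, "theta": 1.0}
    v = sum(base[f] for f in coalition)
    if "alpha" in coalition and "beta" in coalition:
        v += 1.0  # interaction term shared between alpha and beta
    return v

def shapley(feature, features, value_fn):
    """Exact Shapley value: weighted marginal contribution over all coalitions."""
    n = len(features)
    others = [f for f in features if f != feature]
    total = 0.0
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            total += weight * (value_fn(set(subset) | {feature}) - value_fn(set(subset)))
    return total

phi = {f: shapley(f, FEATURES, value) for f in FEATURES}
```

The efficiency property holds by construction: the attributions sum exactly to the difference between the full-coalition output and the empty-coalition output, which is what makes SHAP summary plots additively interpretable.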

Sensitivity analysis

Sensitivity analysis systematically evaluates the impact of individual features on the model predictions by iteratively isolating each feature (Najafzadeh et al. 2015; Najafzadeh et al. 2016). For each feature, it is temporarily removed from the dataset and replaced with either a constant or the mean of the feature. Subsequently, the model predicts the target variable with the altered dataset and the absolute change in the predictions is quantified as the sensitivity score for that particular feature. This process allows for a comprehensive exploration of how alterations in individual features influence the overall model predictions. Sensitivity analysis is essential for assessing the robustness and reliability of a model by examining the response of the model outputs to variations in input variables, providing valuable insights into the model behaviour under different conditions. Figure 4 describes the detailed methodology adopted for discharge prediction in a compound channel with converging and diverging flood plains in this study.
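The mean-replacement procedure described above can be sketched as follows; the stand-in model and data are hypothetical, purely to show the mechanics:

```python
def sensitivity_scores(X, predict):
    """Mean-replacement sensitivity: replace one feature column with its mean,
    re-predict, and record the mean absolute change in the predictions.
    `predict` stands in for the trained model's prediction function."""
    n_samples = len(X)
    n_features = len(X[0])
    baseline = [predict(row) for row in X]
    scores = []
    for j in range(n_features):
        mean_j = sum(row[j] for row in X) / n_samples
        altered = [row[:j] + [mean_j] + row[j + 1:] for row in X]
        perturbed = [predict(row) for row in altered]
        score = sum(abs(b - p) for b, p in zip(baseline, perturbed)) / n_samples
        scores.append(score)
    return scores

# Hypothetical stand-in model in which feature 0 dominates the response
model = lambda row: 5.0 * row[0] + 0.5 * row[1]
X = [[1.0, 10.0], [2.0, 11.0], [3.0, 9.0], [4.0, 12.0]]
scores = sensitivity_scores(X, model)
```

As expected, neutralising the dominant feature changes the predictions far more than neutralising the weak one, so its sensitivity score is larger.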
Figure 4

Flowchart describing the detailed methodology of the study.


Recent years have witnessed substantial advances in the practical application of ML algorithms to regression tasks. The traditional ML framework involves crucial stages such as data preprocessing, model selection, development, evaluation and deployment. Hyperparameter optimisation is an essential element of an ML workflow, and effective optimisation of these hyperparameters is crucial for enhancing algorithm performance. In this study, XGBoost, LightGBM and CatBoost were utilised for predicting discharge in a compound channel and PSO was used to fine-tune their hyperparameters, aiming to boost their effectiveness in the task. Subsequently, three hybrid models, PSO-XGBoost, PSO-LightGBM and PSO-CatBoost, were developed, incorporating PSO for hyperparameter optimisation. These models were evaluated by comparing their performance to models trained with default hyperparameters, providing insights into the efficacy of PSO in optimising the algorithms. This section presents a detailed discussion of the results obtained from the present models and the existing models for the estimation of discharge in a compound channel with converging and diverging flood plains.
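The PSO loop used for tuning can be sketched in a few lines. Here a simple quadratic with a known minimum stands in for the RMSE objective, while the swarm constants (w = 0.8, c1 = 2.7, c2 = 1.3, 100 iterations) follow the settings listed in the hyperparameter tables; the bounds, objective and velocity clamp are illustrative assumptions:

```python
import random

def pso(fitness, bounds, n_particles=20, iters=100, w=0.8, c1=2.7, c2=1.3, seed=42):
    """Minimal PSO minimiser. In the study the fitness is the model RMSE over
    the hyperparameter space; a toy quadratic is used here instead."""
    rng = random.Random(seed)
    dim = len(bounds)
    vmax = [0.2 * (hi - lo) for lo, hi in bounds]  # clamp velocities for stability
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-vmax[d], min(vmax[d], v))
                pos[i][d] = max(bounds[d][0], min(bounds[d][1], pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Illustrative objective with its minimum at (learning_rate = 0.13, n_estimators = 50)
objective = lambda x: (x[0] - 0.13) ** 2 + ((x[1] - 50.0) / 100.0) ** 2
best, best_val = pso(objective, [(0.01, 0.3), (10.0, 1000.0)])
```

In the actual workflow the fitness call would train a booster with the candidate hyperparameters and return its validation RMSE, which makes each iteration far more expensive than this toy.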

To determine the most contributing parameters for the estimation of discharge in a compound channel having converging and diverging flood plains, PCA was performed. Prior to performing the PCA, Bartlett's test and the KMO test were applied to all the parameters to assess their aptness for PCA. The significance value of Bartlett's test on the input dataset was close to zero and the KMO statistic was 0.549, showing the appropriateness of the dataset for PCA. PCA extracts the essential parameters representing the information within the whole dataset; four principal components (PC1, PC2, PC3 and PC4) reflect a total variance of 78.16%, as presented in Table 3. PC1 explains 31.07% of the variance within the variables. Area ratio (AR) and width ratio (α) are loaded above 0.75, which is strong loading (highlighted in bold, Table 3); flow aspect ratio (δ*) and converging or diverging angle (θ) exhibit moderate loading and the other parameters have weak loading. PC2, PC3 and PC4 explain 19.729, 15.001 and 12.361% of the variance, respectively. Relative longitudinal distance (XR) and bed slope of the channel (S0) exhibit strong loading for PC3 and PC4, respectively. The PCA suggests that the variables Ff, S0, δ*, α, XR and θ play a significant role in explaining the variance in the dataset, as they have high loadings on the principal components. However, in this study, all the input parameters are selected for modelling discharge, as the cumulative variability of all the principal components is 78.16%, so as to obtain a more efficient model.
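The eigenvalue/explained-variance computation behind Table 3 can be illustrated on a toy two-feature case, where the 2 × 2 correlation matrix [[1, r], [r, 1]] has closed-form eigenvalues 1 ± r (toy data, for illustration only):

```python
import math

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)

# Toy features; r is their correlation coefficient
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.1, 4.2, 4.8]
r = correlation(x, y)

# Eigenvalues of the 2x2 correlation matrix are 1 + r and 1 - r;
# each component's explained variance is its eigenvalue over the trace
eigen = sorted([1 + r, 1 - r], reverse=True)
explained = [100.0 * e / sum(eigen) for e in eigen]
```

Because the two toy features are highly correlated, the first component absorbs almost all the variance; with nine weakly coupled hydraulic parameters, as in Table 3, the variance spreads over several components.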

Table 3

PCA results of input parameters, eigenvalues and interpreted variances

                   PC 1      PC 2      PC 3      PC 4
Ff                 0.309     0.123    −0.802     0.131
AR                −0.791     0.428     0.107     0.066
RR                −0.439     0.724     0.157     0.051
β                  0.022    −0.734     0.008    −0.498
S0                 0.309    −0.264    −0.068     0.809
δ*                −0.622    −0.469     0.156     0.353
α                  0.869     0.170     0.213     0.125
XR                 0.383    −0.078     0.771     0.101
θ                  0.706     0.434     0.054    −0.187
Eigenvalue         2.796     1.776     1.350     1.112
Total variance %  31.071    19.729    15.001    12.361
Cumulative %      31.071    50.799    65.800    78.161

Evaluating performance of ML models

Performance evaluation of ML models is critical for assessing their prediction capability. In the present study, ML algorithms (XgBoost, CatBoost and LightGBM) with default values of the hyperparameters were developed for the prediction of discharge. The PSO algorithm was used for the optimisation of hyperparameters to develop the hybrid models, namely PSO-XgBoost, PSO-CatBoost and PSO-LightGBM. All the base models were compared with their hybrid counterparts to assess the effectiveness of optimisation using PSO. Different statistical performance parameters, such as RMSE, R2 and MAPE, have been used to assess the comparative effectiveness of the models.

Performance evaluation of XgBoost and PSO-XgBoost

The XgBoost model has been developed using five important and effective hyperparameters for the learning process, namely n_estimators, learning_rate, max_depth, max_delta_step and min_child_weight. Default values of these hyperparameters have been taken for the development of the XgBoost model, as given in Table 4. The PSO algorithm was used for fine-tuning the hyperparameters of XgBoost. The best hyperparameters were chosen by performing 100 iterations under different hyperparameter guesses and are presented in Table 4.

Table 4

Default hyperparameters of XGBoost and values optimised by PSO

Hyperparameters   n_estimators   learning_rate   max_depth   max_delta_step   min_child_weight
Default           100            0.6
PSO               50             0.1323          27

Traditionally, hyperparameter settings are either manually defined by the user or left at their default values without any fine-tuning. However, parameter optimisation has a significant influence on the performance of an ML model. Therefore, this study conducted an objective comparison between the results obtained with default parameters and those obtained with the hyperparameters optimised using PSO. The regression analysis plots for training and testing of XgBoost are presented in Figure 5(a) and 5(b) and those of the PSO-XgBoost model in Figure 6(a) and 6(b). Based on the statistical performance parameters given in Table 7, it is evident that the performance of the XgBoost model is enhanced by applying the PSO-optimised parameters. The increase in the value of R2 with the application of PSO is more significant in the training stage, from 0.989 to 0.993, than in the testing stage, from 0.984 to 0.989. All the statistical indices representing errors are reduced by the application of the hybrid PSO-XgBoost model.
Figure 5

Regression plot for (a) training and (b) testing of the XgBoost model.

Figure 6

Regression plot for (a) training and (b) testing of the PSO-XgBoost model.


Performance evaluation of LightGBM and PSO-LightGBM

The LightGBM model was constructed using the default values of the hyperparameters. The hyperparameters selected for the modelling were num_leaves, learning_rate, feature_fraction, bagging_fraction and bagging_freq. These hyperparameters were optimised using the PSO algorithm. The default values and the optimised best hyperparameters are given in Table 5. The regression analysis plots for training and testing of LightGBM are presented in Figure 7(a) and 7(b) and those of the PSO-LightGBM model in Figure 8(a) and 8(b). Statistical parameters are summarised in Table 7 for assessing the performances of both models. The value of R2 increases in both the training and testing stages after the application of PSO. However, a significant improvement is observed in the value of R2 in the testing stage, from 0.85 to 0.90. The performance indices of errors are reduced in the case of PSO-LightGBM, which is evident from Table 7.
Table 5

Default hyperparameters of LightGBM and values optimised by PSO

Hyperparameters   num_leaves   learning_rate   feature_fraction   bagging_fraction   bagging_freq
Default           31           0.05            0.9                0.8
PSO               813          0.0427          0.9                0.8
Table 6

Default hyperparameters of CatBoost and values optimised by PSO

Hyperparameters   Iterations   Depth   Learning_rate
Default           100                  0.1
PSO               194                  0.232932
Table 7

Comparative performances of all ML models

Models          Dataset    RMSE     R2      MAPE
XgBoost         Training   0.0182   0.989   0.0981
                Testing    0.0244   0.984   0.1513
                All        0.0200   0.988   0.1115
PSO-XgBoost     Training   0.0144   0.993   0.0674
                Testing    0.0203   0.989   0.1142
                All        0.0161   0.992   0.0792
LightGBM        Training   0.0555   0.903   0.1455
                Testing    0.0736   0.857   0.1931
                All        0.0605   0.889   0.1575
PSO-LightGBM    Training   0.0523   0.913   0.1393
                Testing    0.0601   0.904   0.1675
                All        0.0544   0.911   0.1464
CatBoost        Training   0.0159   0.992   0.0816
                Testing    0.0279   0.979   0.1201
                All        0.0196   0.988   0.0913
PSO-CatBoost    Training   0.0125   0.995   0.0683
                Testing    0.0211   0.988   0.1261
                All        0.0151   0.993   0.0828
Figure 7

Regression plot for (a) training and (b) testing for the LightGBM model.

Figure 8

Regression plot for (a) training and (b) testing for the PSO-LightGBM model.


Performance evaluation of CatBoost and PSO-CatBoost

The CatBoost model is developed with the default values of the most important and effective hyperparameters. The building of the PSO-CatBoost model was initiated by searching for the best hyperparameters using the PSO approach. The optimised values of the hyperparameters obtained from the application of PSO are presented in Table 6. The regression analysis plots for training and testing of the CatBoost model are presented in Figure 9(a) and 9(b) and those of the PSO-CatBoost model in Figure 10(a) and 10(b). Table 7 provides the performance indices for assessing the comparative performances of the models. There is no significant improvement in the value of R2 in the training stage but, in the testing stage, it increases from 0.979 to 0.988. All the error values are reduced after the implementation of the PSO algorithm, which is evident from Table 7.
Figure 9

Regression plot for (a) training and (b) testing for the CatBoost model.

Figure 10

Regression plot for (a) training and (b) testing for the PSO-CatBoost model.


Comparative evaluation of all the ML models

In order to check the efficacy of hyperparameter optimisation using PSO, each optimised model was compared with its default-hyperparameter counterpart in the previous sections. The modified algorithm fine-tunes the hyperparameters of the XgBoost, CatBoost and LightGBM models to minimise the objective function (i.e. RMSE in the present case). The default hyperparameters of XgBoost, LightGBM and CatBoost, and the hyperparameters optimised using PSO, are given in Tables 4–6, respectively. PSO optimisation enhances the performance of the models, as observed in Table 7. All the models developed in the study using ML approaches perform reasonably well for predicting discharge in the compound channel with converging and diverging flood plains. The performance of the XgBoost and CatBoost models is excellent and is further enhanced by the application of the PSO optimisation algorithm. The optimised hyperparameters capture the relation between the input parameters and discharge more accurately for the prediction of discharge in a compound channel having converging and diverging floodplains. The best-performing model based on the R2 value in the testing phase is PSO-XgBoost, while PSO-CatBoost is best in the training phase (highlighted in bold, Table 7). However, the error indices, such as RMSE and MAPE, are lower for the PSO-CatBoost model in the training stage, while in the testing stage they are lower for the PSO-XgBoost model. The PSO-CatBoost model performs better than the other ML models used in the study over all the data, so it is considered the better model for the prediction of discharge in a compound channel.

Comparison of present models with existing empirical equations

Percentage Qmc has been calculated using four empirical equations developed by Knight & Demetriou (1983), Khatua & Patra (2007), Devi et al. (2016) and Das et al. (2022). Knight & Demetriou (1983) developed an equation for calculating %Qmc given in Equation (18).
(18)
Khatua & Patra (2007) provided Equation (19) for the computation of %Qmc.
(19)
Devi et al. (2016) formulated Equation (20) through regression analysis to calculate %Qmc.
(20)
Das et al. (2022) performed a regression analysis and formulated Equation (21) for determining %Qmc.
(21)
The present models using ML approaches were developed for predicting Q/Qmcb. The Q/Qmcb values obtained by ML models were converted to % Qmc using Equation (22).
(22)

The philosophy behind using these four empirical equations is to assess the applicability of the existing methods for the estimation of discharge in the compound channel with converging and diverging flood plains. These empirical equations consider the width ratio and aspect ratio as the input parameters to compute %Qmc. Although these methods were derived for calculating the discharge in a prismatic compound channel section, their efficacy has not been tested for non-prismatic compound channels, so they are used here for comparison purposes. The present study highlights that the existing empirical equations are not accurate for estimating discharge in a non-prismatic compound channel over the diverse range of datasets used in this study. Hence, other models are required for accurate prediction of discharge in compound channels with converging and diverging flood plains.

Discharge predicted by the present models and the empirical models is depicted in Figure 11 using a scatter plot. The scatter plot shows that the ML models developed in the present study lie closer to the best-fit line than the empirical models. PSO-LightGBM underestimates the discharge in the range of 20 to 30%. The percentage discharge predicted by the PSO-CatBoost, PSO-XgBoost and CatBoost models is mostly close to the best-fit line. Khatua & Patra (2007), Devi et al. (2016) and Das et al. (2022) mostly overestimated the discharge for all ranges of discharge. Knight & Demetriou (1983) overestimated the discharge value up to 50% discharge while underestimating the discharge above 50%.
Figure 11

Scatter plot for %Qmcb for present models and existing models.

The violin plot reveals a much clearer distinction between the existing empirical models and the ML models developed in this study for predicting discharge in a compound channel with converging and diverging flood plains. The models developed using the ML approach in the present study have less error than the existing models for the determination of discharge, which is evident from a perusal of the violin plot in Figure 12. Besides that, a violin plot also provides additional information about the distribution of the error of different models. Discharge estimates from the present models have a distribution and median value similar to those of the observed data, in contrast to the empirical models. The median of the discharge predicted by all the ML models is close to, and lower than, the observed discharge, while the median of the discharge obtained from the existing empirical models is higher and far from the observed discharge. However, the medians of the discharge obtained by Khatua & Patra (2007) and Das et al. (2022) are closer than those of Knight & Demetriou (1983) and Devi et al. (2016). All the ML models developed in the study have a similar distribution and median value of discharge compared to the observed discharge, but visual inspection suggests that the discharge predicted by PSO-XgBoost is the most closely aligned with the observed discharge.
Figure 12

Violin plot illustrating present models and existing models.


The performance of all the models, i.e., the six present models (XgBoost, LightGBM, CatBoost, PSO-XgBoost, PSO-LightGBM and PSO-CatBoost), the four empirical methods of Knight & Demetriou (1983), Khatua & Patra (2007), Devi et al. (2016) and Das et al. (2022), and the M5, GEP and EPR models of Zahiri & Najafzadeh (2018), for different ranges of width ratio and relative flow depth is shown in Tables 8 and 9. This assessment clearly specifies the appropriateness of different models for different input ranges of width ratio and relative flow depth. All the present models developed in the study perform well for all ranges of width ratio (α) and relative flow depth (β). For width ratio (α) < 2, the PSO-XgBoost model has lower values of RMSE and MAPE, while for width ratio (α) > 2, the PSO-CatBoost model has lower values of RMSE and MAPE. Thus, out of all the models used in the study, PSO-CatBoost performs best for width ratio (α) > 2 while PSO-XgBoost performs best for width ratio (α) < 2. It is evident from Table 9 that the PSO-CatBoost model has the minimum values of RMSE and MAPE for all ranges of relative flow depth (β). The NSE and index of agreement (Id) values for both PSO-CatBoost and PSO-XgBoost are similar and close to 1 for all ranges of width ratio (α). The NSE is negative for all the empirical methods across all ranges of width ratio and relative flow depth, which shows that the discharge predicted by the empirical methods is unsatisfactory, as it implies that the observed mean is a better predictor than the model. The NSE values of all the ML models developed in the present study are close to 1 for all ranges of width ratio and relative flow depth, which suggests that the developed models have greater predictive capability.
Despite LightGBM exhibiting lower NSE values, which range from 0.97 to 0.65 across the various width ratios and relative flow depths, it still outperforms the empirical models. The index of agreement (Id) values of the ML models are close to one across the different ranges of width ratio and relative flow depth, which suggests a very good match. The performance of the empirical models is relatively acceptable, as indicated by the index of agreement (Id) values, particularly for width ratios less than 1.5; however, for the other ranges of width ratio and relative flow depth, the empirical models exhibit lower index of agreement values. The ML models developed in the present study have also outperformed the M5, GEP and EPR models developed in the study of Zahiri & Najafzadeh (2018) for the estimation of discharge in compound channels with converging and diverging flood plains. The range-wise error analysis for width ratio (α) and relative flow depth (β) given in Tables 8 and 9 shows that the M5, GEP and EPR models of Zahiri & Najafzadeh (2018) have good MAPE values compared to the other empirical models, but poor values of the other statistical performance parameters, such as RMSE, Id and NSE. The unsatisfactory performance of the empirical methods can be attributed to two primary factors: firstly, data collected from different researchers often lack consistency, which can introduce variability and bias; secondly, empirical equations may have limited validity, particularly when applied outside the range for which they were originally derived, leading to potential inaccuracies.

Table 8

Performance of models with different ranges of width ratio α

Models            α < 1.5    1.5 < α < 2.0    2.0 < α < 2.5    2.5 < α < 3.0    3.0 < α < 5.8
LightGBM 0.418 1.094 0.766 0.771 0.639 
0.096 0.161 0.125 0.115 0.12 
0.875 0.756 0.856 0.872 0.879 
0.964 0.952 0.972 0.975 0.975 
CatBoost 0.105 0.326 0.265 0.244 0.307 
0.034 0.072 0.098 0.081 0.085 
0.989 0.985 0.989 0.992 0.98 
0.997 0.996 0.997 0.998 0.995 
XGBoost 0.158 0.268 0.309 0.261 0.314 
0.05 0.094 0.118 0.082 0.087 
0.977 0.991 0.985 0.99 0.98 
0.994 0.998 0.996 0.998 0.995 
PSO-LightGBM 0.402 0.922 0.727 0.715 0.622 
0.077 0.148 0.132 0.112 0.126 
0.89 0.845 0.88 0.897 0.886 
0.967 0.968 0.975 0.979 0.977 
PSO-XGBoost 0.086 0.183 0.247 0.227 0.317 
0.023 0.061 0.089 0.07 0.09 
0.993 0.996 0.99 0.993 0.98 
0.998 0.999 0.998 0.998 0.995 
PSO-CatBoost 0.097 0.198 0.232 0.206 0.265 
0.033 0.068 0.086 0.068 0.073 
0.991 0.995 0.992 0.994 0.986 
0.998 0.999 0.998 0.999 0.996 
Knight & Demetriou (1983)  19.097 28.911 31.868 36.884 44.552 
0.425 1.102 1.149 1.321 1.362 
−0.717 −5.197 −12.039 −21.122 −223.82 
0.705 0.415 0.33 0.285 0.111 
Khatua & Patra (2007)  16.099 28.256 32.479 37.561 45.811 
0.368 1.093 1.17 1.342 1.398 
−0.883 −7.8 −17.063 −26.303 −238.26 
0.72 0.375 0.292 0.259 0.106 
Devi et al. (2016)  18.325 28.038 29.293 32.027 35.126 
0.375 1.032 1.025 1.127 1.048 
−0.399 −10.292 −18.921 −30.04 −39.531 
0.635 0.429 0.301 0.303 0.157 
Zahiri & Najafzadeh (2018) _ M5 73.341 98.747 41.817 43.938 39.424 
1.750 2.090 1.446 1.502 1.200 
−3.665 −0.473 −2.952 −4.068 −9.063 
0.293 0.350 0.201 0.314 0.322 
Zahiri & Najafzadeh (2018) _ EPR 237.160 82.611 68.609 161.837 43.644 
2.690 2.239 2.185 2.547 0.838 
−0.207 −0.697 −0.786 −0.113 −0.453 
0.292 0.352 0.297 0.215 0.438 
Zahiri & Najafzadeh (2018) _ GEP 113.290 118.298 27.256 29.513 37.604 
0.723 1.031 0.601 0.601 0.581 
0.032 0.002 −8.771 −1.385 −0.425 
0.122 0.091 0.279 0.424 0.449 
Das et al. (2022)  19.571 31.585 36.02 41.168 42.633 
0.459 1.218 1.291 1.455 1.233 
−1.696 −10.151 −21.8 −33.124 −4.628 
0.654 0.349 0.266 0.238 0.377 

Note: The four values presented in each cell represent RMSE, MAPE, NSE and Id.

Table 9

Performance of models with different ranges of relative flow depth β

Models            0.1 < β < 0.2    0.2 < β < 0.3    0.3 < β < 0.4    0.4 < β < 0.5
LightGBM 0.363 0.244 1.26 0.671 
0.078 0.069 0.156 0.177 
0.978 0.896 0.716 0.651 
0.994 0.972 0.953 0.878 
CatBoost 0.176 0.148 0.329 0.242 
0.038 0.048 0.092 0.067 
0.994 0.955 0.99 0.896 
0.999 0.989 0.998 0.975 
XGBoost 0.121 0.184 0.306 0.245 
0.05 0.067 0.105 0.064 
0.997 0.931 0.992 0.908 
0.999 0.983 0.998 0.976 
PSO-LightGBM 0.121 0.184 0.306 0.245 
0.05 0.067 0.105 0.064 
0.997 0.931 0.992 0.908 
0.999 0.983 0.998 0.976 
PSO-XGBoost 0.326 0.332 1.067 0.735 
0.075 0.067 0.138 0.156 
0.982 0.858 0.823 0.582 
0.995 0.956 0.968 0.839 
PSO-CatBoost 0.051 0.143 0.253 0.204 
0.021 0.039 0.087 0.046 
0.999 0.957 0.995 0.928 
0.999 0.99 0.999 0.982 
Knight & Demetriou (1983)  37.053 24.763 33.912 25.752 
1.326 0.438 1.469 0.596 
−14.987 −3.284 −6.536 −1.291 
0.256 0.518 0.366 0.57 
Khatua & Patra (2007)  34.859 24.076 34.283 27.248 
1.238 0.414 1.482 0.639 
−6.34 −2.216 −6.315 −1.485 
0.318 0.553 0.366 0.556 
Devi et al. (2016)  26.929 19.632 30.023 28.016 
0.911 0.334 1.286 0.695 
−1.268 −0.907 −4.854 −1.962 
0.476 0.616 0.434 0.55 
Zahiri & Najafzadeh (2018) _ M5 98.66 65.28 52.43 60.37 
2.0992 1.0051 2.1412 1.6206 
−1.1255 −0.8838 −3.3876 −1.9319 
0.3695 0.2638 0.2413 0.3416 
Zahiri & Najafzadeh (2018) _ EPR 116.02 169.77 100.21 367.58 
2.1755 1.3283 2.8965 4.5951 
−0.4381 −0.1074 −0.3221 −0.2001 
0.3889 0.2140 0.2822 0.2777 
Zahiri & Najafzadeh (2018) _ GEP 170.49 57.59 26.06 16.82 
1.2800 0.6429 0.7466 0.3977 
0.0002 −0.1973 −1.5436 −1.1672 
0.0903 0.1606 0.4136 0.4057 
Das et al. (2022)  36.732 24.32 37.815 30.75 
1.3 0.422 1.624 0.751 
−8.761 −1.829 −8.176 −2.171 
0.288 0.578 0.345 0.527 

Note: For each model, the four rows of values give RMSE, MAPE, NSE and Id, respectively.

The Taylor diagram is one of the most useful approaches for evaluating the performance of prediction models. It identifies the most reliable, and thus most accurate, model by its distance from the reference point representing the actual values (Taylor 2001). The position of a model is determined by three parameters: standard deviation (vertical and horizontal axes), correlation coefficient (radial lines) and RMSE (circular lines centred at the actual-value point). The model closest to the reference point is considered the most accurate. The proposed prediction models are compared using the Taylor diagram, as shown in Figure 13. The present models lie closer than the empirical models to the reference point (black star), indicating that they perform better than the existing empirical models. The empirical models have higher RMSE and lower correlation coefficients. The correlation coefficient of Das et al. (2022) is negative, which may be due to the diverse range of data selected in this study for converging and diverging flood plains, or to the limitation that the model was developed only for diverging channels. As reported previously in Table 7, the Taylor diagram also confirms that the present PSO-XGBoost, PSO-CatBoost and CatBoost models have higher R2 and lower RMSE and are the most accurate methods for predicting discharge in compound channels. The standard deviation of the PSO-CatBoost model is closest to the observed value, indicating that it reproduces the observed data more accurately than the other models. PSO-CatBoost is therefore considered the most reliable method for the prediction of discharge in a compound channel with converging and diverging flood plains. Shekhar et al. (2023) estimated discharge in a compound channel with converging and diverging flood plains using MARS and ANN-PSO, with R2 values of 0.92 and 0.83, respectively.
The ML models developed in the present study outperform the MARS and ANN-PSO models of Shekhar et al. (2023), while the LightGBM and PSO-LightGBM models developed here show comparable performance in terms of R2 with those two models.
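The geometry of the Taylor diagram rests on a law-of-cosines identity linking the three plotted statistics (standard deviation, correlation coefficient and centred RMS difference). The sketch below computes all three with NumPy and verifies the identity; the arrays are synthetic stand-ins, not the study's discharge records:

```python
import numpy as np

def taylor_stats(obs, pred):
    """Standard deviations, correlation and centred RMS difference
    that fix a model's position on a Taylor diagram."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sd_obs, sd_pred = obs.std(), pred.std()
    r = np.corrcoef(obs, pred)[0, 1]
    # centred RMS difference: anomalies about each series' own mean
    crmsd = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
    return sd_obs, sd_pred, r, crmsd

rng = np.random.default_rng(0)
obs = rng.normal(10.0, 2.0, 500)           # synthetic "observed" discharge
pred = obs + rng.normal(0.0, 0.8, 500)     # synthetic model prediction

sd_o, sd_p, r, crmsd = taylor_stats(obs, pred)
# law-of-cosines identity exploited by the diagram's geometry
assert np.isclose(crmsd**2, sd_o**2 + sd_p**2 - 2 * sd_o * sd_p * r)
```

A model whose standard deviation matches the observed one and whose correlation is high lands near the reference point, which is exactly the behaviour described for PSO-CatBoost above.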
Figure 13

Comparison of different models by the Taylor diagram.


Performance of all models using the SI, BIAS and DI

The SI, BIAS and DI values for the various ML models and empirical equations used to predict discharge in a compound channel are provided in Table 10 and offer insight into their performance. Among the ML models, PSO-XGBoost stands out with the lowest SI of 0.228, indicating precise predictions. Although it exhibits a negative BIAS of −1.02, suggesting a tendency towards underestimation, its high DI of 0.976 signifies strong alignment between true and predicted values. The other ML models, including LightGBM, PSO-LightGBM, XGBoost, CatBoost and PSO-CatBoost, also perform favourably, with low scatter indices and discrimination indices close to 1. Conversely, the empirical equations of Knight & Demetriou (1983), Khatua & Patra (2007), Devi et al. (2017) and Das et al. (2022) display higher scatter indices, indicating lower accuracy in predicting discharge in a compound channel. Das et al. (2022) in particular exhibits a negative DI, suggesting a weaker correlation between true and predicted values. These observations underscore the superior performance of ML models over empirical equations in accurately predicting discharge. Despite variations in scatter indices and BIAS values, PSO-CatBoost emerges as an effective model with a low SI (0.229), minimal BIAS (−0.755) and a DI of 0.974, highlighting its robustness and reliability for discharge prediction in the compound channel. Based on these results, PSO-CatBoost is therefore suggested as the most suitable model for discharge prediction.
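For readers implementing these metrics, a minimal NumPy sketch of SI and BIAS under their common definitions is given below (RMSE normalised by the observed mean, and mean signed error). These are assumed, conventional formulations — the paper's exact definitions, and the DI in particular, may differ:

```python
import numpy as np

def scatter_index(obs, pred):
    """SI: RMSE normalised by the mean observed value
    (common definition; the paper may use a variant)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    return float(rmse / obs.mean())

def bias(obs, pred):
    """BIAS: mean signed error; negative => systematic under-prediction."""
    return float(np.mean(np.asarray(pred, float) - np.asarray(obs, float)))

# tiny illustrative arrays (not the study's data)
obs = np.array([10.0, 12.0, 14.0, 16.0])
pred = np.array([9.0, 12.5, 13.0, 15.5])
si, b = scatter_index(obs, pred), bias(obs, pred)
```

With these definitions, a negative BIAS such as the −1.02 reported for PSO-XGBoost indicates that predictions fall, on average, below the observed discharge.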

Table 10

Performance matrix of models using SI, BIAS and DI

Model                        Scatter index (SI)   BIAS     Discrimination index (DI)
LightGBM                     0.374                −1.4     0.932
PSO-LightGBM                 0.324                −1.24    0.949
XGBoost                      0.303                −0.962   0.954
PSO-XGBoost                  0.228                −1.02    0.976
CatBoost                     0.243                −0.926   0.972
PSO-CatBoost                 0.229                −0.755   0.974
Knight & Demetriou (1983)    1.77                 24.25    0.128
Khatua & Patra (2007)        1.75                 23.47    0.103
Devi et al. (2017)           1.88                 26.65    0.083
Das et al. (2022)            1.63                 18.41    −0.002

Performance of uncertainty, reliability and resilience analysis

Performance metrics from the uncertainty, reliability and resilience analyses, which shed light on the predictive capability and robustness of the various models, are summarised in Table 11. LightGBM exhibits a wider confidence interval, indicating higher uncertainty, while CatBoost stands out with the highest reliability index among the non-optimised models. PSO enhances model performance by reducing uncertainty and improving reliability, as seen in PSO-LightGBM, PSO-CatBoost and PSO-XGBoost. In particular, PSO-CatBoost displays the most consistent and robust performance, with the lowest mean resilience index and the lowest standard deviation of the resilience index. This analysis suggests that PSO-CatBoost provides the most consistent, reliable and resilient model for the prediction of discharge in a compound channel with converging and diverging flood plains.
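As an illustration of how such an analysis can be set up, the sketch below computes an uncertainty band (width of the central 95% interval of prediction errors) and a hit-rate style reliability index for two synthetic models. These are assumed, illustrative definitions chosen for the sketch, not necessarily the exact formulations used in the study:

```python
import numpy as np

def uncertainty_band(obs, pred, level=0.95):
    """Width of the central `level` interval of prediction errors
    (illustrative definition of the confidence interval)."""
    err = np.asarray(pred, float) - np.asarray(obs, float)
    lo, hi = np.quantile(err, [(1 - level) / 2, 1 - (1 - level) / 2])
    return float(hi - lo)

def reliability_index(obs, pred, tol=0.2):
    """Fraction of predictions within ±tol of the observed value
    (a common hit-rate style definition, assumed here)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(pred - obs) <= tol * np.abs(obs)))

rng = np.random.default_rng(1)
obs = rng.uniform(5, 50, 1000)
tight = obs * (1 + rng.normal(0, 0.05, 1000))   # low-uncertainty model
loose = obs * (1 + rng.normal(0, 0.30, 1000))   # high-uncertainty model
```

Under these definitions, the lower-noise model yields a narrower band and a higher reliability index, mirroring the pattern Table 11 shows for the PSO-optimised models.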

Table 11

Performance evaluation using uncertainty, reliability and resilience analysis

Model                        Confidence interval   Mean resilience index   Std. dev. resilience index   Reliability index
LightGBM                     0.80                  1.04                    0.17                         0.63
PSO-LightGBM                 0.70                  1.03                    0.17                         0.58
PSO-CatBoost                 0.49                  1.01                    0.08                         0.53
CatBoost                     0.52                  1.02                    0.09                         0.59
XGBoost                      0.65                  1.02                    0.11                         0.54
PSO-XGBoost                  0.49                  1.02                    0.09                         0.57
Knight & Demetriou (1983)    3.80                  0.70                    0.31                         0.14
Khatua & Patra (2007)        3.76                  0.71                    0.31                         0.14
Devi et al. (2016)           4.04                  0.67                    0.29                         0.12
Das et al. (2022)            3.50                  0.79                    0.41                         0.22

Feature importance analysis using SHAP and sensitivity analysis

SHAP (SHapley Additive exPlanations) is a feature contribution method based on a game-theoretic approach. SHAP provides consistent interpretability, which helps to overcome the black-box character of many ML methods. Shapley values quantify how much each input feature contributes to a model's output. In this study, the SHAP-based feature contribution showed the importance of each feature and its corresponding contribution to the model prediction. SHAP analysis was performed for the two best models (PSO-CatBoost and PSO-XGBoost), and the results were found to be similar. Figure 14 shows the contribution of each input feature to the predicted discharge of a compound channel with converging and diverging flood plains. The rank-wise contribution of each input feature is illustrated in Figure 14, with the highest contributing feature being the bed slope of the channel (So) and the lowest being Xr. The bed slope of the channel (So) and the flow aspect ratio of the main channel (δ*) contribute the most to estimating discharge in a compound channel with converging and diverging flood plains. The red colour represents a high feature value, while the blue colour represents a low one. For example, δ* mostly takes high values with negative SHAP values, indicating that a higher δ* negatively affects the discharge. Higher values of θ, β and Xr, and lower values of So, δ*, Rr, α, Ar and Ff, affect the output discharge both positively and negatively. So and δ* have wide distributions, which suggests that these input features play a crucial role in differentiating predictions and can have a substantial impact on the discharge. Rr, α, Ar, Ff, θ, β and Xr have narrow distributions, indicating that their impact on predictions is relatively consistent and may not be as critical in explaining variations in the output discharge.
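The game-theoretic attribution behind SHAP can be made concrete by computing Shapley values exactly for a tiny model, enumerating all feature coalitions (feasible only for a handful of features; the `shap` library approximates this efficiently for real gradient-boosting models). The linear model, weights and background data below are hypothetical:

```python
import itertools
import math
import numpy as np

def exact_shap(model, x, background):
    """Exact Shapley values by coalition enumeration. Features absent
    from a coalition are replaced by their background means
    (the 'interventional' convention)."""
    n = len(x)
    base = background.mean(axis=0)
    def value(S):
        z = base.copy()
        z[list(S)] = x[list(S)]
        return model(z)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in itertools.combinations(others, k):
                # Shapley kernel weight |S|!(n-|S|-1)!/n!
                w = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

w = np.array([2.0, -1.0, 0.5])            # hypothetical feature weights
model = lambda z: float(w @ z)
rng = np.random.default_rng(2)
bg = rng.normal(0, 1, (200, 3))           # hypothetical background sample
x = np.array([1.0, 2.0, -1.0])            # instance to explain
phi = exact_shap(model, x, bg)
# for a linear model, Shapley values reduce to w_i * (x_i - mean(x_i))
assert np.allclose(phi, w * (x - bg.mean(axis=0)))
```

The additivity check at the end (attributions sum to the gap between the prediction and the baseline prediction) is the property that makes SHAP summary plots like Figure 14 interpretable feature by feature.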
Figure 14

Effect of all input features on output illustrated by SHAP summary plot.

The PSO-CatBoost model was selected to assess the impact of individual input parameters on the model predictions. The sensitivity scores, representing the mean absolute change in predictions when each feature is set to a constant (its mean), provide valuable insight into the relative importance of these features, as shown in Figure 15. Key observations from the analysis include the substantial impact of the channel bed slope So and of δ*, with sensitivity scores of 0.326 and 0.218, respectively. Additionally, Rr exhibits a significant influence, with a sensitivity score of 0.039. In contrast, α and Xr have minimal impact, as indicated by their low sensitivity scores of 0.007 and 0.005, respectively. These findings highlight the varying degrees of influence that different features exert on the model prediction.
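The mean-substitution scheme just described is straightforward to sketch. The `predict` function below is a hypothetical stand-in for the trained PSO-CatBoost model, chosen only so the resulting importance ordering is easy to verify:

```python
import numpy as np

def sensitivity_scores(predict, X):
    """Mean absolute change in predictions when each feature is frozen
    at its column mean -- the scheme described in the text."""
    base = predict(X)
    scores = {}
    for j in range(X.shape[1]):
        Xj = X.copy()
        Xj[:, j] = X[:, j].mean()          # freeze feature j
        scores[j] = float(np.mean(np.abs(predict(Xj) - base)))
    return scores

# hypothetical model: feature 0 (think So) matters far more than feature 2
predict = lambda X: 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * X[:, 2]
rng = np.random.default_rng(3)
X = rng.normal(0, 1, (500, 3))             # hypothetical input sample
s = sensitivity_scores(predict, X)
```

For this linear stand-in the scores recover the coefficient ordering, which is the same logic by which Figure 15 ranks So and δ* above the remaining parameters.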
Figure 15

Result of sensitivity analysis highlighting sensitivity score of different features.


The comparison between the SHAP and sensitivity analyses reveals consistent and meaningful insights into the factors influencing discharge estimation in a compound channel with converging and diverging floodplains. The bed slope (So) and flow aspect ratio (δ*) emerge as pivotal contributors in both analyses, with So positively impacting discharge and δ* exhibiting a negative influence. The wide distributions of So and δ* in the SHAP analysis underscore their critical roles in differentiating predictions and suggest a substantial impact on discharge. The sensitivity analysis reinforces these findings by assigning So and δ* high sensitivity scores. While the other features (Rr, α, Ar, Ff, θ, β and Xr) exhibit narrower distributions in the SHAP analysis, indicating consistent impacts on predictions, the sensitivity analysis acknowledges their relevance by assigning them lower sensitivity scores. Overall, the agreement between the SHAP and sensitivity analyses strengthens confidence in the identified influential parameters, providing a comprehensive understanding of the dynamics affecting discharge estimation in compound channels.

In the present study, six ML models, namely XGBoost, LightGBM, CatBoost, PSO-XGBoost, PSO-LightGBM and PSO-CatBoost, were developed for the prediction of discharge in a compound channel with converging and diverging flood plains. To validate the results obtained from the ML models, they were compared, using different statistical performance parameters, with four existing empirical methods, i.e., Knight & Demetriou (1983), Khatua & Patra (2007), Devi et al. (2016) and Das et al. (2022). The following conclusions were drawn from the study.

  • Of the six ML methods used in the present study, the PSO-CatBoost model is found to be the best for the prediction of discharge in compound channels.

  • SHAP and sensitivity analysis show that, among the nine parameters considered, the most influential factor for predicting discharge is So, while Xr has the least impact on the prediction of discharge in a compound channel with converging and diverging flood plains.

  • The study highlights the importance of choosing the right model based on the width ratio (α). PSO-CatBoost is best for α > 2 (wider features), and PSO-XgBoost is superior for α < 2 (narrower features), improving predictive accuracy in various applications.

  • PSO enhances model performance by reducing uncertainty and improving reliability. PSO-CatBoost is the most efficient and robust model based on the uncertainty, reliability and resilience analyses.

  • The study emphasises the superior performance of ML models, particularly PSO-CatBoost, which exhibits a low SI (0.229), minimal BIAS (−0.755) and a DI of 0.974, making it more suitable for discharge prediction in a compound channel with converging and diverging flood plains.

  • Empirical methods suffer from unsatisfactory performance primarily due to the heterogeneity of datasets collected from different researchers, leading to variability and biases. Additionally, limitations in the validity of empirical equations within specific ranges hinder accurate predictions and generalisability.

The main limitation of the present study is the diverse range of datasets employed in modelling discharge in a compound channel with converging and diverging floodplains. The models will provide better results if the input parameter values lie within the ranges given in Table 1. Further research is needed to improve model accuracy by incorporating more datasets on converging and diverging compound channels.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abril, J. B. & Knight, D. W. 2004 Stage-discharge prediction for rivers in flood applying a depth-averaged model. Journal of Hydraulic Research 42(6), 616–629.
Al-Khatib, I. A., Hassan, H. A. & Abaza, K. A. 2013 Application and validation of regression analysis in the prediction of discharge in asymmetric compound channels. Journal of Irrigation and Drainage Engineering 139(7), 542–550.
Azamathulla, H. M. & Zahiri, A. 2012 Flow discharge prediction in compound channels using linear genetic programming. Journal of Hydrology 454, 203–207.
Bousmar, D. 2002 Flow modelling in compound channels. Unité de Génie Civil et Environnemental.
Bousmar, D. & Zech, Y. 1999 Momentum transfer for practical flow computation in compound channels. Journal of Hydraulic Engineering 125(7), 696–706.
Bousmar, D. & Zech, Y. 2004 Velocity distribution in non-prismatic compound channels. In: Proceedings of the Institution of Civil Engineers – Water Management, Vol. 157, No. 2. Thomas Telford Ltd, pp. 99–108.
Bousmar, D., Proust, S. & Zech, Y. 2006 Experiments on the flow in an enlarging compound channel. In: River Flow 2006: Proceedings of the International Conference on Fluvial Hydraulics, 6–8 September 2006, Lisbon, Portugal. Taylor and Francis, Leiden, the Netherlands, pp. 323–332.
Chen, T. & Guestrin, C. 2016 XGBoost: A scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794.
Chlebek, J. 2009 Modelling of simple prismatic channels with varying roughness using the SKM and a study of flows in smooth non-prismatic channels with skewed floodplains. Doctoral dissertation, University of Birmingham.
Das, B. S. 2018 Non-uniform flow modelling in compound channels with non-prismatic floodplains. Doctoral dissertation.
Das, B. S. & Khatua, K. K. 2018 Flow resistance in a compound channel with diverging and converging floodplains. Journal of Hydraulic Engineering 144(8), 04018051.
Das, B. S., Devi, K., Proust, S. & Khatua, K. K. 2018 Flow distribution in diverging compound channels using improved independent subsection method. River Flow 2018: 9th International Conference on Fluvial Hydraulics 40, 05068.
Das, B. S. & Khatua, K. K. 2019 Water surface profile computation for compound channel having diverging floodplains. ISH Journal of Hydraulic Engineering 25(3), 336–349.
Das, B. S., Devi, K., Khuntia, J. R. & Khatua, K. K. 2020 Discharge estimation in converging and diverging compound open channels by using adaptive neuro-fuzzy inference system. Canadian Journal of Civil Engineering 47(12), 1327–1344.
Das, B. S., Devi, K., Khuntia, J. R. & Khatua, K. K. 2022 Flow distributions in a compound channel with diverging floodplains. River Hydraulics: Hydraulics, Water Resources and Coastal Engineering 2, 113–125.
Devi, K., Khatua, K. K. & Khuntia, J. R. 2016 Prediction of mixing layer in symmetric and asymmetric compound channels. In: River Flow, pp. 39–47.
Devi, K., Khatua, K. K., Das, B. S. & Khuntia, J. R. 2017 Evaluation of interacting length in prediction of over bank flow. ISH Journal of Hydraulic Engineering 23(2), 187–194.
Devi, K., Das, B. S., Khuntia, J. R. & Khatua, K. K. 2021 Analytical solution for depth-averaged velocity and boundary shear in a compound channel. Water Management 174(3), 143–158.
Dorogush, A. V., Ershov, V. & Gulin, A. 2018 CatBoost: Gradient boosting with categorical features support. arXiv preprint arXiv:1810.11363.
Eini, N., Bateni, S. M., Jun, C., Heggy, E. & Band, S. S. 2023 Estimation and interpretation of equilibrium scour depth around circular bridge piers by using optimized XGBoost and SHAP. Engineering Applications of Computational Fluid Mechanics 17(1), 2244558.
Fernandes, J. N., Leal, J. B. & Cardoso, A. H. 2015 Assessment of stage–discharge predictors for compound open-channels. Flow Measurement and Instrumentation 45, 62–67.
Friedman, J. H. 2001 Greedy function approximation: A gradient boosting machine. Annals of Statistics 29(5), 1189–1232.
James, M. & Brown, B. J. 1977 Geometric parameters that influence floodplain flow. Department of Defense, Department of the Army, Corps of Engineers, Waterways Experiment Station, Hydraulics Laboratory.
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q. & Liu, T. Y. 2017 LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems 30.
Kennedy, J. & Eberhart, R. 1995 Particle swarm optimization. In: Proceedings of ICNN'95 – International Conference on Neural Networks. IEEE, pp. 1942–1948.
Khatua, K. K. & Patra, K. C. 2007 Boundary shear stress distribution in compound open channel flow. ISH Journal of Hydraulic Engineering 13(3), 39–54.
Khatua, K. K., Patra, K. C. & Mohanty, P. K. 2012 Stage-discharge prediction for straight and smooth compound channels with wide floodplains. Journal of Hydraulic Engineering 138(1), 93–99.
Khuntia, J. R., Devi, K. & Khatua, K. K. 2018 Boundary shear stress distribution in straight compound channel flow using artificial neural network. Journal of Hydrologic Engineering 23(5), 04018014.
Kilinc, I., Cigizoglu, H. K. & Zuran, A. 2000 A comparison of three methods for the prediction of future streamflow data. Technical and Documental Research of 14th Regional Directorate, State Hydraulic Works (DSI), Istanbul, Turkey.
Knight, D. W. & Demetriou, J. D. 1983 Floodplain and main channel flow interaction. Journal of Hydraulic Engineering 109(8), 1073–1092.
Knight, D. W., Shiono, K. & Pirt, J. 1989 Prediction of depth mean velocity and discharge in natural rivers with overbank flow. In: Proceedings of the International Conference on Hydraulic and Environmental Modelling of Coastal, Estuarine and River Waters. Gower Publishing, pp. 419–428.
Lambert, M. F. & Myers, W. R. 1998 Estimating the discharge capacity in straight compound channels. Proceedings of the Institution of Civil Engineers – Water, Maritime and Energy 130(2), 84–94.
Liao, H. & Knight, D. W. 2007 Analytic stage-discharge formulas for flow in straight prismatic channels. Journal of Hydraulic Engineering 133(10), 1111–1122.
Lundberg, S. M. & Lee, S. I. 2017 A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems 30.
Maghrebi, M. F. 2006 Application of the single point measurement in discharge estimation. Advances in Water Resources 29(10), 1504–1514.
Mehrabani, F., Mohammadi, M., Ayyoubzadeh, S. A., Fernandes, J. N. & Ferreira, R. M. 2020 Turbulent flow structure in a vegetated non-prismatic compound channel. River Research and Applications 36(9), 1868–1878.
Myers, W. R. 1987 Velocity and discharge in compound channels. Journal of Hydraulic Engineering 113(6), 753–766.
Naik, B. & Khatua, K. K. 2016 Boundary shear stress distribution for a converging compound channel. ISH Journal of Hydraulic Engineering 22(2), 212–219.
Najafzadeh, M. & Anvari, S. 2023 Long-lead streamflow forecasting using computational intelligence methods while considering uncertainty issue. Environmental Science and Pollution Research 30(35), 84474–84490.
Najafzadeh, M. & Oliveto, G. 2020 Riprap incipient motion for overtopping flows with machine learning models. Journal of Hydroinformatics 22(4), 749–767.
Najafzadeh, M., Rezaie Balf, M. & Rashedi, E. 2016 Prediction of maximum scour depth around piers with debris accumulation using EPR, MT, and GEP models. Journal of Hydroinformatics 18(5), 867–884.
Pandey, H. K., Singh, V. K., Srivastava, S. K. & Singh, R. P. 2023 Groundwater quality assessment using PCA and water quality index (WQI) in a drought-prone area. Sustainable Water Resources Management 9(6), 197.
Patra, K. C. & Khatua, K. K. 2006 Selection of interface plane in the assessment of discharge in two stage meandering and straight compound channels. In: Proceedings of the International Conference on Fluvial Hydraulics (IAHR), River Flow 2006, Lisbon, pp. 379–387.
Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V. & Gulin, A. 2018 CatBoost: Unbiased boosting with categorical features. Advances in Neural Information Processing Systems 31.
Proust, S. 2005 Ecoulements non-uniformes en lits composés: effets de variations de largeur du lit majeur [Non-uniform flows in composite beds: effects of variations in the width of the major bed]. Doctoral dissertation, INSA de Lyon.
Proust, S., Bousmar, D., Riviere, N., Paquier, A. & Zech, Y. 2010 Energy losses in compound open channels. Advances in Water Resources 33(1), 1–16.
Rezaei, B. 2006 Overbank flow in compound channels with prismatic and non-prismatic floodplains. Doctoral dissertation, University of Birmingham.
Rezaei, B. & Knight, D. W. 2011 Overbank flow in compound channels with nonprismatic floodplains. Journal of Hydraulic Engineering 137(8), 815–824.
Shekhar, D., Das, B. S., Devi, K., Khuntia, J. R. & Karmaker, T. 2023 Discharge estimation in a compound channel with converging and diverging floodplains using ANN–PSO and MARS. Journal of Hydroinformatics 25(6), 2479–2499.
Shrestha, R. R. 2005 Simulation of flood flow in a river system using artificial neural network. Hydrology and Earth System Sciences 6(4), 671–684.
Taylor, K. E. 2001 Summarizing multiple aspects of model performance in a single diagram. Journal of Geophysical Research: Atmospheres 106(D7), 7183–7192.
Wormleaton, P. R., Allen, J. & Hadjipanos, P. 1982 Discharge assessment in compound channel flow. Journal of the Hydraulics Division 108(9), 975–994.
Wu, K., Chai, Y., Zhang, X. & Zhao, X. 2022 Research on power price forecasting based on PSO-XGBoost. Electronics 11(22), 3763.
Yang, K., Liu, X., Cao, S. & Huang, E. 2014 Stage-discharge prediction in compound channels. Journal of Hydraulic Engineering 140(4), 06014001.
Yonesi, H. A., Omid, M. H. & Ayyoubzadeh, S. A. 2013 The hydraulics of flow in non-prismatic compound channels. Journal of Civil Engineering and Urbanism 3(6), 342–356.
Yonesi, H. A., Parsaie, A., Arshia, A. & Shamsi, Z. 2022 Discharge modeling in compound channels with non-prismatic floodplains using GMDH and MARS models. Water Supply 22(4), 4400–4421.
Yu, J., Zheng, W., Xu, L., Zhangzhong, L., Zhang, G. & Shan, F. 2020 A PSO-XGBoost model for estimating daily reference evapotranspiration in the solar greenhouse. Intelligent Automation & Soft Computing 26(5), 989–1003.
Zahiri, A. & Dehghani, A. A. 2009 Flow discharge determination in straight compound channels using ANNs. International Journal of Computer and Information Engineering 3(10), 2331–2334.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).