## ABSTRACT

Urban flooding has made it necessary to gain a better understanding of how well gully pots perform when overwhelmed by solids deposition driven by climatic and anthropogenic variables. This study investigates solids deposition in gully pots through a review of eight models, comprising four deterministic models, two hybrid models, a statistical model, and a conceptual model, representing a wide spectrum of solids depositional processes. Traditional models capture the impact of climatic and anthropogenic variables on solids deposition, but they are prone to uncertainty owing to inadequate handling of complex and non-linear variables, restricted applicability, inflexibility and data bias. Hybrid models, which integrate traditional models with data-driven approaches, have been shown to improve predictions and support the development of models that are more robust to uncertainty. Despite their effectiveness, hybrid models lack explainability. Hence, this study presents the significance of eXplainable Artificial Intelligence (XAI) tools in addressing the challenges associated with hybrid models. Finally, crossovers between the various models are identified and a representative workflow for approaching solids deposition modelling in gully pots is suggested. The paper concludes that explainable hybrid modelling can serve as a valuable tool for gully pot management, as it addresses key limitations present in existing models.

## HIGHLIGHTS

Existing models are presented and discussed.

Integrating data-driven and traditional models enhances performance.

Explainability could pose a challenge to adopting hybrids.

A review of crossovers between different models is conducted to explore their limitations and propose potential improvements.

A workflow is developed to address the challenges associated with the implementation of explainable hybrids for prediction.

## INTRODUCTION

The kerb and gully drainage systems (Figure 1) are one of the most common forms of road drainage used in the United Kingdom. Gullies are provided in proportion to the area of road surface drained (Department for Transport 2020); in practice, 100 gullies may be expected to yield 7 m^{3} of debris (Butler *et al.* 2018). The gully grate provides cover to a buried component known as the gully pot (Figure 2).

Surface runoff and underground drainage networks are connected by these gully pots. They are designed to minimise solids deposition in drainage systems, which contributes to blockages in the drainage network, reduced sewer system efficiency, urban flooding and increased pollution of receiving water bodies (Forty 1998; British Standards Institution 2021). Trapped gully pots have a solids collector, sometimes known as a solids trap, which captures sediments that could otherwise escape through the grate. The deposition of solids in road gullies is influenced by climatic- and anthropogenic-driven processes, which can be categorised into three phases: Solids Build-Up (SB), Solids Wash-Off (SW), and Solids Retention (SR) (Rietveld *et al.* 2020b). SB processes are mostly time-dependent and include variables in the contributing area such as traffic intensity, road surface type, solids particle size, and street sweeping frequency. SW processes are mostly climate-dependent and include rainfall amount and intensity, surface runoff, wind action and temperature. SR processes directly impact the accumulation of solids in gully pots and include flow rate, gully cross-section, depth of the solids trap, solids type and gully filling degree or position of outlet pipes. It is important to assess techniques for predicting the interaction between these processes to better understand their potential effects on the deposition of solids in gully pots (Rietveld *et al.* 2020b).

To find an optimised approach to the deposition of solids in gully pots, conceptual, deterministic, statistical, and hybrid models need to be further explored to determine the differences and overlap (Obropta & Kardos 2007). For example, a deterministic model may include some stochastic elements to account for uncertainty or variability in the system. Similarly, a statistical model may use deterministic equations to model the relationship between variables. Since the boundaries between models are not always clear, it is important to understand the strengths and limitations of these models. By understanding the crossovers between them, they can be integrated to produce explainable hybrids that offer improved prediction accuracy. Following the above discussion, the objectives of this work are as follows:

Highlight the strengths and limitations of existing models that predict solids deposition processes in gully pots.

Explore the fusion of data-driven and traditional models to develop hybrid models that enhance the overall performance.

Consider the challenges involved in deploying hybrid models for solid deposition prediction.

Develop a clear workflow for explainable hybrid modelling.

The remaining sections of this study are organised as follows: Section 2 describes solids deposition procedures in gully pots. Section 3 presents a review of existing models for the prediction of solids deposition. The limitations in utilising existing models are discussed in Section 4 along with how explainable hybrid models can be used to enhance them. Finally, Section 5 explores ways to ensure that the hybrid models used for prediction are explainable to all stakeholders.

## DEPOSITION OF SOLIDS IN GULLY POTS – DESCRIPTION OF PROCESSES

The amount of deposited solids in gully pots is influenced by climatic and anthropogenic processes. These processes are divided into three major phases as described below.

SB processes within a defined catchment area give rise to accumulation of debris, sediments and other particles that are eventually transported by wash-off processes into gully pots. Some of these time-dependent processes include characteristics of the contributing area such as leaf fall (Nix 2002), road slope (Muthusamy *et al.* 2018), traffic intensity (Chow *et al.* 2015), road surface roughness (Zhao *et al.* 2018), particle size distribution, mass and density (Xiao *et al.* 2022), and street sweeping frequency (Egodawatta *et al.* 2013).

SW processes are those by which solids are transported into gully pots. They are mostly climate-dependent and include contributing area, rainfall characteristics (Sartor *et al.* 1974; Egodawatta *et al.* 2007), surface runoff (Zhao *et al.* 2018), wind action (Butler & Karunaratne 1995), daily sunshine hours and solar radiation (Nix 2002), temperature (Post *et al.* 2016), antecedent dry weather period (ADWP) (Post *et al.* 2016; Rietveld *et al.* 2020b), solids geometry, which include initial sediment load and particle size distribution, mass and density (Grottker & Hurlebush 1987; Butler & Karunaratne 1995) and street sweeping frequency.

SR processes that directly impact accumulation of solids in gully pots are the result of their design geometry (Post *et al.* 2016) and flow rate (Deletic *et al.* 1997). These processes have the potential to reduce a gully's hydraulic capacity at any given time and can impact on a gully's trapping efficiency. SR processes include contributing area, rainfall characteristics, gully cross-sectional area and depth of solid trap or gully pot (Post *et al.* 2016), gully grate design pattern (Rietveld *et al.* 2020a), solids geometry and type, and gully filling degree/position of outlet pipes (Post *et al.* 2016; Rietveld *et al.* 2020b).

SB over a catchment is washed off by rainfall and other sediment-transport processes and transferred through gully pots into sewers, although, owing to their trapping efficiency, gully pots can capture and retain these pollutants. This implies that SB, SW, and SR processes are interrelated by a range of overlapping variables: contributing area is an example of a variable relevant to all processes, while specific variables may be important to individual processes.

## REVIEW OF THE EXISTING MODELLING TECHNIQUES FOR THE PREDICTION OF SOLIDS DEPOSITION

To understand and manage the impact of climatic and anthropogenic changes on SB, SW, and SR processes, several models have been developed. However, certain models lack the resilience required to handle the uncertainty that arises from limitations, such as scope and applicability (Litwin & Donigian 1978; Driver & Troutman 1989), the use of complex and non-linear variables (Bertrand-Krajewski *et al.* 1993), inflexibility due to reliance on fixed constants and processes (Grottker & Hurlebush 1987), bias from the use of limited and erroneous variables (Sartor *et al.* 1974; Egodawatta *et al.* 2007), sensitivity to outliers, precision and data mismatch (Rietveld *et al.* 2020b).

Although some models are developed to recognise patterns in non-linear and complex problems and handle uncertainty, they may not be explainable (Lundberg & Lee 2017; Geng *et al.* 2022), thereby leading to the risk of misapplication (Almutairi *et al.* 2021). Models such as bootstrap aggregating and Adaptive Boosting (AdB) (Behrouz *et al.* 2022), generally referred to as ensemble learning, can explicitly integrate deterministic, statistical and stochastic models and potentially exploit the advantages of each approach to reduce prediction error and uncertainty. Other useful hybrid models in this regard include Artificial Neural Networks (ANNs), Random Forests (RF) (Breiman 2001), Gradient Boosting Machines (GBM) (Friedman 2001), and Monte Carlo simulations. Nevertheless, these hybrid models are not explainable and may require complex and resource-intensive computations (Clark 2005; Gelman & Hill 2006; Post *et al.* 2015; Lee *et al.* 2021). XAI involves developing models whose behaviour humans can explain; it is important for ensuring trust, accountability and transparency in the model.

This study reviewed eight models applied in literature for the study of solids deposition in gully pots and beyond. The reviewed models are first summarised in Table 1. These models are subsequently discussed in Sections 3.1–3.3.

| s/n | Model | Type | Deposition phase |
|---|---|---|---|
| 1 | Stormwater Management Model (SWMM) (Sartor *et al.* 1974; Alley & Smith 1981) | Deterministic | SB |
| 2 | Butler and Karunaratne's gully pot trapping efficiency model (Butler & Karunaratne 1995) | Deterministic | SR |
| 3 | Grottker's solids retention model (Grottker & Hurlebush 1987; Grottker 1990; Butler & Karunaratne 1995) | Deterministic | SR |
| 4 | Modified Sartor and Boyd's model (Sartor *et al.* 1974; Egodawatta *et al.* 2007) | Deterministic and statistical | SW |
| 5 | The Non-Point Source model (NPS) (Litwin & Donigian 1978) | Conceptual | SW |
| 6 | Driver and Troutman's model (Driver & Troutman 1989) | Statistical | SW |
| 7 | Debris flow volume model (Lee *et al.* 2021) | Hybrid | SW |
| 8 | Gully pot sediment accumulation model (Post *et al.* 2015, 2016) | Hybrid | SR |


### Deterministic modelling techniques

Deterministic models are based on mathematical equations. Therefore, they may not be applicable to all types of processes, for example the complex processes involved in solids deposition (Bertrand-Krajewski *et al.* 1993). These models struggle with complex variables, since they are limited in scope and applicability, inflexible, constrained by data limitations and bias, and sensitive to outliers. These limitations further aggravate uncertainty, and the models are often calibrated by trial and error (Alley & Smith 1981), with little understanding of their sensitivity to the variables driving solids deposition. Deletic *et al.* (1997) and Rietveld *et al.* (2020b) further suggested that deterministic models may not adequately handle missing data or measurement errors, leading to inaccurate predictions or inferences. Furthermore, Bertrand-Krajewski *et al.* (1993) stated that the precision of a deterministic model relies on how well the calculated values agree with the observed values. This agreement can be measured through objective functions such as the mean square error (MSE) and the least square method (LSM) (Egodawatta *et al.* 2007). Nonetheless, this precision is affected by discrepancies such as measurement errors in sampling and by the assumption that the deterministic model is only a rough approximation of the complex physical processes.
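As an illustration of such an objective function, the MSE between observed and model-calculated values can be computed directly; the sediment-load values below are hypothetical.

```python
import numpy as np

def mean_square_error(observed, calculated):
    """Objective function: mean of squared deviations between
    observed and model-calculated values."""
    observed = np.asarray(observed, dtype=float)
    calculated = np.asarray(calculated, dtype=float)
    return float(np.mean((observed - calculated) ** 2))

# Hypothetical observed vs. modelled sediment loads (kg)
observed = [1.2, 0.8, 1.5, 1.1]
modelled = [1.0, 0.9, 1.4, 1.3]
print(round(mean_square_error(observed, modelled), 6))  # -> 0.025
```

A lower MSE indicates closer agreement between calculated and observed values, which is the sense of "precision" discussed above.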

The Stormwater Management Model (SWMM) simulates SB through a first-order build-up relationship, given in Equation (1) (Sartor *et al.* 1974; Alley & Smith 1981). Software applications such as InfoSWMM (Environmental Systems Research Institute n.d.) and SWMM5 (United States Environmental Protection Agency 2023) have deployed the operational principles of this model to simulate SB. Suárez *et al.* (2013) utilised SWMM5 to develop a sand filter system for managing highway runoff by analysing SB and SW within a catchment:

$$\frac{\mathrm{d}M(t)}{\mathrm{d}t} = \mathrm{ACCU} - \mathrm{DISP}\,M(t) \quad (1)$$

where $M(t)$ represents the accumulated mass of solids at time *t* (kg); ACCU is the daily accumulation rate (kg/d); DISP is the disappearing coefficient (d^{−1}); and ACCU and DISP are coefficients that should be calibrated for every catchment prior to the use of the model, since ACCU is influenced by a series of anthropogenic and hydrometeorological patterns such as wind action, traffic intensity, street sweeping frequency, contributing area (land use) and ADWP. Equation (1) considers SB and SW variables in the deposition process but cannot confidently capture the significance of ADWP, an important variable that influences first flush (Bach *et al.* 2010).
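A minimal sketch of this first-order build-up behaviour, assuming the standard exponential solution of Equation (1) with an initially clean surface; the ACCU and DISP values are illustrative and would need catchment-specific calibration.

```python
import numpy as np

def buildup_mass(t_days, accu, disp):
    """Accumulated solids mass M(t) from the first-order build-up
    relation dM/dt = ACCU - DISP * M with M(0) = 0, whose solution
    is M(t) = (ACCU / DISP) * (1 - exp(-DISP * t))."""
    t = np.asarray(t_days, dtype=float)
    return (accu / disp) * (1.0 - np.exp(-disp * t))

ACCU = 2.0   # daily accumulation rate (kg/d) -- illustrative only
DISP = 0.08  # disappearing coefficient (1/d) -- illustrative only

t = np.array([1.0, 7.0, 30.0, 365.0])  # antecedent dry weather period (d)
mass = buildup_mass(t, ACCU, DISP)
# mass rises towards the asymptote ACCU / DISP = 25 kg as the ADWP grows
```

The saturating shape explains why long ADWPs contribute diminishing additional load, which is why calibrating ACCU and DISP per catchment matters.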

In Butler and Karunaratne's gully pot trapping efficiency model (Equation (2)), *g* is the acceleration due to gravity (m/s^{2}); *d* represents the solids particle diameter (mm); *D* is the gully pot diameter (mm); *Q* is the flow rate (L/s); *v* is the kinematic viscosity (m^{2}/s); and *S* is the particle specific gravity (PSG). However, Equation (2) accounts for uniform and laminar flows, which results in an inaccurate estimation of solids transport and trapping efficiency due to its lack of consideration for turbulence. Additionally, the model assumes that solid particles are spherical, which can be unrealistic since particles can have non-uniform shapes (Collinson *et al.* 2006). Rietveld *et al.* (2020b) highlighted the need for a more comprehensive understanding of the interactions between the limitations in Equation (2), the deposition of solids and the solids' depth in the gully pot to enhance the model's accuracy.

In Grottker's solids retention model (Equation (3)), *Q* is the discharge through the gully pot (L/s), and *x* and *y* are numerical coefficients that depend on solids geometry (diameter) and lie within a proposed fixed range. It has been further argued that this range of fixed numerical coefficients *x* and *y* might not always represent real-world scenarios, indicating that variations from these fixed values could significantly impact the model's accuracy (Bertrand-Krajewski *et al.* 1993). As a result, deterministic models may be rigid because of their reliance on fixed constants and processes, making it challenging to adapt the model to changes in processes or new data.

Sartor *et al.* (1974)'s deterministic pollutant wash-off model, given in Equation (4), assumes that every storm event has the capacity to remove all the available solids from a given surface if the storm continues for an adequate duration. Egodawatta *et al.* (2007) further experimented with Equation (4) to replicate actual wash-off behaviours and proposed the modification represented by Equation (5), which includes the capacity factor parameter $C_F$. Egodawatta *et al.* (2007) also cautioned that wrong assumptions could introduce uncertainty into the modified model (Equation (5)) due to the incorporation of $C_F$:

$$W = W_0\left(1 - \mathrm{e}^{-kIt}\right) \quad (4)$$

$$F_w = \frac{W}{W_0} = C_F\left(1 - \mathrm{e}^{-kIt}\right) \quad (5)$$

where $F_w$ represents the fraction of SW after a storm event; *W* is the weight of solids mobilised after time *t*; $W_0$ is the initial weight of solids on the surface; $C_F$ represents the capacity factor; *k* is a wash-off coefficient (mm^{−1}); and *I* is the rainfall intensity (mm/h). Egodawatta *et al.* (2007) used the LSM to determine the optimal values of *k* and $C_F$. Moreover, they evaluated Equation (5) at three sites using statistical techniques such as the mean and coefficient of variation to understand each site's characteristic data. The coefficient of variation revealed significant inaccuracies in data estimation due to the use of non-representative build-up data for each site. Xiao *et al.* (2022) investigated the applicability of Equation (5) for the solids transport rate and the influence of particle size distribution on wash-off. They recommended calibration of the parameters *k* and $C_F$ for different particle size distributions on a road surface to reduce the model's uncertainty.
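The LSM calibration of *k* and the capacity factor described above can be sketched as a simple grid search over Equation (5); the storm observations below are synthetic and the parameter ranges illustrative.

```python
import numpy as np

def washoff_fraction(t_min, intensity, k, cf):
    """Equation (5): F_w = C_F * (1 - exp(-k * I * t))."""
    return cf * (1.0 - np.exp(-k * intensity * t_min))

# Synthetic observations for one storm (all values illustrative)
rng = np.random.default_rng(0)
t = np.linspace(5, 120, 12)     # minutes since the start of the storm
I = 20.0                        # rainfall intensity (mm/h), assumed constant
true_k, true_cf = 8e-4, 0.6
obs = washoff_fraction(t, I, true_k, true_cf) + rng.normal(0, 0.01, t.size)

# Least-squares calibration of k and C_F by grid search (cf. the LSM)
ks = np.linspace(1e-4, 2e-3, 100)
cfs = np.linspace(0.1, 1.0, 100)
best_k, best_cf, best_sse = None, None, np.inf
for k in ks:
    for cf in cfs:
        sse = np.sum((obs - washoff_fraction(t, I, k, cf)) ** 2)
        if sse < best_sse:
            best_k, best_cf, best_sse = k, cf, sse
# best_k and best_cf land close to the values that generated the data
```

In practice a gradient-based least-squares routine would replace the grid search, but the objective (sum of squared deviations from Equation (5)) is the same.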

### Conceptual modelling techniques

The NPS model estimates SW using Equations (6) and (7), in which the mass of solids washed off at time *t* (kg/m^{2}) is computed from the solids accumulated over the time period *t* (kg/m^{2}), a wash-off coefficient *k*, the surface runoff on the impervious area (mm), and a numerical coefficient *c*.

NPS model development was generalised to consider non-point pollutants from a maximum of five land use categories, which include urban, agricultural, forested, and construction areas, whilst interfacing with the water quality parameters of temperature, dissolved oxygen, suspended solids, and biochemical oxygen demand (Litwin & Donigian 1978). The model also considers seasonal variables stemming from construction activities, gritting, and leaf fall. The NPS model aids in estimating the solids transported by runoff and deposited into gully pots. This helps assess how well gully pots perform in reducing solids deposition in drainage systems and minimising pollution in receiving water bodies (Post *et al.* 2016). However, the mathematical representation of solids deposition and wash-off requires rigorous and separate simulation and evaluation. In the absence of sufficient data, this introduces algorithmic complexity; to address this, a simplified representation of the processes controlling non-point pollution must be established. This raises the dilemma of trading algorithmic complexity for reduced uncertainty and simplicity of application.

Bertrand-Krajewski *et al.* (1993) further argued that the simplified representation of processes used in the NPS Model (Equations (6) and (7)) may not accurately reflect the reality of SW and pollutant transport. This is due to its failure to account for the temporal fluctuations in precipitation, runoff, and pollutant concentrations commonly observed in hydrometeorological variables. As a result, the NPS model must be meticulously calibrated whenever it is applied to a new watershed which is a time-consuming and complex process (Yuan *et al.* 2020).

### Hybrid modelling techniques

In Driver and Troutman's statistical model, *A* is the catchment area (km^{2}) and rainfall duration is expressed in minutes. Linear regression models were developed for three different regions, delineated on the basis of mean annual rainfall, to improve the accuracy of the models. However, the validity of the model is limited to the arid western United States, where annual rainfall is less than 500 mm.

According to Kunin *et al.* (2019) and Burden & Winkler (2008), Bayesian regularisation is a mathematical process that converts a non-linear regression into a statistical problem. Using a Bayesian Regularised Artificial Neural Network (BRANN) model, Lee *et al.* (2021) conducted an analysis of solids deposition prediction accuracy using historical extreme rainfall events and 15 climatic and anthropogenic solids variables, i.e. a hybrid model. They found that BRANN had a higher accuracy, with a coefficient of determination (R^{2}) of 0.911, compared to the Multiple Linear Regression Equations (MLRE) of Marchi & D'Agostino (2004) and Chang *et al.* (2011), which had R^{2} values of 0.693, 0.688, and 0.670. Although Lee *et al.* (2021)'s research is not unique to solids deposition prediction in gully pots, the BRANN model has also been applied in natural gas explosion risk analysis (Shi *et al.* 2019), rainfall prediction for debris flow (Zhao *et al.* 2022) and the optimisation of diesel engine combustion events (Ankobea-Ansah & Hall 2022). BRANN combines the deterministic Bayesian Regularisation Algorithm (BRA) with the stochastic ANN and is known to improve the prediction accuracy of complex systems, incorporate stochasticity, handle uncertainty, and capture data variability (Papananias *et al.* 2017).

In the Bayesian regularisation objective function, $F(w) = L(w) + \lambda\,\Omega(w)$, *w* is a vector of the model parameters; $L(w)$ is the loss function that measures the difference between the predicted values of the model and the actual values of the data; $\Omega(w)$ is the regularisation term that penalises the complexity of the model by adding a penalty that increases with the magnitude of the model parameters; and $\lambda$ is the regularisation parameter that controls the trade-off between the fit of the model to the data and the complexity of the model.
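A minimal sketch of this fit-versus-complexity trade-off, assuming a squared-error loss and an L2 (ridge) penalty; the data and λ values are illustrative.

```python
import numpy as np

def regularised_loss(w, X, y, lam):
    """F(w) = L(w) + lambda * Omega(w): a squared-error data-fit term
    plus an L2 penalty on the magnitude of the parameters."""
    residual = y - X @ w
    return np.sum(residual ** 2) + lam * np.sum(w ** 2)

def ridge_fit(X, y, lam):
    """Closed-form minimiser of the regularised loss (ridge regression)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 50)

w_light = ridge_fit(X, y, lam=0.1)    # weak penalty: close to plain least squares
w_heavy = ridge_fit(X, y, lam=100.0)  # strong penalty: parameters shrink towards zero
```

Increasing λ trades data fit for smaller parameter magnitudes, which is exactly the balance the regularisation parameter controls.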

Hybrid models can nonetheless be difficult to interpret. ANNs, for example, contain several layers of nodes and hidden neurons that can detect intricate patterns and relationships in data, but their architecture and the use of hidden layers (Figure 4) make it difficult to interpret how the network arrives at its final predictions. Lee *et al.* (2021)'s ANN achieved better prediction accuracy than various MLRE, but utilised unexplainable neurons in the hidden layer.

Apart from its ‘black box’ nature, complexity (Uzair & Jamil 2020), weight initialisation (Manish Agrawal *et al.* 2021), and use of activation functions (Uzair & Jamil 2020; Brownlee 2021) are some of the reasons why the hidden layer tends to be less explainable. According to Uzair & Jamil (2020), hidden layers are designed to perform non-linear transformations of the inputs entered into the network. These transformations can become increasingly complex as more hidden layers are added to the network.

Figure 5 shows a network with *n* input nodes, denoted by an input vector, a hidden layer with *m* nodes (*m* + 1 including the 'bias' node), and an output layer. The additional node with the value *b* is called the bias node, which is a scalar value. The 'arrows' (Figure 5) are a combination of weights, which represent the impact of a preceding node on the next node. Using Figure 5 as an example, the inputs contribute their weights to the weighted sum received by each node in the hidden layer, which has a predefined activation function. The activation function defines whether the receiving node will be activated and how active it will be (Brownlee 2021).

The weights can be randomly assigned, fine-tuned and calibrated through the process of back propagation (Ognjanovski 2019). In any case, the weights can be difficult to interpret and understand (Manish Agrawal *et al.* 2021). Furthermore, the neurons in the hidden layer take in a set of weighted inputs and produce an output through an activation function, whose choice can have a significant impact on the behaviour of the hidden layer. Some activation functions such as the Rectified Linear Unit (ReLU) and Softmax can be more difficult to interpret than others (Agarap 2018) and misleading (Ozbulak *et al.* 2018). Simos & Tsitouras (2021) proposed a modification to the commonly used non-linear activation function, the Hyperbolic Tangent (tanh), with the aim of reducing the computational complexity of neural networks (NNs).
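The weighted sum and activation described above can be sketched for a single hidden layer; the layer sizes, random weight initialisation and tanh activation are illustrative choices.

```python
import numpy as np

def hidden_layer(x, W, b, activation=np.tanh):
    """One hidden layer: each neuron forms a weighted sum of the
    inputs plus a bias, then applies the activation function."""
    return activation(W @ x + b)

rng = np.random.default_rng(42)
x = np.array([0.5, -1.0, 2.0])   # n = 3 input nodes
W = rng.normal(size=(4, 3))      # randomly initialised weights (m = 4 neurons)
b = rng.normal(size=4)           # bias values
h = hidden_layer(x, W, b)        # tanh keeps every activation in (-1, 1)
```

Even in this tiny example the individual entries of `h` carry no direct physical meaning, which is the interpretability problem the section describes.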

Equation (11) defines the GLMM as $y = X\beta + Zu + \varepsilon$, where *y* is a column vector, the target variable; *X* is a matrix of the predictor variables; $\beta$ is a column vector of the fixed-effects regression coefficients; *Z* is the design matrix for the random effects; *u* is a vector of the random effects (the random complement to the fixed $\beta$); and $\varepsilon$ is a column vector of the residuals, that part of *y* that is not explained by the model, $X\beta + Zu$.

From Equation (11), the target variable *y* is a linear combination of the fixed-effects part ($X\beta$), the random-effects part ($Zu$), and the residuals ($\varepsilon$). The equation assumes a linear model with fixed and random effects, and the residuals are assumed to be independent and identically distributed with mean zero. The GLMM thus incorporates both deterministic ($X\beta$) and stochastic ($Zu$, $\varepsilon$) elements (Penn State University n.d.).
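A minimal simulation of Equation (11), assuming a Gaussian response and group-level random intercepts (e.g. gully pots clustered by street); all sizes and variance values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

n_obs, n_groups = 200, 10
X = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])  # fixed-effects design
beta = np.array([2.0, 1.5])                                    # fixed-effects coefficients

group = rng.integers(0, n_groups, n_obs)      # e.g. street to which each gully pot belongs
Z = np.eye(n_groups)[group]                   # random-effects design: one intercept per group
u = rng.normal(0.0, 0.8, n_groups)            # random effects (shared within a group)
eps = rng.normal(0.0, 0.3, n_obs)             # i.i.d. residuals with mean zero

y = X @ beta + Z @ u + eps                    # Equation (11): fixed + random + residual
```

Observations in the same group share the same draw of `u`, which is how the GLMM represents the clustering that a purely fixed-effects model would miss.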

To clarify further, $\varepsilon$ is the residual or random error component of the model. It captures the variability in the target variable *y* that is not explained by the fixed-effects predictor variables *X* and the random-effect variables *Z*. The residual introduces the stochastic or random element in the equation, accounting for the unexplained variability in the data (uncertainty), which can arise from measurement bias or unobserved variables. To improve the predictive performance and uncertainty quantification of the GLMM, an autoregressive (AR) component within a Bayesian framework (a stochastic process) is incorporated. This is achieved by using priors and posterior distributions, as shown in Equations (12) and (13).

If *p* represents the order of the AR process, the model can be written as

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \quad (12)$$

where $y_t$ represents the target variable at time *t*; $\phi_1, \ldots, \phi_p$ are the AR coefficients; and $\varepsilon_t$ is the residual term at time *t*, assumed to follow a specific distribution.

Given the observed data $y_1, \ldots, y_T$, Bayesian inference techniques such as Gibbs sampling (Casella & George 1992) and the Metropolis-Hastings algorithm (Chib & Greenberg 1995), which are based on the Markov Chain Monte Carlo (MCMC) methodology, can be used to obtain posterior distributions for the parameters. These posterior distributions provide information about the uncertainty in the estimates. Thus, AR models a time series as a linear combination of its previous values. By acknowledging uncertainty about the model parameters, this approach enables probabilistic predictions about future values of the time series (Martin *et al.* 2021).
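A minimal sketch of this idea for an AR(1) coefficient using a Metropolis-Hastings sampler, assuming Gaussian residuals, a known residual variance, and a flat prior on the coefficient; all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) series: y_t = phi * y_{t-1} + eps_t
phi_true, sigma = 0.7, 1.0
y = np.zeros(300)
for t in range(1, y.size):
    y[t] = phi_true * y[t - 1] + rng.normal(0.0, sigma)

def log_posterior(phi):
    """Gaussian AR(1) log-likelihood with a flat prior on phi in (-1, 1)."""
    if not -1.0 < phi < 1.0:
        return -np.inf
    resid = y[1:] - phi * y[:-1]
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

# Metropolis-Hastings random-walk sampling of the posterior for phi
samples, phi = [], 0.0
for _ in range(5000):
    proposal = phi + rng.normal(0.0, 0.05)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(phi):
        phi = proposal
    samples.append(phi)

posterior = np.array(samples[1000:])  # discard burn-in
# posterior.mean() approximates phi_true; posterior.std() quantifies uncertainty
```

The spread of `posterior` is the parameter uncertainty that deterministic point estimates discard, and it propagates directly into probabilistic forecasts of future values.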

PyMC3 (Salvatier *et al.* 2016) is a probabilistic programming library which can be used to define the GLMM with a binary response variable and an AR component, and then to sample from the posterior distribution using MCMC methods. For multiclass response variables, a multinomial logistic regression model with an AR component could be deployed (Chan 2023).

Post *et al.* (2015) combined the GLMM with an AR component from a Bayesian perspective. Their objective was to examine the impact of geometrical and catchment variables (*cf.* Post *et al.* 2015) on the filling rates of gully pots, based on monthly measurements of solid bed levels from 300 gully pots over one year. Their results provided insights into the effect of different designs on accumulation in gully pots, allowing for better optimisation of maintenance activities and improved gully pot design.

Post *et al.* (2015) favoured the Bayesian approach over the quasi-likelihood technique because the latter may not accurately represent the true underlying distribution of the data: it can be prone to inaccuracies, is sensitive to outliers, and is not well suited to modelling non-linear relationships between variables (Spiegelhalter *et al.* 2002). By utilising a combination of GLMM and AR from a Bayesian perspective, their model was able to effectively capture complex time series data that exhibited both temporal autocorrelation and dependence or clustering of observations.

However, Clark (2005) and Gelman & Hill (2006) suggested that the combination of GLMM and AR from a Bayesian perspective can lead to increased computational complexity and intensity. Also, the model results interpretation can be more challenging for some researchers and stakeholders not familiar with Bayesian statistics.

## ADDRESSING THE LIMITATIONS IN THE PREDICTION OF SOLIDS DEPOSITION THROUGH THE USE OF HYBRID MODELS

Conceptual and deterministic models rely on physically based equations and linear models to represent SB over a catchment, SW, and passage through gully pots where the solids are either deposited or transferred to sewers. However, these models are limited by uncertainties arising from scope and applicability, precision, inflexibility, data limitations, bias, and sensitivity to outliers. Statistical models, which assume normality and linearity and often rely on linear regression, face similar limitations when dealing with complex relationships or non-linear patterns in data, as reported by Marchi & D'Agostino (2004), Chang *et al.* (2011), and Lee *et al.* (2021). Although hybrid models have been employed to handle uncertainties, they create new issues such as the risk of misapplication, computational complexity and intensity, and poor model interpretability and explainability. Therefore, the use and benefits of XAI tools are discussed in this section to address the limitations of these hybrid models. Table 2 presents a summary of crossovers between models and how they may be addressed.

| Model (type) | Deposition phase | Explanatory variables | Model constraint | Suggested improvement | Target variable | References |
|---|---|---|---|---|---|---|
| Deterministic (traditional) | SW | Rainfall intensity, kinetic energy of rainfall and characteristics of solids | Bias | DT models, cross-validation, data balancing, and feature selection techniques | Mass of washed-off solids | Sartor *et al.* (1974); Egodawatta *et al.* (2007) |
| Conceptual (traditional) | SW | Catchment area, solids accumulation rate, and ADWP | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Servat (1984); Bertrand-Krajewski *et al.* (1993) |
| Conceptual (traditional) | SW | Surface runoff and contributing area (land use) | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Litwin & Donigian (1978); Bertrand-Krajewski *et al.* (1993) |
| Deterministic (traditional) | SW | Rainfall intensity | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Bujon (1988); Alley & Smith (1981) |
| Deterministic (traditional) | SR | Flow rate, PSG, diameter of gully pot, diameter and depth of solids, and kinematic viscosity | Complex and non-linear variables | Ensemble Learning | Mass of retained solids/gully trapping efficiency | Butler & Karunaratne (1995); Rietveld *et al.* (2020b) |
| Deterministic (traditional) | SR | Rainfall amount and flow rate | Inflexibility | DT models, label encoding, and one hot encoding | Mass of retained solids/gully trapping efficiency | Grottker (1990); Butler & Karunaratne (1995) |
| Deterministic (traditional) | SB | ACCU and DISP (dependent on land use, ADWP, etc.) | Complex and non-linear variables | Ensemble Learning | Mass of built-up solids | Sartor *et al.* (1974); Alley & Smith (1981) |
| BRANN (hybrid) | SW | Variables of morphology, rainfall and geology | Explainability and transparency | mRMR algorithm, hybrid (PSO & FURIA) | Debris flow volume | Lee *et al.* (2021) |
| GLMM-AR (hybrid) | SR | Road type, depth of trap, contributing surface area, catchment slope, position of outlet pipe, and presence of water seal | Computational complexity and intensity | Linear models, DT, feature importance analysis or partial dependence plots, XAI techniques | Filling rate of gully pots | Post *et al.* (2015, 2016) |
| Statistical (data-driven) | SW | Catchment area and rainfall attributes | Scope and applicability | TL, hyperparameter tuning, ML models | Mass of washed-off solids | Driver & Troutman (1989) |


### The limitation of complex and non-linear variables

Deterministic and statistical models may not be applicable to all types of systems (Equation (1)) and may struggle with complex variables, for example, the poorly understood impact of ADWP on SB. However, feature selection techniques have been used within climate studies to identify the relative contribution and significance of explanatory variables in forecasting. Warton *et al.* (2015) utilised a residual correlation matrix to examine how 65 alpine tree species (explanatory variables) responded to snowmelt (the target variable). They evaluated the degree of correlation between the tree species and identified their significance and relevance across 75 different sites. Moreover, using the residual correlation matrix, they identified the environmental variables that were strongly correlated with the tree species data. This highlights the effectiveness of the approach in analysing complex community ecology data with multiple variables.

Haidar & Verma (2018) used a combination of genetic algorithm (GA) and particle swarm optimisation (PSO) algorithm (Kennedy & Eberhart 1995) to optimise climate features in rainfall forecasting. Their model outperformed three established standalone models while highlighting the effectiveness of hybrid models in selecting the most relevant climate variables and optimising the network parameters of a NN-based model.

Caraka *et al.* (2019) used the PSO algorithm which combines deterministic and heuristic techniques to identify the most relevant features for accurately predicting particulate matter 2.5 (PM_{2.5}), making it a useful tool for feature selection.

Hu *et al.* (2018) utilised the minimum redundancy maximum relevance (mRMR) algorithm to identify the most significant features for local climate zone classification. The mRMR algorithm achieved high classification accuracy and outperformed other established feature selection techniques, such as principal component analysis (PCA) and correlation-based feature selection, although its stochastic feature selection process produces varied results across algorithm runs. Similarly, Mazzanti (2021) used statistical measures of feature relevance and redundancy to select a feature subset that maximises relevance while minimising redundancy.

In developing a flash flood susceptibility model, Bui *et al.* (2019) used the FURIA-GA feature selection technique, which combines the fuzzy rule-based feature selection method (FURIA) with the GA, to select the most informative features. The FURIA algorithm uses a DT to generate a set of fuzzy rules from the input data; the GA is then utilised to search for the optimal subset of features (Bera 2020).

In identifying the main variables of solids deposition in gully pots, Rietveld *et al.* (2020a) utilised regression trees (RTs). Their study revealed that RTs provided slightly more accurate feature prediction than LMMs, owing to their capability to describe relationships between variables under varying conditions. Lee *et al.* (2021) utilised Pearson's correlation analysis to identify the four most significant variables, out of 15, that affect debris flow volume. These four prominent variables were then used to train the model.

It is important to acknowledge that the effectiveness of a chosen feature selection technique can be influenced by numerous variables present in a complex feature system. These variables may include high correlation, overfitting, large feature space leading to computational intensity, and imbalanced data (Cherrington *et al.* 2019). Therefore, it is crucial to determine an optimal feature selection method based on the characteristics of the data and the problem at hand. It is also equally important to validate the chosen features, to ensure their ability to generalise well to new data.
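The validation point above can be illustrated with a minimal Python sketch, using synthetic data as a stand-in for real gully records and scikit-learn's mutual-information scorer as one hypothetical choice of feature selection technique (both assumptions on our part):

```python
from functools import partial

from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, mutual_info_regression

# Synthetic stand-in for gully inspection data: 200 observations and
# 8 candidate explanatory variables, only 3 of which are informative.
X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=42)

# Rank features by mutual information with the target and keep the top 4;
# the chosen features should then be validated on held-out data.
scorer = partial(mutual_info_regression, random_state=0)
selector = SelectKBest(score_func=scorer, k=4)
X_selected = selector.fit_transform(X, y)

print("Selected feature indices:", selector.get_support(indices=True))
print("Reduced shape:", X_selected.shape)
```

The same pattern applies with any scorer (e.g., an F-test), so the method can be swapped to suit the characteristics of the data and the problem at hand, as argued above.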

Equation (2) underlines the problem of inaccurate estimation of solid transport and trapping efficiency due to lack of consideration for turbulence and an assumption that solid particles are spherical. ML techniques are well-suited for handling various categories of shape and flow patterns (laminar, turbulent, steady, and unsteady) by converting the shapes and patterns into a numerical format that the algorithm can process. This process is known as label encoding (Table 3) (Scikit-learn Developers 2023). An example of label encoding could be assigning numerical values to recognised shape categories before developing a model (Table 3). This is exemplified in the Geng *et al.* (2022) study on predicting litterfall in forests by categorising forest types. Label encoding was used to convert categorical variables such as forest type, vegetation type, and climate zone into numeric variables to predict litterfall production.

| Solids shape category | Representative numerical value |
|---|---|
| Spherical | 0 |
| Angular | 1 |
| Flaky | 2 |
| Rod-like | 3 |
| Discoid | 4 |
| Ovoid | 5 |
| Irregular | 6 |

To address the issue of not accounting for turbulence in Equation (2), one possible solution is to use one hot encoding to represent fluid flow as a binary categorical variable rather than using the conceptualised turbulence correction factor. One hot encoding assigns a value of either 0 or 1 to indicate laminar or turbulent flow, respectively. Gong & Chen (2022) argue that one hot encoding is preferable as it avoids a misleading ranking between categories. However, in cases where a categorical variable has a natural order, such as rating the level of risk posed by solids in a gully (e.g., low, medium and high), label encoding may be a more suitable approach.
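As a concrete sketch of both encodings (the shape categories follow Table 3; the observations themselves are invented for illustration, and pandas/scikit-learn are assumed as tooling):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical solids-shape observations (categories from Table 3).
shapes = pd.Series(["Spherical", "Angular", "Flaky", "Spherical", "Irregular"])

# Label encoding: one integer per category (fit order is alphabetical).
label_enc = LabelEncoder()
shape_codes = label_enc.fit_transform(shapes)

# One hot encoding: a binary indicator column per category, avoiding any
# implied ranking -- appropriate for a nominal variable such as flow regime.
flow = pd.Series(["laminar", "turbulent", "turbulent", "laminar", "laminar"])
flow_onehot = pd.get_dummies(flow, prefix="flow")

print(dict(zip(label_enc.classes_, range(len(label_enc.classes_)))))
print(flow_onehot.head())
```

For an ordered category such as gully risk level (low, medium, high), the integer codes of label encoding would instead be the appropriate choice, as noted above.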

To prevent algorithmic complexity, a simplified version of Equations (6) and (7) is used to represent the NPS model. However, this simplified approach disregards non-linear relationships and fluctuations in hydrometeorological variables, which can result in model uncertainty. As a result, the model may require frequent recalibration for each specific application. Whilst deterministic and statistical models may not be easily adjusted for complex fluctuations in hydrometeorological variables (Litwin & Donigian 1978) and changes in a contributing area (Deletic *et al.* 1997), ensemble learning techniques such as Adaptive Boosting (Freund & Schapire 1995), GBM (Friedman 2001), and Stochastic Gradient Boosting combine several base models to produce one optimal predictive model and can readily learn complex relationships, reducing the need for frequent recalibration. Furthermore, the performance of the trained model can be evaluated and validated on a separate test dataset, using evaluation metrics such as MSE and mean absolute error (MAE), along with resampling techniques like cross-validation (Refaeilzadeh *et al.* 2009).
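A minimal sketch of this workflow, with synthetic data standing in for hydrometeorological predictors of washed-off solids (an assumption) and scikit-learn's gradient boosting as the ensemble learner:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for hydrometeorological predictors of washed-off solids.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=1)

# Gradient boosting combines many shallow trees into one predictive model,
# learning non-linear relationships without hand-tuned deterministic constants.
gbm = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=1)

# 5-fold cross-validated mean absolute error as the evaluation metric
# (scikit-learn returns negated MAE, hence the sign flip).
mae_scores = -cross_val_score(gbm, X, y, cv=5, scoring="neg_mean_absolute_error")
print("Mean MAE across folds:", mae_scores.mean())
```

Because the model relearns relationships directly from data, retraining on new records replaces the manual recalibration step discussed above.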

In addition to the use of ensemble learning to resolve the recalibration issues identified in Equations (6) and (7) and the use of one hot encoding to deal with lack of consideration for turbulence in Equation (2), DT-based models such as RF (Breiman 2001) and RTs (Morgan & Sonquist 1963) can combine multiple trees for improved model performance and handle categorical variables without the need for one hot encoding (Gross 2020).

### The limitation of scope and applicability

Transfer Learning (TL) (Bozinovski & Fulgosi 1976) is a technique that allows a NN to adapt pre-trained models to new tasks or datasets. By leveraging the knowledge learned from a previous task, the model can improve its performance on a different problem, thus increasing its applicability and widening its scope. However, it is important to note that TL alone does not automatically select the best variables, algorithms, and hyperparameters for a given problem, as discussed by Yogatama & Mann (2014). To address this, hyperparameter tuning (Feurer & Hutter 2019), which involves selecting the optimal set of hyperparameters for a given model and project by learning from historical training data (Brownlee 2019), is necessary. Therefore, combining TL with hyperparameter tuning is crucial in dealing with scope and applicability issues. For example, the dataset used in generating the regression model for Equation (8) can be fine-tuned for applications in higher annual rainfall areas. Furthermore, the use of TL ensures the continual use of the existing model.
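To make the transfer idea concrete, here is a minimal NumPy sketch assuming a toy two-layer network: a hidden layer learned on a data-rich source task is frozen, and only the output weights are fine-tuned on a small target dataset (all data, dimensions, and learning rates are illustrative, not taken from the studies cited):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_output_layer(H, y, w_out, lr=0.1, epochs=500):
    """Gradient descent on the output weights only (hidden layer frozen)."""
    for _ in range(epochs):
        pred = H @ w_out
        grad = H.T @ (pred - y) / len(y)
        w_out -= lr * grad
    return w_out

# --- Source task: plenty of data ---
X_src = rng.normal(size=(500, 4))
y_src = X_src @ np.array([1.0, -2.0, 0.5, 3.0])

W_hidden = rng.normal(scale=0.5, size=(4, 8))      # shared representation
H_src = np.tanh(X_src @ W_hidden)
w_out = train_output_layer(H_src, y_src, np.zeros(8))

# --- Target task: scarce data, related mapping ---
X_tgt = rng.normal(size=(30, 4))
y_tgt = X_tgt @ np.array([1.2, -1.8, 0.4, 2.9])    # similar, not identical

# Transfer: reuse the frozen hidden layer, fine-tune only the output weights.
H_tgt = np.tanh(X_tgt @ W_hidden)
w_transfer = train_output_layer(H_tgt, y_tgt, w_out.copy())

mse = np.mean((H_tgt @ w_transfer - y_tgt) ** 2)
print("Target-task MSE after fine-tuning:", mse)
```

In practice the same pattern is applied in deep learning frameworks, with hyperparameter tuning deciding the learning rate and which layers to re-train, as in Subel *et al.* (2023).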

Subel *et al.* (2023) applied TL in sub-grid scale turbulence modelling by enhancing the capabilities of convolutional neural networks (CNNs), thus enabling them to extrapolate from one system to another. This was achieved by introducing a general framework that identifies the best re-training procedure for a given problem based on physics and NN theory. Hyperparameter tuning was then used to optimise the performance of the NN by searching over a specified hyperparameter space and finding the best layers to re-train. TL has also been used to improve the efficiency of distinct wastewater treatment processes. Pisa *et al.* (2023) used TL to develop a control system for wastewater treatment plants. Data from a source plant were used to train a deep NN and then the network was fine-tuned by data from the target plant. They evaluated the transfer suitability of the trained network by comparing its performance on the target plant with that of a network trained only on the target plant. Russo *et al.* (2023) used a combination of algorithms which includes RF, support vector regression, and ANN, to predict sediment and nutrient first flush. Their framework was used to identify the most influential variables that contribute to sediment and nutrient pollution in any geographical region, thus eliminating scope and applicability limitations.

### The limitation of inflexibility

A demonstration of this flexibility is Lee *et al.* (2021)'s debris flow volume model: their NN model outperformed various multiple linear regression models in terms of fitting coefficients. Kim *et al.* (2022) proposed a novel hybrid model for water quality forecasting. The methodology of their research involved the use of data decomposition, ML, and error correction, which eliminates the reliance on fixed deterministic constants and identifies underlying patterns and trends in data. Furthermore, they built an error correction framework similar to Figure 6 by combining the variational mode decomposition (VMD) algorithm (Dragomiretskiy & Zosso 2013) and a Bidirectional Long Short-Term Memory (BiLSTM) NN (Schuster & Paliwal 1997), which in turn improved forecast accuracy by correcting errors in the data.

As shown in various studies, the hybrid model's ability to handle real-time data and correct errors makes it more accurate than depending on fixed deterministic constants (Li *et al.* 2021; Peng *et al.* 2022).

### The limitation of bias from the use of non-representative data, missing data, and outliers

Egodawatta *et al.* (2007) introduced Equation (5) as a modification to Sartor and Boyd's pollutant wash-off model (Equation (4)) to address the issue of biased and unreliable predictions, resulting from erroneous assumptions. This suggests that deterministic models may not always address bias by simply adding constraints to the model. However, when Equation (5) was subjected to a basic statistical evaluation using data from different sites, it became apparent that using non-representative build-up data could exacerbate bias in modelling.

Uncertainties with the modified model (Equation (5)) could be addressed by implementing advanced statistical and stochastic techniques that deal with outliers and high-dimensional or redundant data. These techniques can extract information from the data before training the model to effectively deal with bias. For example, in predicting nitrogen, phosphorus, and sediment mean concentrations in urban runoff, Behrouz *et al.* (2022) made use of RF, an algorithm known for its ability to handle noisy data and outliers. Lee *et al.* (2021) used cross-validation in the development of their debris flow volume model. Their study randomly partitioned the data associated with the four prominent variables into 10 subsets of approximately equal size, with a 7:3 ratio for training and validation datasets. They then trained their model on the training data and evaluated its performance on the validation data. This process was repeated 10 times, with each of the 10 subsets being used as the validation set once. The average performance of the model over the 10 iterations was then calculated to provide a more reliable estimate of its performance on unseen data. By using cross-validation, Lee *et al.* (2021) ensured that their model was valid and unbiased, and that it was not overfitting to a particular subset of data.
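The cross-validation procedure described above can be sketched as follows, with synthetic data standing in for the four prominent debris-flow variables (an assumption) and the fold count mirroring Lee *et al.* (2021):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic stand-in for the four prominent debris-flow variables.
X, y = make_regression(n_samples=120, n_features=4, noise=8.0, random_state=7)

kf = KFold(n_splits=10, shuffle=True, random_state=7)
fold_mse = []
for train_idx, val_idx in kf.split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=7)
    model.fit(X[train_idx], y[train_idx])
    fold_mse.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))

# Averaging over the 10 folds gives a more reliable estimate of performance
# on unseen data than any single train/validation split.
print("Mean 10-fold MSE:", np.mean(fold_mse))
```

Each of the 10 subsets serves as the validation set exactly once, so no single partition of the data dominates the performance estimate.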

Most traditional statistical models such as linear regression, logistic regression, and analysis of variance (ANOVA) are known to be sensitive to outliers and biased towards certain groups within the variables. This implies that the choice of modelling technique can affect the accuracy and validity of the models. According to Maharana *et al.* (2022), the use of more representative and robust data preprocessing techniques can effectively address missing data, bias, and data quality issues in solid deposition modelling.
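As one example of the robust preprocessing Maharana *et al.* (2022) advocate, a simple interquartile-range rule can flag and cap outliers before training; the rainfall readings below are invented for illustration:

```python
import pandas as pd

# Hypothetical rainfall-intensity readings with two implausible outliers.
rain = pd.Series([3.2, 4.1, 2.8, 5.0, 3.7, 95.0, 4.4, 3.9, -40.0, 4.6])

# Interquartile-range rule: flag values beyond 1.5*IQR of the quartiles,
# then winsorise (cap) them instead of discarding the observations.
q1, q3 = rain.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = rain.clip(lower, upper)

print("Flagged outliers:", rain[(rain < lower) | (rain > upper)].tolist())
```

Capping rather than deleting preserves the sample size, which matters when, as is common for gully inspection records, observations are scarce.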

## ADDRESSING THE CONCERNS OF USING HYBRID MODELS IN GULLY POT SOLIDS DEPOSITION PREDICTION

Hybrid models that incorporate data-driven techniques have been recognised to handle uncertainties where traditional models may struggle (Post *et al.* 2015; Lee *et al.* 2021). However, these models require intricate and resource-intensive computation (Figure 4) and may be unexplainable due to their black box nature, posing the risk of model misapplication. As suggested in Section 4.1 and Table 4, the use of algorithms such as PDP, mRMR, PCA, FURIA-GA, and RT has demonstrated effectiveness in mitigating model misapplication by guiding feature selection during model development.

| Method | Algorithms | Benefits |
|---|---|---|
| Simplification | L1 or L2 regularisation (Ng 2004), sigmoid function (Cramer 2003), modified tanh (Simos & Tsitouras 2021), smooth function approximation (Shurman 2016; Ohn & Kim 2019), weight sharing (Pham et al. 2018) | Reduces the complexity of the model and the number of hidden layers, simplifies the activation function, and prevents overfitting |
| Layer-wise explanation | Gradient-weighted Class Activation Mapping, Grad-CAM (Selvaraju et al. 2016); Layer-wise Relevance Propagation, LRP (Bach et al. 2015); integrated gradients (Hsu & Li 2023) | Analyse the output of each layer in a NN to gain insights into the behaviour of the network, thus identifying the importance of various layers |
| Model-agnostic interpretation | Local Interpretable Model-agnostic Explanations, LIME (Ribeiro et al. 2016); SHapley Additive exPlanations, SHAP (Lundberg & Lee 2017); Recursive Feature Elimination, RFE (Guyon et al. 2002); Principal Component Analysis, PCA (Abdi & Williams 2010); mutual information (Shannon 1948); partial dependence plot, PDP (Friedman 1991); permutation feature importance, PFI (Breiman 2001) | Visual feature importance insights that explain the behaviour of complex models regardless of the model's architecture. Identify the variables that are most important for producing a given output and provide insights into correlations between variables, and into the behaviour and transparency of the model |
| Tree-based explanation | Classification and RTs such as CART, ID3, C4.5, CHAID, MARS, RF, GBT (Loh 2008; Hannan & Anmala 2021) | Inherently interpretable ML models that can be used in conjunction with other XAI tools |

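As a small illustration of one model-agnostic technique from Table 4, the sketch below computes permutation feature importance with scikit-learn on synthetic data (both the data and the choice of library are assumptions for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic stand-in: 5 predictors, with the first two genuinely informative
# (shuffle=False keeps the informative columns first).
X, y = make_regression(n_samples=200, n_features=5, n_informative=2,
                       shuffle=False, noise=2.0, random_state=3)

model = RandomForestRegressor(n_estimators=200, random_state=3).fit(X, y)

# Permutation feature importance: shuffle one column at a time and measure
# the drop in the model's score, regardless of the model's architecture.
result = permutation_importance(model, X, y, n_repeats=10, random_state=3)
print("Mean importance per feature:", result.importances_mean.round(3))
```

Because the technique only needs predictions, the random forest here could be swapped for any 'black box' hybrid without changing the explanation step.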

It is imperative to understand why and when stakeholders need insights from the ‘black box’ models that are used in predicting the performance of variables in solids deposition. These needs include informed stakeholder decision-making, directed planning of future data collection, data troubleshooting, informed feature extraction, anomaly detection, and embedding trust.

Data troubleshooting plays a crucial role due to the prevalence of ‘dirty’ data, potential errors in preprocessing code, and the risk of target leakage which occurs when the training data contains information about the target, but similar data will not be available during model prediction. This can adversely impact the overall performance of the model as shown in Post *et al.* (2016)’s robust outlier detection regime while developing their hybrid model for solids deposition in gully pots. Understanding the patterns identified by models allows for the identification and resolution of errors. Additionally, an understanding of model-based insights will enable feature extraction, which is achieved by creating new features from raw data or existing features. These insights become important when dealing with large datasets or lacking domain knowledge. By selecting or designing features that align with domain knowledge, the resulting model becomes more transparent and easier to explain to non-experts. This can be particularly important when the model's predictions impact critical decisions or require justification to gain trust from stakeholders. Lack of transparency in ‘black box’ models can pose a challenge in stakeholder decision-making, raise ethical concerns and possible discriminatory outcomes, potentially preventing specific groups from accessing opportunities. For example, a county that relies solely on data-driven systems to manage gully pot cleansing may disregard human contributions, potentially leading to a reduction in funds allocated to a gully jetting company responsible for routine and reactive cleansing of the county's gullies.

In the context of human decision-making, model insights hold significance as they can inform decisions made by individuals, sometimes surpassing the importance of the predictions themselves. There are also growing concerns about the autonomy of ML systems in their ability to make decisions and take actions without input from human oversight, established deterministic theories, and conceptual thinking (Subías-Beltrán *et al.* 2022). These concerns underline the need for explainable models, allowing humans to understand how they work and providing insights into the decisions made. This is important where decisions based on ML models can have consequences such as discriminatory outcomes. Furthermore, insights from models can guide future data collection efforts, helping local councils determine which types of data are most valuable for solids deposition management and investment.

The Shapley value quantifies the contribution of each variable *i*, an independent variable, as shown in the following equation:

$$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\left[f\left(x_{S \cup \{i\}}\right) - f\left(x_S\right)\right]$$

where $\phi_i$ is the Shapley value for land use, *f* is the black box model, *x* is an input data point which is a single row in the gully inspection data, and $S \subseteq N \setminus \{i\}$ represents iteration over all possible subsets and combinations of variables to ensure that interactions between individual variables are accounted for. If land use and solids type form one of the subsets under consideration, we can get the model output for this subset with ($f(x_{S \cup \{i\}})$) and without ($f(x_S)$) the variable of interest (i.e. land use). The difference between $f(x_{S \cup \{i\}})$ and $f(x_S)$ explains how land use contributed to the prediction in that subset.

| Road hierarchy | Solids type | Season | Rainfall intensity (mm/hour) | Dry period (days) | Land use | Solids level |
|---|---|---|---|---|---|---|
| Service | Silt | Winter | 15.63 | 0 | Residential | 75% |
| Lane | Leaves | Summer | 0.2 | 6 | Agricultural | 50% |
| Service | Silt | Autumn | 9.3 | 4 | Residential | 50% |
| Strategic | Leaves | Winter | 0.2 | 6 | Residential | 50% |
| Minor | Silt | Spring | 15.63 | 0 | Recreational | 100% |


For example, if the model output (solids level) with land use ($f(x_{S \cup \{i\}})$) is 75%-filled and without land use ($f(x_S)$) is 50%-filled, then land use contributes 25 percentage points, otherwise known as the marginal value. The same process is repeated for each possible combination of subsets, which are additionally weighted according to how many of the total number of variables (*n*) are in the subset.

The number of possible variable subsets grows exponentially ($2^n$), with *n* representing the number of variables. For example, the gully inspection data in Table 5 has six independent variables and therefore 64 possible subset combinations, which makes it computationally intense to get the average contribution of one variable. According to Lundberg & Lee (2017), Kernel SHAP, an approximation technique that samples variable subsets and fits a linear regression based on the samples, can be used to eliminate the need for intense computation. Other approximation techniques are Tree SHAP and Deep SHAP, which are used for tree-based and deep NN models, respectively. The SHAP summary plot (Figure 7) presents a concise and easily understandable overview of the model's feature importance (Lundberg & Lee 2017; SHAP 2018).
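The exact computation described above can be sketched for a toy additive model with three hypothetical variables (so only $2^3 = 8$ subsets need enumerating); the variable names and contribution values are invented for illustration:

```python
import itertools
import math

# Toy "black box": solids level (%) as an additive function of three
# hypothetical variables -- land use, rainfall intensity, dry period.
VARIABLES = ["landuse", "rainfall", "dry_period"]

def model(subset, x):
    """Prediction using only the variables in `subset` (others at baseline)."""
    contrib = {"landuse": 25.0, "rainfall": 15.0, "dry_period": 10.0}
    return 50.0 + sum(contrib[v] * x[v] for v in subset)

def shapley(variable, x):
    """Exact Shapley value: weighted average marginal contribution of
    `variable` over every subset S of the remaining variables."""
    others = [v for v in VARIABLES if v != variable]
    n = len(VARIABLES)
    value = 0.0
    for r in range(len(others) + 1):
        for S in itertools.combinations(others, r):
            weight = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                      / math.factorial(n))
            value += weight * (model(S + (variable,), x) - model(S, x))
    return value

x = {"landuse": 1, "rainfall": 1, "dry_period": 0}   # one inspection record
print("Shapley value of land use:", shapley("landuse", x))
```

For this additive model the Shapley value of land use recovers its full 25-point marginal contribution (within floating-point tolerance); with six variables the same loop would visit 64 subsets, which is exactly the cost Kernel SHAP approximates away.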

Geng *et al.* (2022) used SHAP values to demonstrate the importance and correlation of various explanatory variables in predicting litterfall production, a crucial solid build-up process. Similarly, Russo *et al.* (2023) used a combination of ‘black box’ algorithms which include RF, support vector regression, and ANN, to predict sediment and nutrient first flush. The study used 76 potential predictive variables as input to the machine learning algorithm. The SHAP algorithm was then used to determine the feature importance of the variables and to improve the interpretability and explainability of the ‘black box’ models.

Likewise, classification trees are simple and interpretable models that visually represent the decision-making process of a ‘black box’ model and explain how the model arrived at a specific prediction. Rietveld *et al.* (2020b) used RTs in explaining the significance and correlation between SB, wash-off, and retention predictors in predicting solid accumulation rate in gully pots.

## CONCLUSION AND FUTURE WORK

Traditional models have been used to estimate the deposition of solids in gully pots, but these methods have limitations. It has been demonstrated that explainable hybrid models can lessen the effects of these limitations.

This study offers a promising approach to overcome the limitations of traditional models in simulating complex systems such as SB, wash-off, and retention processes in gully pots. By integrating traditional and data-driven models, hybrids are produced to handle complex and non-linear variables, improve the scope and applicability of existing models, increase their flexibility, and reduce bias from non-representative data, missing data, and outliers. However, the resource-intensive computation requirements and lack of explainability of hybrid models can lead to misapplication and flawed decision-making. There is a need for resource-efficient and explainable hybrid models that allow stakeholders to understand how a model works and why it makes certain decisions. SHAP values, DTs, and other eXplainable Artificial Intelligence (XAI) tools can enhance the interpretability and explainability of ‘black box’ models, enabling stakeholders to make informed decisions based on reliable insights. By adopting these XAI tools, we can mitigate the risks associated with hybrid models and ensure that they are transparent, ethical, and beneficial. As explainable hybrids evolve, they will become an increasingly valuable tool for addressing complex modelling challenges in solids deposition on road surfaces and in urban stormwater management.

Future works will utilise explainable hybrid architecture to improve the predictive accuracy of solids deposition using gully inspection data from multiple local authorities.

## AUTHORS’ CONTRIBUTIONS

C.F.E. contributed to conceptualisation, methodology, and writing. A.C. contributed to writing, review, and supervision. H.B. contributed to review and supervision. E.E. and C.S. reviewed the article.

## FUNDING

There was no external funding for this research.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Feature Selection using Genetic Algorithm*. Available from: https://medium.com/analytics-vidhya/feature-selection-using-genetic-algorithm-20078be41d16 (Accessed 10 April 2023)

**29**(5),

*Tree-Based Models: How They Work (In Plain English!)*Available from: https://blog.dataiku.com/tree-based-models-how-they-work-in-plain-english (Accessed 11 April 2023)

*Proceedings of the 31st Annual Conference on Advances in Neural Information Processing Systems*, California, United States of America, 4–9 December, pp. 1–10.

L1 vs. L2 Regularization, and Rotational Invariance

*Proceedings of the 10th International Conference on Urban drainage modelling*, Quebec, Canada, 20–23 September, pp. 59–61.

*Contribution à L'étude des Matières en Suspension du Ruissellement Pluvial à L'échelle D'un Petit Bassin Versant Urbain (Contribution to the Study of Suspended Matter in Stormwater Runoff at the Scale of a Small Urban Watershed)*. PhD Thesis.

*Highways Asset Management Framework 2015–2020*. Available from: https://www.southglos.gov.uk/documents/Highways-Asset-Management-Framework2015-2020.pdf (Accessed 20 September 2022)

*Drainage Data FOI Ref FIDP/017* (Accessed 25 May 2022)