## ABSTRACT

Urban flooding has made it necessary to gain a better understanding of how well gully pots perform when overwhelmed by solids deposition driven by climatic and anthropogenic variables. This study investigates solids deposition in gully pots through a review of eight models, comprising four deterministic models, two hybrid models, a statistical model, and a conceptual model, representing a wide spectrum of solids depositional processes. Traditional models capture the impact of climatic and anthropogenic variables on solids deposition, but they are prone to uncertainty owing to inadequate handling of complex and non-linear variables, restricted applicability, inflexibility and data bias. Hybrid models, which integrate traditional models with data-driven approaches, have been shown to improve predictions and support the development of models that are more robust to uncertainty. Despite their effectiveness, hybrid models lack explainability. Hence, this study presents the significance of eXplainable Artificial Intelligence (XAI) tools in addressing the challenges associated with hybrid models. Finally, crossovers between the various models are identified and a representative workflow for approaching solids deposition modelling in gully pots is suggested. The paper concludes that explainable hybrid modelling can serve as a valuable tool for gully pot management, as it addresses key limitations present in existing models.

## HIGHLIGHTS

Existing models are presented and discussed.

Integrating data-driven and traditional models enhances performance.

Explainability could pose a challenge to adopting hybrids.

A review of crossovers between different models is conducted to explore their limitations and propose potential improvements.

A workflow is developed to address the challenges associated with the implementation of explainable hybrids for prediction.

## INTRODUCTION

The kerb and gully drainage systems (Figure 1) are one of the most common forms of road drainage used in the United Kingdom. Gullies are provided in proportion to the area of road surface drained (Department for Transport 2020); in practice, 100 gullies may be expected to yield 7 m^{3} of debris (Butler *et al.* 2018). The gully grate provides cover to a buried component known as the gully pot (Figure 2).

Surface runoff and underground drainage networks are connected by these gully pots. They are designed to minimise solids deposition in drainage systems, which contributes to blockages in the drainage network, reduced sewer system efficiency, urban flooding and increased pollution of receiving water bodies (Forty 1998; British Standards Institution 2021). Trapped gully pots have a solids collector, sometimes known as a solids trap, which captures sediments that could otherwise escape through the grate. The deposition of solids in road gullies is influenced by climatic- and anthropogenic-driven processes, which can be categorised into three phases: Solids Build-Up (SB), Solids Wash-Off (SW), and Solids Retention (SR) (Rietveld *et al.* 2020b). SB processes are mostly time-dependent and include variables in the contributing area such as traffic intensity, road surface type, solids particle size, and street sweeping frequency. SW processes are mostly climate-dependent and include rainfall amount and intensity, surface runoff, wind action and temperature. SR processes directly impact the accumulation of solids in gully pots and include flow rate, gully cross-section, depth of the solids trap, solids type and gully filling degree or position of outlet pipes. It is important to assess techniques for predicting the interaction between these processes to better understand their potential effects on the deposition of solids in gully pots (Rietveld *et al.* 2020b).

To find an optimised approach to the deposition of solids in gully pots, conceptual, deterministic, statistical, and hybrid models need to be further explored to determine the differences and overlap (Obropta & Kardos 2007). For example, a deterministic model may include some stochastic elements to account for uncertainty or variability in the system. Similarly, a statistical model may use deterministic equations to model the relationship between variables. Since the boundaries between models are not always clear, it is important to understand the strengths and limitations of these models. By understanding the crossovers between them, they can be integrated to produce explainable hybrids that offer improved prediction accuracy. Following the above discussion, the objectives of this work are as follows:

Highlight the strengths and limitations of existing models that predict solids deposition processes in gully pots.

Explore the fusion of data-driven and traditional models to develop hybrid models that enhance the overall performance.

Consider the challenges involved in deploying hybrid models for solid deposition prediction.

Develop a clear workflow for explainable hybrid modelling.

The remaining sections of this study are organised as follows: Section 2 describes solids deposition procedures in gully pots. Section 3 presents a review of existing models for the prediction of solids deposition. The limitations in utilising existing models are discussed in Section 4 along with how explainable hybrid models can be used to enhance them. Finally, Section 5 explores ways to ensure that the hybrid models used for prediction are explainable to all stakeholders.

## DEPOSITION OF SOLIDS IN GULLY POTS – DESCRIPTION OF PROCESSES

The amount of deposited solids in gully pots is influenced by climatic and anthropogenic processes. These processes are divided into three major phases as described below.

SB processes within a defined catchment area give rise to accumulation of debris, sediments and other particles that are eventually transported by wash-off processes into gully pots. Some of these time-dependent processes include characteristics of the contributing area such as leaf fall (Nix 2002), road slope (Muthusamy *et al.* 2018), traffic intensity (Chow *et al.* 2015), road surface roughness (Zhao *et al.* 2018), particle size distribution, mass and density (Xiao *et al.* 2022), and street sweeping frequency (Egodawatta *et al.* 2013).

SW processes are those by which solids are transported into gully pots. They are mostly climate-dependent and include contributing area, rainfall characteristics (Sartor *et al.* 1974; Egodawatta *et al.* 2007), surface runoff (Zhao *et al.* 2018), wind action (Butler & Karunaratne 1995), daily sunshine hours and solar radiation (Nix 2002), temperature (Post *et al.* 2016), antecedent dry weather period (ADWP) (Post *et al.* 2016; Rietveld *et al.* 2020b), solids geometry, which include initial sediment load and particle size distribution, mass and density (Grottker & Hurlebush 1987; Butler & Karunaratne 1995) and street sweeping frequency.

SR processes that directly impact accumulation of solids in gully pots are the result of their design geometry (Post *et al.* 2016) and flow rate (Deletic *et al.* 1997). These processes have the potential to reduce a gully's hydraulic capacity at any given time and can impact on a gully's trapping efficiency. SR processes include contributing area, rainfall characteristics, gully cross-sectional area and depth of solid trap or gully pot (Post *et al.* 2016), gully grate design pattern (Rietveld *et al.* 2020a), solids geometry and type, and gully filling degree/position of outlet pipes (Post *et al.* 2016; Rietveld *et al.* 2020b).

SB over a catchment is washed off by rainfall and other sediment-transport processes and transferred through gully pots into sewers, although, owing to their trapping efficiency, gully pots can capture and retain these pollutants. This implies that SB, SW, and SR processes are interrelated by a range of overlapping variables: contributing area is an example of a variable relevant to all processes, while specific variables may be important to individual processes.

## REVIEW OF THE EXISTING MODELLING TECHNIQUES FOR THE PREDICTION OF SOLIDS DEPOSITION

To understand and manage the impact of climatic and anthropogenic changes on SB, SW, and SR processes, several models have been developed. However, certain models lack the resilience required to handle the uncertainty that arises from limitations, such as scope and applicability (Litwin & Donigian 1978; Driver & Troutman 1989), the use of complex and non-linear variables (Bertrand-Krajewski *et al.* 1993), inflexibility due to reliance on fixed constants and processes (Grottker & Hurlebush 1987), bias from the use of limited and erroneous variables (Sartor *et al.* 1974; Egodawatta *et al.* 2007), sensitivity to outliers, precision and data mismatch (Rietveld *et al.* 2020b).

Although some models are developed to recognise patterns in non-linear and complex problems and handle uncertainty, they may not be explainable (Lundberg & Lee 2017; Geng *et al.* 2022), thereby leading to the risk of misapplication (Almutairi *et al.* 2021). Models such as bootstrap aggregating and Adaptive Boosting (AdB) (Behrouz *et al.* 2022), generally referred to as ensemble learning, can explicitly integrate deterministic, statistical and stochastic models and potentially exploit the advantages of each approach to reduce prediction error and uncertainty. Other useful hybrid models in this regard include Artificial Neural Networks (ANNs), Random Forests (RF) (Breiman 2001), Gradient Boosting Machines (GBM) (Friedman 2001), and Monte Carlo simulations. Nevertheless, these hybrid models are not explainable and may require complex and resource-intensive computations (Clark 2005; Gelman & Hill 2006; Post *et al.* 2015; Lee *et al.* 2021). XAI involves developing models whose behaviour humans can explain; it is important for ensuring trust, accountability and transparency in the model.

This study reviewed eight models applied in literature for the study of solids deposition in gully pots and beyond. The reviewed models are first summarised in Table 1. These models are subsequently discussed in Sections 3.1–3.3.

| s/n | Model | Type | Deposition phase |
|---|---|---|---|
| 1 | Stormwater Management Model (SWMM) (Sartor *et al.* 1974; Alley & Smith 1981) | Deterministic | SB |
| 2 | Butler and Karunaratne's gully pot trapping efficiency model (Butler & Karunaratne 1995) | Deterministic | SR |
| 3 | Grottker's solids retention model (Grottker & Hurlebush 1987; Grottker 1990; Butler & Karunaratne 1995) | Deterministic | SR |
| 4 | Modified Sartor and Boyd's model (Sartor *et al.* 1974; Egodawatta *et al.* 2007) | Deterministic and statistical | SW |
| 5 | The Non-Point Source model (NPS) (Litwin & Donigian 1978) | Conceptual | SW |
| 6 | Driver and Troutman's model (Driver & Troutman 1989) | Statistical | SW |
| 7 | Debris flow volume model (Lee *et al.* 2021) | Hybrid | SW |
| 8 | Gully pot sediment accumulation model (Post *et al.* 2015, 2016) | Hybrid | SR |


### Deterministic modelling techniques

Deterministic models are based on mathematical equations. Therefore, they may not be applicable to all types of processes, for example the complex processes involved in solids deposition (Bertrand-Krajewski *et al.* 1993). These models struggle with complex variables, since they are limited in scope and applicability, inflexible, constrained by data limitations and bias, and sensitive to outliers. These limitations further aggravate uncertainty, and the models are often calibrated by trial and error (Alley & Smith 1981), with little understanding of their sensitivity to the variables driving solids deposition. Deletic *et al.* (1997) and Rietveld *et al.* (2020b) further suggested that deterministic models may not adequately handle missing data or measurement errors, leading to inaccurate predictions or inferences. Furthermore, Bertrand-Krajewski *et al.* (1993) stated that the precision of a deterministic model relies on how well the calculated values agree with the observed values. This agreement can be measured through objective functions such as the mean square error (MSE) and the least square method (LSM) (Egodawatta *et al.* 2007). Nonetheless, this precision is affected by discrepancies such as measurement errors in sampling and by the assumption that the deterministic model is only a rough approximation of the complex physical processes.
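As an illustration of such an objective function, the MSE between observed and model-calculated values can be computed directly; the sediment-load values below are hypothetical.

```python
import numpy as np

def mean_square_error(observed, calculated):
    """Objective function: mean of squared deviations between
    observed and model-calculated values."""
    observed = np.asarray(observed, dtype=float)
    calculated = np.asarray(calculated, dtype=float)
    return float(np.mean((observed - calculated) ** 2))

# Hypothetical observed vs. modelled sediment loads (kg)
observed = [1.2, 0.8, 1.5, 1.1]
modelled = [1.0, 0.9, 1.4, 1.3]
print(round(mean_square_error(observed, modelled), 6))  # -> 0.025
```

A lower MSE indicates closer agreement between calculated and observed values, which is the sense of "precision" discussed above.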

The Stormwater Management Model (SWMM) simulates SB through a first-order build-up relationship, given in Equation (1) (Sartor *et al.* 1974; Alley & Smith 1981). Software applications such as InfoSWMM (Environmental Systems Research Institute n.d.) and SWMM5 (United States Environmental Protection Agency 2023) have deployed the operational principles of this model to simulate SB. Suárez *et al.* (2013) utilised SWMM5 to develop a sand filter system for managing highway runoff by analysing SB and SW within a catchment:

$$\frac{\mathrm{d}M(t)}{\mathrm{d}t} = \mathrm{ACCU} - \mathrm{DISP}\,M(t) \quad (1)$$

where $M(t)$ represents the accumulated mass of solids at time *t* (kg); ACCU is the daily accumulation rate (kg/d); DISP is the disappearing coefficient (d^{−1}); and ACCU and DISP are coefficients that should be calibrated for every catchment prior to the use of the model, since ACCU is influenced by a series of anthropogenic and hydrometeorological patterns such as wind action, traffic intensity, street sweeping frequency, contributing area (land use) and ADWP. Equation (1) considers SB and SW variables in the deposition process but cannot confidently capture the significance of ADWP, an important variable that influences first flush (Bach *et al.* 2010).
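A minimal sketch of this first-order build-up behaviour, assuming the standard exponential solution of Equation (1) with an initially clean surface; the ACCU and DISP values are illustrative and would need catchment-specific calibration.

```python
import numpy as np

def buildup_mass(t_days, accu, disp):
    """Accumulated solids mass M(t) from the first-order build-up
    relation dM/dt = ACCU - DISP * M with M(0) = 0, whose solution
    is M(t) = (ACCU / DISP) * (1 - exp(-DISP * t))."""
    t = np.asarray(t_days, dtype=float)
    return (accu / disp) * (1.0 - np.exp(-disp * t))

ACCU = 2.0   # daily accumulation rate (kg/d) -- illustrative only
DISP = 0.08  # disappearing coefficient (1/d) -- illustrative only

t = np.array([1.0, 7.0, 30.0, 365.0])  # antecedent dry weather period (d)
mass = buildup_mass(t, ACCU, DISP)
# mass rises towards the asymptote ACCU / DISP = 25 kg as the ADWP grows
```

The saturating shape explains why long ADWPs contribute diminishing additional load, which is why calibrating ACCU and DISP per catchment matters.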

In Butler and Karunaratne's gully pot trapping efficiency model (Equation (2)), *g* is the acceleration due to gravity (m/s^{2}); *d* represents the solids particle diameter (mm); *D* is the gully pot diameter (mm); *Q* is the flow rate (L/s); *v* is the kinematic viscosity (m^{2}/s); and *S* is the particle specific gravity (PSG). However, Equation (2) accounts for uniform and laminar flows, which results in an inaccurate estimation of solids transport and trapping efficiency due to its lack of consideration for turbulence. Additionally, the model assumes that solid particles are spherical, which can be unrealistic since particles can have non-uniform shapes (Collinson *et al.* 2006). Rietveld *et al.* (2020b) highlighted the need for a more comprehensive understanding of the interactions between the limitations in Equation (2), the deposition of solids and the solids' depth in the gully pot to enhance the model's accuracy.

In Grottker's solids retention model (Equation (3)), *Q* is the discharge through the gully pot (L/s), and *x* and *y* are numerical coefficients that depend on solids geometry (diameter) and lie within a proposed fixed range. It has been further argued that this range of fixed numerical coefficients *x* and *y* might not always represent real-world scenarios, indicating that variations from these fixed values could significantly impact the model's accuracy (Bertrand-Krajewski *et al.* 1993). As a result, deterministic models may be rigid because of their reliance on fixed constants and processes, making it challenging to adapt the model to changes in processes or new data.

Sartor *et al.* (1974)'s deterministic pollutant wash-off model, given in Equation (4), assumes that every storm event has the capacity to remove all the available solids from a given surface if the storm continues for an adequate duration. Egodawatta *et al.* (2007) further experimented with Equation (4) to replicate actual wash-off behaviours and proposed the modification represented by Equation (5), which includes the capacity factor parameter $C_F$. Egodawatta *et al.* (2007) also cautioned that wrong assumptions could introduce uncertainty into the modified model (Equation (5)) due to the incorporation of $C_F$:

$$W = W_0\left(1 - \mathrm{e}^{-kIt}\right) \quad (4)$$

$$F_w = \frac{W}{W_0} = C_F\left(1 - \mathrm{e}^{-kIt}\right) \quad (5)$$

where $F_w$ represents the fraction of SW after a storm event; *W* is the weight of solids mobilised after time *t*; $W_0$ is the initial weight of solids on the surface; $C_F$ represents the capacity factor; *k* is a wash-off coefficient (mm^{−1}); and *I* is the rainfall intensity (mm/h). Egodawatta *et al.* (2007) used the LSM to determine the optimal values of *k* and $C_F$. Moreover, they evaluated Equation (5) at three sites using statistical techniques such as the mean and coefficient of variation to understand each site's characteristic data. The coefficient of variation revealed significant inaccuracies in data estimation due to the use of non-representative build-up data for each site. Xiao *et al.* (2022) investigated the applicability of Equation (5) for the solids transport rate and the influence of particle size distribution on wash-off. They recommended calibration of the parameters *k* and $C_F$ for different particle size distributions on a road surface to reduce the model's uncertainty.
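The LSM calibration of *k* and the capacity factor described above can be sketched as a simple grid search over Equation (5); the storm observations below are synthetic and the parameter ranges illustrative.

```python
import numpy as np

def washoff_fraction(t_min, intensity, k, cf):
    """Equation (5): F_w = C_F * (1 - exp(-k * I * t))."""
    return cf * (1.0 - np.exp(-k * intensity * t_min))

# Synthetic observations for one storm (all values illustrative)
rng = np.random.default_rng(0)
t = np.linspace(5, 120, 12)     # minutes since the start of the storm
I = 20.0                        # rainfall intensity (mm/h), assumed constant
true_k, true_cf = 8e-4, 0.6
obs = washoff_fraction(t, I, true_k, true_cf) + rng.normal(0, 0.01, t.size)

# Least-squares calibration of k and C_F by grid search (cf. the LSM)
ks = np.linspace(1e-4, 2e-3, 100)
cfs = np.linspace(0.1, 1.0, 100)
best_k, best_cf, best_sse = None, None, np.inf
for k in ks:
    for cf in cfs:
        sse = np.sum((obs - washoff_fraction(t, I, k, cf)) ** 2)
        if sse < best_sse:
            best_k, best_cf, best_sse = k, cf, sse
# best_k and best_cf land close to the values that generated the data
```

In practice a gradient-based least-squares routine would replace the grid search, but the objective (sum of squared deviations from Equation (5)) is the same.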

### Conceptual modelling techniques

The NPS model estimates SW using Equations (6) and (7), in which the mass of solids washed off at time *t* (kg/m^{2}) is computed from the solids accumulated over the time period *t* (kg/m^{2}), a wash-off coefficient *k*, the surface runoff on the impervious area (mm), and a numerical coefficient *c*.

NPS model development was generalised to consider non-point pollutants from a maximum of five land use categories, which include urban, agricultural, forested, and construction areas, whilst interfacing with the water quality parameters of temperature, dissolved oxygen, suspended solids, and biochemical oxygen demand (Litwin & Donigian 1978). The model also considers seasonal variables stemming from construction activities, gritting, and leaf fall. The NPS model aids in estimating the solids transported by runoff and deposited into gully pots. This helps assess how well gully pots perform in reducing solids deposition in drainage systems and minimising pollution in receiving water bodies (Post *et al.* 2016). However, the mathematical representation of solids deposition and wash-off requires rigorous and separate simulation and evaluation. In the absence of sufficient data, this introduces algorithmic complexity; to address this, a simplified representation of the processes controlling non-point pollution must be established. This raises the dilemma of trading algorithmic complexity for reduced uncertainty and simplicity of application.

Bertrand-Krajewski *et al.* (1993) further argued that the simplified representation of processes used in the NPS Model (Equations (6) and (7)) may not accurately reflect the reality of SW and pollutant transport. This is due to its failure to account for the temporal fluctuations in precipitation, runoff, and pollutant concentrations commonly observed in hydrometeorological variables. As a result, the NPS model must be meticulously calibrated whenever it is applied to a new watershed which is a time-consuming and complex process (Yuan *et al.* 2020).

### Hybrid modelling techniques

In Driver and Troutman's statistical model, *A* is the catchment area (km^{2}) and rainfall duration is expressed in minutes. Linear regression models were developed for three different regions, delineated on the basis of mean annual rainfall, to improve the accuracy of the models. However, the validity of the model is limited to the arid western United States, where annual rainfall is less than 500 mm.

According to Kunin *et al.* (2019) and Burden & Winkler (2008), Bayesian regularisation is a mathematical process that converts a non-linear regression into a statistical problem. Using a Bayesian Regularised Artificial Neural Network (BRANN) model, Lee *et al.* (2021) conducted an analysis of solids deposition prediction accuracy using historical extreme rainfall events and 15 climatic and anthropogenic solids variables, i.e. a hybrid model. They found that BRANN had a higher accuracy, with a coefficient of determination (R^{2}) of 0.911, compared to the Multiple Linear Regression Equations (MLRE) of Marchi & D'Agostino (2004) and Chang *et al.* (2011), which had R^{2} values of 0.693, 0.688, and 0.670. Although Lee *et al.* (2021)'s research is not unique to solids deposition prediction in gully pots, the BRANN model has also been applied in natural gas explosion risk analysis (Shi *et al.* 2019), rainfall prediction for debris flow (Zhao *et al.* 2022) and the optimisation of diesel engine combustion events (Ankobea-Ansah & Hall 2022). BRANN combines the deterministic Bayesian Regularisation Algorithm (BRA) with the stochastic ANN and is known to improve the prediction accuracy of complex systems, incorporate stochasticity, handle uncertainty, and capture data variability (Papananias *et al.* 2017).

In the Bayesian regularisation objective function, $F(w) = L(w) + \lambda\,\Omega(w)$, *w* is a vector of the model parameters; $L(w)$ is the loss function that measures the difference between the predicted values of the model and the actual values of the data; $\Omega(w)$ is the regularisation term that penalises the complexity of the model by adding a penalty that increases with the magnitude of the model parameters; and $\lambda$ is the regularisation parameter that controls the trade-off between the fit of the model to the data and the complexity of the model.
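A minimal sketch of this fit-versus-complexity trade-off, assuming a squared-error loss and an L2 (ridge) penalty; the data and λ values are illustrative.

```python
import numpy as np

def regularised_loss(w, X, y, lam):
    """F(w) = L(w) + lambda * Omega(w): a squared-error data-fit term
    plus an L2 penalty on the magnitude of the parameters."""
    residual = y - X @ w
    return np.sum(residual ** 2) + lam * np.sum(w ** 2)

def ridge_fit(X, y, lam):
    """Closed-form minimiser of the regularised loss (ridge regression)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 50)

w_light = ridge_fit(X, y, lam=0.1)    # weak penalty: close to plain least squares
w_heavy = ridge_fit(X, y, lam=100.0)  # strong penalty: parameters shrink towards zero
```

Increasing λ trades data fit for smaller parameter magnitudes, which is exactly the balance the regularisation parameter controls.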

Hybrid models can nonetheless be difficult to interpret. ANNs, for example, contain several layers of nodes and hidden neurons that can detect intricate patterns and relationships in data, but their architecture and the use of hidden layers (Figure 4) make it difficult to interpret how the network arrives at its final predictions. Lee *et al.* (2021)'s ANN achieved better prediction accuracy than various MLRE, but utilised unexplainable neurons in the hidden layer.

Apart from its ‘black box’ nature, complexity (Uzair & Jamil 2020), weight initialisation (Manish Agrawal *et al.* 2021), and use of activation functions (Uzair & Jamil 2020; Brownlee 2021) are some of the reasons why the hidden layer tends to be less explainable. According to Uzair & Jamil (2020), hidden layers are designed to perform non-linear transformations of the inputs entered into the network. These transformations can become increasingly complex as more hidden layers are added to the network.

Figure 5 shows a network with *n* input nodes, denoted by an input vector, a hidden layer with *m* nodes (*m* + 1 including the 'bias' node), and an output layer. The additional node with the value *b* is called the bias node, which is a scalar value. The 'arrows' (Figure 5) are a combination of weights, which represent the impact of a preceding node on the next node. Using Figure 5 as an example, the inputs contribute their weights to the weighted sum received by each node in the hidden layer, which has a predefined activation function. The activation function defines whether the receiving node will be activated and how active it will be (Brownlee 2021).

The weights can be randomly assigned, fine-tuned and calibrated through the process of back propagation (Ognjanovski 2019). In any case, the weights can be difficult to interpret and understand (Manish Agrawal *et al.* 2021). Furthermore, the neurons in the hidden layer take in a set of weighted inputs and produce an output through an activation function, whose choice can have a significant impact on the behaviour of the hidden layer. Some activation functions such as the Rectified Linear Unit (ReLU) and Softmax can be more difficult to interpret than others (Agarap 2018) and misleading (Ozbulak *et al.* 2018). Simos & Tsitouras (2021) proposed a modification to the commonly used non-linear activation function, the Hyperbolic Tangent (tanh), with the aim of reducing the computational complexity of neural networks (NNs).
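The weighted sum and activation described above can be sketched for a single hidden layer; the layer sizes, random weight initialisation and tanh activation are illustrative choices.

```python
import numpy as np

def hidden_layer(x, W, b, activation=np.tanh):
    """One hidden layer: each neuron forms a weighted sum of the
    inputs plus a bias, then applies the activation function."""
    return activation(W @ x + b)

rng = np.random.default_rng(42)
x = np.array([0.5, -1.0, 2.0])   # n = 3 input nodes
W = rng.normal(size=(4, 3))      # randomly initialised weights (m = 4 neurons)
b = rng.normal(size=4)           # bias values
h = hidden_layer(x, W, b)        # tanh keeps every activation in (-1, 1)
```

Even in this tiny example the individual entries of `h` carry no direct physical meaning, which is the interpretability problem the section describes.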

Equation (11) defines the GLMM as $y = X\beta + Zu + \varepsilon$, where *y* is a column vector, the target variable; *X* is a matrix of the predictor variables; $\beta$ is a column vector of the fixed-effects regression coefficients; *Z* is the design matrix for the random effects; *u* is a vector of the random effects (the random complement to the fixed $\beta$); and $\varepsilon$ is a column vector of the residuals, that part of *y* that is not explained by the model, $X\beta + Zu$.

From Equation (11), the target variable *y* is a linear combination of the fixed-effects part ($X\beta$), the random-effects part ($Zu$), and the residuals ($\varepsilon$). The equation assumes a linear model with fixed and random effects, and the residuals are assumed to be independent and identically distributed with mean zero. The GLMM thus incorporates both deterministic ($X\beta$) and stochastic ($Zu$, $\varepsilon$) elements (Penn State University n.d.).
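A minimal simulation of Equation (11), assuming a Gaussian response and group-level random intercepts (e.g. gully pots clustered by street); all sizes and variance values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

n_obs, n_groups = 200, 10
X = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])  # fixed-effects design
beta = np.array([2.0, 1.5])                                    # fixed-effects coefficients

group = rng.integers(0, n_groups, n_obs)      # e.g. street to which each gully pot belongs
Z = np.eye(n_groups)[group]                   # random-effects design: one intercept per group
u = rng.normal(0.0, 0.8, n_groups)            # random effects (shared within a group)
eps = rng.normal(0.0, 0.3, n_obs)             # i.i.d. residuals with mean zero

y = X @ beta + Z @ u + eps                    # Equation (11): fixed + random + residual
```

Observations in the same group share the same draw of `u`, which is how the GLMM represents the clustering that a purely fixed-effects model would miss.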

To clarify further, $\varepsilon$ is the residual or random error component of the model. It captures the variability in the target variable *y* that is not explained by the fixed-effects predictor variables *X* and the random-effect variables *Z*. The residual introduces the stochastic or random element in the equation, accounting for the unexplained variability in the data (uncertainty), which can arise from measurement bias or unobserved variables. To improve the predictive performance and uncertainty quantification of the GLMM, an autoregressive (AR) component within a Bayesian framework (a stochastic process) is incorporated. This is achieved by using priors and posterior distributions, as shown in Equations (12) and (13).

If *p* represents the order of the AR process, the model can be written as

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \quad (12)$$

where $y_t$ represents the target variable at time *t*; $\phi_1, \ldots, \phi_p$ are the AR coefficients; and $\varepsilon_t$ is the residual term at time *t*, assumed to follow a specific distribution.

Given the observed data $y_1, \ldots, y_T$, Bayesian inference techniques such as Gibbs sampling (Casella & George 1992) and the Metropolis-Hastings algorithm (Chib & Greenberg 1995), which are based on the Markov Chain Monte Carlo (MCMC) methodology, can be used to obtain posterior distributions for the parameters. These posterior distributions provide information about the uncertainty in the estimates. Thus, AR models a time series as a linear combination of its previous values. By acknowledging uncertainty about the model parameters, this approach enables probabilistic predictions about future values of the time series (Martin *et al.* 2021).
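A minimal sketch of this idea for an AR(1) coefficient using a Metropolis-Hastings sampler, assuming Gaussian residuals, a known residual variance, and a flat prior on the coefficient; all values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) series: y_t = phi * y_{t-1} + eps_t
phi_true, sigma = 0.7, 1.0
y = np.zeros(300)
for t in range(1, y.size):
    y[t] = phi_true * y[t - 1] + rng.normal(0.0, sigma)

def log_posterior(phi):
    """Gaussian AR(1) log-likelihood with a flat prior on phi in (-1, 1)."""
    if not -1.0 < phi < 1.0:
        return -np.inf
    resid = y[1:] - phi * y[:-1]
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

# Metropolis-Hastings random-walk sampling of the posterior for phi
samples, phi = [], 0.0
for _ in range(5000):
    proposal = phi + rng.normal(0.0, 0.05)
    if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(phi):
        phi = proposal
    samples.append(phi)

posterior = np.array(samples[1000:])  # discard burn-in
# posterior.mean() approximates phi_true; posterior.std() quantifies uncertainty
```

The spread of `posterior` is the parameter uncertainty that deterministic point estimates discard, and it propagates directly into probabilistic forecasts of future values.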

PyMC3 (Salvatier *et al.* 2016) is a probabilistic programming library which can be used to define the GLMM with a binary response variable and an AR component, and then to sample from the posterior distribution using MCMC methods. For multiclass response variables, a multinomial logistic regression model with an AR component could be deployed (Chan 2023).

Post *et al.* (2015) combined the GLMM with an AR component from a Bayesian perspective. Their objective was to examine the impact of geometrical and catchment variables (*cf.* Post *et al.* 2015) on the filling rates of gully pots, based on monthly measurements of solid bed levels from 300 gully pots over one year. Their results provided insights into the effect of different designs on accumulation in gully pots, allowing for better optimisation of maintenance activities and improved gully pot design.

Post *et al.* (2015) favoured the Bayesian approach over the quasi-likelihood technique because the latter may not accurately represent the true underlying distribution of the data: it can be prone to inaccuracies, is sensitive to outliers, and is not well suited to modelling non-linear relationships between variables (Spiegelhalter *et al.* 2002). By utilising a combination of GLMM and AR from a Bayesian perspective, their model was able to effectively capture complex time series data that exhibited both temporal autocorrelation and dependence or clustering of observations.

However, Clark (2005) and Gelman & Hill (2006) suggested that the combination of GLMM and AR from a Bayesian perspective can lead to increased computational complexity and intensity. Also, the model results interpretation can be more challenging for some researchers and stakeholders not familiar with Bayesian statistics.

## ADDRESSING THE LIMITATIONS IN THE PREDICTION OF SOLIDS DEPOSITION THROUGH THE USE OF HYBRID MODELS

Conceptual and deterministic models rely on physically based equations and linear models to represent SB over a catchment, SW, and passage through gully pots where the solids are either deposited or transferred to sewers. However, these models are limited by uncertainties arising from scope and applicability, precision, inflexibility, data limitations, bias, and sensitivity to outliers. Statistical models, which assume normality and linearity and often rely on linear regression, face similar limitations when dealing with complex relationships or non-linear patterns in data, as reported by Marchi & D'Agostino (2004), Chang *et al.* (2011), and Lee *et al.* (2021). Although hybrid models have been employed to handle uncertainties, they create new issues such as the risk of misapplication, computational complexity and intensity, and poor model interpretability and explainability. Therefore, the use and benefits of XAI tools are discussed in this section to address the limitations of these hybrid models. Table 2 presents a summary of crossovers between models and how they may be addressed.

| Model (type) | Deposition phase | Explanatory variables | Model constraint | Suggested improvement | Target variable | References |
|---|---|---|---|---|---|---|
| Deterministic (traditional) | SW | Rainfall intensity, kinetic energy of rainfall and characteristics of solids | Bias | DT models, cross-validation, data balancing, and feature selection techniques | Mass of washed-off solids | Sartor *et al.* (1974); Egodawatta *et al.* (2007) |
| Conceptual (traditional) | SW | Catchment area, solids accumulation rate, and ADWP | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Servat (1984); Bertrand-Krajewski *et al.* (1993) |
| Conceptual (traditional) | SW | Surface runoff and contributing area (land use) | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Litwin & Donigian (1978); Bertrand-Krajewski *et al.* (1993) |
| Deterministic (traditional) | SW | Rainfall intensity | Complex and non-linear variables | Ensemble Learning | Mass of washed-off solids | Bujon (1988); Alley & Smith (1981) |
| Deterministic (traditional) | SR | Flow rate, PSG, diameter of gully pot, diameter and depth of solids, and kinematic viscosity | Complex and non-linear variables | Ensemble Learning | Mass of retained solids/gully trapping efficiency | Butler & Karunaratne (1995); Rietveld *et al.* (2020b) |
| Deterministic (traditional) | SR | Rainfall amount and flow rate | Inflexibility | DT models, label encoding, and one hot encoding | Mass of retained solids/gully trapping efficiency | Grottker (1990); Butler & Karunaratne (1995) |
| Deterministic (traditional) | SB | ACCU and DISP (dependent on land use, ADWP, etc.) | Complex and non-linear variables | Ensemble Learning | Mass of built-up solids | Sartor *et al.* (1974); Alley & Smith (1981) |
| BRANN (hybrid) | SW | Variables of morphology, rainfall and geology | Explainability and transparency | mRMR algorithm, hybrid (PSO & FURIA) | Debris flow volume | Lee *et al.* (2021) |
| GLMM-AR (hybrid) | SR | Road type, depth of trap, contributing surface area, catchment slope, position of outlet pipe, and presence of water seal | Computational complexity and intensity | Linear models, DT, feature importance analysis or partial dependence plots, XAI techniques | Filling rate of gully pots | Post *et al.* (2015, 2016) |
| Statistical (data-driven) | SW | Catchment area and rainfall attributes | Scope and applicability | TL, hyperparameter tuning, ML models | Mass of washed-off solids | Driver & Troutman (1989) |


### The limitation of complex and non-linear variables

Deterministic and statistical models may not be applicable to all types of systems (Equation (1)) and may struggle with complex variables, for example, the poorly understood impact of ADWP on SB. However, feature selection techniques have been used within climate studies to identify the relative contribution and significance of explanatory variables in forecasting. Warton *et al.* (2015) utilised a residual correlation matrix to examine how 65 alpine tree species (explanatory variables) responded to snowmelt (the target variable). They evaluated the degree of correlation between the tree species and identified their significance and relevance across 75 different sites. Moreover, using the residual correlation matrix, they identified the environmental variables that were strongly correlated with the tree species data. This highlights the effectiveness of the approach in analysing complex community ecology data with multiple variables.

Haidar & Verma (2018) used a combination of genetic algorithm (GA) and particle swarm optimisation (PSO) algorithm (Kennedy & Eberhart 1995) to optimise climate features in rainfall forecasting. Their model outperformed three established standalone models while highlighting the effectiveness of hybrid models in selecting the most relevant climate variables and optimising the network parameters of a NN-based model.

Caraka *et al.* (2019) used the PSO algorithm which combines deterministic and heuristic techniques to identify the most relevant features for accurately predicting particulate matter 2.5 (PM_{2.5}), making it a useful tool for feature selection.

Hu *et al.* (2018) utilised the minimum redundancy maximum relevance (mRMR) algorithm to identify the most significant features for local climate zone classification. The mRMR algorithm achieved high classification accuracy and outperformed other established feature selection techniques, such as principal component analysis (PCA) and correlation-based feature selection, although its stochastic feature selection process produces varied results across algorithm runs. Similarly, Mazzanti (2021) used statistical measures of feature relevance and redundancy to select a feature subset that maximises relevance while minimising redundancy.

In developing a flash flood susceptibility model, Bui *et al.* (2019) used the FURIA-GA feature selection technique, which combines the fuzzy rule-based feature selection method (FURIA) with the GA, to select the most informative features. The FURIA algorithm uses a DT to generate a set of fuzzy rules from the input data; the GA is then utilised to search for the optimal subset of features (Bera 2020).

In identifying the main variables of solids deposition in gully pots, Rietveld *et al.* (2020a) utilised regression trees (RTs). Their study revealed that RTs provided slightly more accurate feature prediction than LMMs, owing to their capability to describe relationships between variables under varying conditions. Lee *et al.* (2021) utilised Pearson's correlation analysis to identify the four most significant variables, out of 15, that affect debris flow volume. These four prominent variables were then used to train the model.

It is important to acknowledge that the effectiveness of a chosen feature selection technique can be influenced by numerous variables present in a complex feature system. These variables may include high correlation, overfitting, large feature space leading to computational intensity, and imbalanced data (Cherrington *et al.* 2019). Therefore, it is crucial to determine an optimal feature selection method based on the characteristics of the data and the problem at hand. It is also equally important to validate the chosen features, to ensure their ability to generalise well to new data.
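The validation point above can be illustrated with a minimal Python sketch, using synthetic data as a stand-in for real gully records and scikit-learn's mutual-information scorer as one hypothetical choice of feature selection technique (both assumptions on our part):

```python
from functools import partial

from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, mutual_info_regression

# Synthetic stand-in for gully inspection data: 200 observations and
# 8 candidate explanatory variables, only 3 of which are informative.
X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=42)

# Rank features by mutual information with the target and keep the top 4;
# the chosen features should then be validated on held-out data.
scorer = partial(mutual_info_regression, random_state=0)
selector = SelectKBest(score_func=scorer, k=4)
X_selected = selector.fit_transform(X, y)

print("Selected feature indices:", selector.get_support(indices=True))
print("Reduced shape:", X_selected.shape)
```

The same pattern applies with any scorer (e.g., an F-test), so the method can be swapped to suit the characteristics of the data and the problem at hand, as argued above.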

Equation (2) underlines the problem of inaccurate estimation of solid transport and trapping efficiency due to lack of consideration for turbulence and an assumption that solid particles are spherical. ML techniques are well-suited for handling various categories of shape and flow patterns (laminar, turbulent, steady, and unsteady) by converting the shapes and patterns into a numerical format that the algorithm can process. This process is known as label encoding (Table 3) (Scikit-learn Developers 2023). An example of label encoding could be assigning numerical values to recognised shape categories before developing a model (Table 3). This is exemplified in the Geng *et al.* (2022) study on predicting litterfall in forests by categorising forest types. Label encoding was used to convert categorical variables such as forest type, vegetation type, and climate zone into numeric variables to predict litterfall production.

| Solids shape category | Representative numerical value |
|---|---|
| Spherical | 0 |
| Angular | 1 |
| Flaky | 2 |
| Rod-like | 3 |
| Discoid | 4 |
| Ovoid | 5 |
| Irregular | 6 |

To address the issue of not accounting for turbulence in Equation (2), one possible solution is to use one hot encoding to represent fluid flow as a binary categorical variable rather than using the conceptualised turbulence correction factor. One hot encoding assigns a value of either 0 or 1 to indicate laminar or turbulent flow, respectively. Gong & Chen (2022) argue that one hot encoding is preferable as it avoids a misleading ranking between categories. However, in cases where a categorical variable has a natural order, such as rating the level of risk posed by solids in a gully (e.g., low, medium and high), label encoding may be a more suitable approach.
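As a concrete sketch of both encodings (the shape categories follow Table 3; the observations themselves are invented for illustration, and pandas/scikit-learn are assumed as tooling):

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical solids-shape observations (categories from Table 3).
shapes = pd.Series(["Spherical", "Angular", "Flaky", "Spherical", "Irregular"])

# Label encoding: one integer per category (fit order is alphabetical).
label_enc = LabelEncoder()
shape_codes = label_enc.fit_transform(shapes)

# One hot encoding: a binary indicator column per category, avoiding any
# implied ranking -- appropriate for a nominal variable such as flow regime.
flow = pd.Series(["laminar", "turbulent", "turbulent", "laminar", "laminar"])
flow_onehot = pd.get_dummies(flow, prefix="flow")

print(dict(zip(label_enc.classes_, range(len(label_enc.classes_)))))
print(flow_onehot.head())
```

For an ordered category such as gully risk level (low, medium, high), the integer codes of label encoding would instead be the appropriate choice, as noted above.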

To prevent algorithmic complexity, a simplified version of Equations (6) and (7) is used to represent the NPS model. However, this simplified approach disregards non-linear relationships and fluctuations in hydrometeorological variables, which can result in model uncertainty. As a result, the model may require frequent recalibration for each specific application. Whilst deterministic and statistical models may not be easily adjusted for complex fluctuations in hydrometeorological variables (Litwin & Donigian 1978) and changes in a contributing area (Deletic *et al.* 1997), ensemble learning techniques such as Adaptive Boosting (Freund & Schapire 1995), GBM (Friedman 2001), and Stochastic Gradient Boosting combine several base models to produce one optimal predictive model and can readily learn complex relationships, reducing the need for frequent recalibration. Furthermore, the performance of the trained model can be evaluated and validated on a separate test dataset, using evaluation metrics such as MSE and mean absolute error (MAE), along with resampling techniques like cross-validation (Refaeilzadeh *et al.* 2009).
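A minimal sketch of this workflow, with synthetic data standing in for hydrometeorological predictors of washed-off solids (an assumption) and scikit-learn's gradient boosting as the ensemble learner:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for hydrometeorological predictors of washed-off solids.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=1)

# Gradient boosting combines many shallow trees into one predictive model,
# learning non-linear relationships without hand-tuned deterministic constants.
gbm = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=1)

# 5-fold cross-validated mean absolute error as the evaluation metric
# (scikit-learn returns negated MAE, hence the sign flip).
mae_scores = -cross_val_score(gbm, X, y, cv=5, scoring="neg_mean_absolute_error")
print("Mean MAE across folds:", mae_scores.mean())
```

Because the model relearns relationships directly from data, retraining on new records replaces the manual recalibration step discussed above.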

In addition to the use of ensemble learning to resolve the recalibration issues identified in Equations (6) and (7) and the use of one hot encoding to deal with lack of consideration for turbulence in Equation (2), DT-based models such as RF (Breiman 2001) and RTs (Morgan & Sonquist 1963) can combine multiple trees for improved model performance and handle categorical variables without the need for one hot encoding (Gross 2020).

### The limitation of scope and applicability

Transfer Learning (TL) (Bozinovski & Fulgosi 1976) is a technique that allows a NN to adapt pre-trained models to new tasks or datasets. By leveraging the knowledge learned from a previous task, the model can improve its performance on a different problem, thus increasing its applicability and widening its scope. However, it is important to note that TL alone does not automatically select the best variables, algorithms, and hyperparameters for a given problem, as discussed by Yogatama & Mann (2014). To address this, hyperparameter tuning (Feurer & Hutter 2019), which involves selecting the optimal set of hyperparameters for a given model and project by learning from historical training data (Brownlee 2019), is necessary. Therefore, combining TL with hyperparameter tuning is crucial in dealing with scope and applicability issues. For example, the dataset used in generating the regression model for Equation (8) can be fine-tuned for applications in higher annual rainfall areas. Furthermore, the use of TL ensures the continual use of the existing model.
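To make the transfer idea concrete, here is a minimal NumPy sketch assuming a toy two-layer network: a hidden layer learned on a data-rich source task is frozen, and only the output weights are fine-tuned on a small target dataset (all data, dimensions, and learning rates are illustrative, not taken from the studies cited):

```python
import numpy as np

rng = np.random.default_rng(0)

def train_output_layer(H, y, w_out, lr=0.1, epochs=500):
    """Gradient descent on the output weights only (hidden layer frozen)."""
    for _ in range(epochs):
        pred = H @ w_out
        grad = H.T @ (pred - y) / len(y)
        w_out -= lr * grad
    return w_out

# --- Source task: plenty of data ---
X_src = rng.normal(size=(500, 4))
y_src = X_src @ np.array([1.0, -2.0, 0.5, 3.0])

W_hidden = rng.normal(scale=0.5, size=(4, 8))      # shared representation
H_src = np.tanh(X_src @ W_hidden)
w_out = train_output_layer(H_src, y_src, np.zeros(8))

# --- Target task: scarce data, related mapping ---
X_tgt = rng.normal(size=(30, 4))
y_tgt = X_tgt @ np.array([1.2, -1.8, 0.4, 2.9])    # similar, not identical

# Transfer: reuse the frozen hidden layer, fine-tune only the output weights.
H_tgt = np.tanh(X_tgt @ W_hidden)
w_transfer = train_output_layer(H_tgt, y_tgt, w_out.copy())

mse = np.mean((H_tgt @ w_transfer - y_tgt) ** 2)
print("Target-task MSE after fine-tuning:", mse)
```

In practice the same pattern is applied in deep learning frameworks, with hyperparameter tuning deciding the learning rate and which layers to re-train, as in Subel *et al.* (2023).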

Subel *et al.* (2023) applied TL in sub-grid scale turbulence modelling by enhancing the capabilities of convolutional neural networks (CNNs), thus enabling them to extrapolate from one system to another. This was achieved by introducing a general framework that identifies the best re-training procedure for a given problem based on physics and NN theory. Hyperparameter tuning was then used to optimise the performance of the NN by searching over a specified hyperparameter space and finding the best layers to re-train. TL has also been used to improve the efficiency of distinct wastewater treatment processes. Pisa *et al.* (2023) used TL to develop a control system for wastewater treatment plants. Data from a source plant were used to train a deep NN and then the network was fine-tuned by data from the target plant. They evaluated the transfer suitability of the trained network by comparing its performance on the target plant with that of a network trained only on the target plant. Russo *et al.* (2023) used a combination of algorithms which includes RF, support vector regression, and ANN, to predict sediment and nutrient first flush. Their framework was used to identify the most influential variables that contribute to sediment and nutrient pollution in any geographical region, thus eliminating scope and applicability limitations.

### The limitation of inflexibility

A demonstration of this flexibility is Lee *et al.* (2021)'s debris flow volume model: their NN model outperformed various multiple linear regression models in terms of fitting coefficients. Kim *et al.* (2022) proposed a novel hybrid model for water quality forecasting. The methodology of their research involved the use of data decomposition, ML, and error correction, which eliminates the reliance on fixed deterministic constants and identifies underlying patterns and trends in data. Furthermore, they built an error correction framework similar to Figure 6 by combining the variational mode decomposition (VMD) algorithm (Dragomiretskiy & Zosso 2013) and a Bidirectional Long Short-Term Memory (BiLSTM) NN (Schuster & Paliwal 1997), which in turn improved forecast accuracy by correcting errors in the data.

As shown in various studies, the hybrid model's ability to handle real-time data and correct errors makes it more accurate than depending on fixed deterministic constants (Li *et al.* 2021; Peng *et al.* 2022).

### The limitation of bias from the use of non-representative data, missing data, and outliers

Egodawatta *et al.* (2007) introduced Equation (5) as a modification to Sartor and Boyd's pollutant wash-off model (Equation (4)) to address the issue of biased and unreliable predictions, resulting from erroneous assumptions. This suggests that deterministic models may not always address bias by simply adding constraints to the model. However, when Equation (5) was subjected to a basic statistical evaluation using data from different sites, it became apparent that using non-representative build-up data could exacerbate bias in modelling.

Uncertainties with the modified model (Equation (5)) could be addressed by implementing advanced statistical and stochastic techniques that deal with outliers and high-dimensional or redundant data. These techniques can extract information from the data before training the model to effectively deal with bias. For example, in predicting nitrogen, phosphorus, and sediment mean concentrations in urban runoff, Behrouz *et al.* (2022) made use of RF, an algorithm known for its ability to handle noisy data and outliers. Lee *et al.* (2021) used cross-validation in the development of their debris flow volume model. Their study randomly partitioned the data associated with the four prominent variables into 10 subsets of approximately equal size, with a 7:3 ratio for training and validation datasets. They then trained their model on the training data and evaluated its performance on the validation data. This process was repeated 10 times, with each of the 10 subsets being used as the validation set once. The average performance of the model over the 10 iterations was then calculated to provide a more reliable estimate of its performance on unseen data. By using cross-validation, Lee *et al.* (2021) ensured that their model was valid and unbiased, and that it was not overfitting to a particular subset of data.
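The cross-validation procedure described above can be sketched as follows, with synthetic data standing in for the four prominent debris-flow variables (an assumption) and the fold count mirroring Lee *et al.* (2021):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic stand-in for the four prominent debris-flow variables.
X, y = make_regression(n_samples=120, n_features=4, noise=8.0, random_state=7)

kf = KFold(n_splits=10, shuffle=True, random_state=7)
fold_mse = []
for train_idx, val_idx in kf.split(X):
    model = RandomForestRegressor(n_estimators=100, random_state=7)
    model.fit(X[train_idx], y[train_idx])
    fold_mse.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))

# Averaging over the 10 folds gives a more reliable estimate of performance
# on unseen data than any single train/validation split.
print("Mean 10-fold MSE:", np.mean(fold_mse))
```

Each of the 10 subsets serves as the validation set exactly once, so no single partition of the data dominates the performance estimate.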

Most traditional statistical models such as linear regression, logistic regression, and analysis of variance (ANOVA) are known to be sensitive to outliers and biased towards certain groups within the variables. This implies that the choice of modelling technique can affect the accuracy and validity of the models. According to Maharana *et al.* (2022), the use of more representative and robust data preprocessing techniques can effectively address missing data, bias, and data quality issues in solid deposition modelling.
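As one example of the robust preprocessing Maharana *et al.* (2022) advocate, a simple interquartile-range rule can flag and cap outliers before training; the rainfall readings below are invented for illustration:

```python
import pandas as pd

# Hypothetical rainfall-intensity readings with two implausible outliers.
rain = pd.Series([3.2, 4.1, 2.8, 5.0, 3.7, 95.0, 4.4, 3.9, -40.0, 4.6])

# Interquartile-range rule: flag values beyond 1.5*IQR of the quartiles,
# then winsorise (cap) them instead of discarding the observations.
q1, q3 = rain.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = rain.clip(lower, upper)

print("Flagged outliers:", rain[(rain < lower) | (rain > upper)].tolist())
```

Capping rather than deleting preserves the sample size, which matters when, as is common for gully inspection records, observations are scarce.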

## ADDRESSING THE CONCERNS OF USING HYBRID MODELS IN GULLY POT SOLIDS DEPOSITION PREDICTION

Hybrid models that incorporate data-driven techniques have been recognised to handle uncertainties where traditional models may struggle (Post *et al.* 2015; Lee *et al.* 2021). However, these models require intricate and resource-intensive computation (Figure 4) and may be unexplainable due to their black box nature, posing the risk of model misapplication. As suggested in Section 4.1 and Table 4, the use of algorithms such as PDP, mRMR, PCA, FURIA-GA, and RT has demonstrated effectiveness in mitigating model misapplication by guiding feature selection during model development.

| Method | Algorithms | Benefits |
|---|---|---|
| Simplification | L1 or L2 regularisation (Ng 2004), sigmoid function (Cramer 2003), modified tanh (Simos & Tsitouras 2021), smooth function approximation (Shurman 2016; Ohn & Kim 2019), weight sharing (Pham et al. 2018) | Reduces the complexity of the model and the number of hidden layers, simplifies the activation function, and prevents overfitting |
| Layer-wise explanation | Gradient-weighted Class Activation Mapping, Grad-CAM (Selvaraju et al. 2016); Layer-wise Relevance Propagation, LRP (Bach et al. 2015); integrated gradients (Hsu & Li 2023) | Analyse the output of each layer in a NN to gain insights into the behaviour of the network, thus identifying the importance of various layers |
| Model-agnostic interpretation | Local Interpretable Model-agnostic Explanations, LIME (Ribeiro et al. 2016); SHapley Additive exPlanations, SHAP (Lundberg & Lee 2017); Recursive Feature Elimination, RFE (Guyon et al. 2002); Principal Component Analysis, PCA (Abdi & Williams 2010); mutual information (Shannon 1948); partial dependence plot, PDP (Friedman 1991); permutation feature importance, PFI (Breiman 2001) | Visual feature importance insights that explain the behaviour of complex models regardless of the model's architecture. Identify the variables that are most important for producing a given output and provide insights into correlations between variables, and into the behaviour and transparency of the model |
| Tree-based explanation | Classification and RTs such as CART, ID3, C4.5, CHAID, MARS, RF, GBT (Loh 2008; Hannan & Anmala 2021) | Inherently interpretable ML models that can be used in conjunction with other XAI tools |

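As a small illustration of one model-agnostic technique from Table 4, the sketch below computes permutation feature importance with scikit-learn on synthetic data (both the data and the choice of library are assumptions for illustration):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic stand-in: 5 predictors, with the first two genuinely informative
# (shuffle=False keeps the informative columns first).
X, y = make_regression(n_samples=200, n_features=5, n_informative=2,
                       shuffle=False, noise=2.0, random_state=3)

model = RandomForestRegressor(n_estimators=200, random_state=3).fit(X, y)

# Permutation feature importance: shuffle one column at a time and measure
# the drop in the model's score, regardless of the model's architecture.
result = permutation_importance(model, X, y, n_repeats=10, random_state=3)
print("Mean importance per feature:", result.importances_mean.round(3))
```

Because the technique only needs predictions, the random forest here could be swapped for any 'black box' hybrid without changing the explanation step.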

It is imperative to understand why and when stakeholders need insights from the ‘black box’ models that are used in predicting the performance of variables in solids deposition. These needs include informed stakeholder decision-making, directed planning of future data collection, data troubleshooting, informed feature extraction, anomaly detection, and embedding trust.

Data troubleshooting plays a crucial role due to the prevalence of ‘dirty’ data, potential errors in preprocessing code, and the risk of target leakage which occurs when the training data contains information about the target, but similar data will not be available during model prediction. This can adversely impact the overall performance of the model as shown in Post *et al.* (2016)’s robust outlier detection regime while developing their hybrid model for solids deposition in gully pots. Understanding the patterns identified by models allows for the identification and resolution of errors. Additionally, an understanding of model-based insights will enable feature extraction, which is achieved by creating new features from raw data or existing features. These insights become important when dealing with large datasets or lacking domain knowledge. By selecting or designing features that align with domain knowledge, the resulting model becomes more transparent and easier to explain to non-experts. This can be particularly important when the model's predictions impact critical decisions or require justification to gain trust from stakeholders. Lack of transparency in ‘black box’ models can pose a challenge in stakeholder decision-making, raise ethical concerns and possible discriminatory outcomes, potentially preventing specific groups from accessing opportunities. For example, a county that relies solely on data-driven systems to manage gully pot cleansing may disregard human contributions, potentially leading to a reduction in funds allocated to a gully jetting company responsible for routine and reactive cleansing of the county's gullies.

In the context of human decision-making, model insights hold significance as they can inform decisions made by individuals, sometimes surpassing the importance of the predictions themselves. There are also growing concerns about the autonomy of ML systems in their ability to make decisions and take actions without input from human oversight, established deterministic theories, and conceptual thinking (Subías-Beltrán *et al.* 2022). These concerns underline the need for explainable models, allowing humans to understand how they work and providing insights into the decisions made. This is important where decisions based on ML models can have consequences such as discriminatory outcomes. Furthermore, insights from models can guide future data collection efforts, helping local councils determine which types of data are most valuable for solids deposition management and investment.

The Shapley value quantifies the contribution of each variable *i*, an independent variable, as shown in the following equation:

$$\phi_i(f, x) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\left[f\left(x_{S \cup \{i\}}\right) - f\left(x_S\right)\right]$$

where $\phi_i$ is the Shapley value for land use, *f* is the black box model, *x* is an input data point which is a single row in the gully inspection data, and $S \subseteq N \setminus \{i\}$ represents iteration over all possible subsets and combinations of variables to ensure that interactions between individual variables are accounted for. If land use and solids type form one of the subsets under consideration, we can get the model output for this subset with ($f(x_{S \cup \{i\}})$) and without ($f(x_S)$) the variable of interest (i.e. land use). The difference between $f(x_{S \cup \{i\}})$ and $f(x_S)$ explains how land use contributed to the prediction in that subset.

| Road hierarchy | Solids type | Season | Rainfall intensity (mm/hour) | Dry period (days) | Land use | Solids level |
|---|---|---|---|---|---|---|
| Service | Silt | Winter | 15.63 | 0 | Residential | 75% |
| Lane | Leaves | Summer | 0.2 | 6 | Agricultural | 50% |
| Service | Silt | Autumn | 9.3 | 4 | Residential | 50% |
| Strategic | Leaves | Winter | 0.2 | 6 | Residential | 50% |
| Minor | Silt | Spring | 15.63 | 0 | Recreational | 100% |


For example, if the model output (solids level) with land use ($f(x_{S \cup \{i\}})$) is 75%-filled and without land use ($f(x_S)$) is 50%-filled, then land use contributes 25 percentage points, otherwise known as the marginal value. The same process is repeated for each possible combination of subsets, which are additionally weighted according to how many of the total number of variables (*n*) are in the subset.

The number of possible variable subsets grows exponentially ($2^n$), with *n* representing the number of variables. For example, the gully inspection data in Table 5 has six independent variables and therefore 64 possible subset combinations, which makes it computationally intense to get the average contribution of one variable. According to Lundberg & Lee (2017), Kernel SHAP, an approximation technique that samples variable subsets and fits a linear regression based on the samples, can be used to eliminate the need for intense computation. Other approximation techniques are Tree SHAP and Deep SHAP, which are used for tree-based and deep NN models, respectively. The SHAP summary plot (Figure 7) presents a concise and easily understandable overview of the model's feature importance (Lundberg & Lee 2017; SHAP 2018).
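The exact computation described above can be sketched for a toy additive model with three hypothetical variables (so only $2^3 = 8$ subsets need enumerating); the variable names and contribution values are invented for illustration:

```python
import itertools
import math

# Toy "black box": solids level (%) as an additive function of three
# hypothetical variables -- land use, rainfall intensity, dry period.
VARIABLES = ["landuse", "rainfall", "dry_period"]

def model(subset, x):
    """Prediction using only the variables in `subset` (others at baseline)."""
    contrib = {"landuse": 25.0, "rainfall": 15.0, "dry_period": 10.0}
    return 50.0 + sum(contrib[v] * x[v] for v in subset)

def shapley(variable, x):
    """Exact Shapley value: weighted average marginal contribution of
    `variable` over every subset S of the remaining variables."""
    others = [v for v in VARIABLES if v != variable]
    n = len(VARIABLES)
    value = 0.0
    for r in range(len(others) + 1):
        for S in itertools.combinations(others, r):
            weight = (math.factorial(len(S)) * math.factorial(n - len(S) - 1)
                      / math.factorial(n))
            value += weight * (model(S + (variable,), x) - model(S, x))
    return value

x = {"landuse": 1, "rainfall": 1, "dry_period": 0}   # one inspection record
print("Shapley value of land use:", shapley("landuse", x))
```

For this additive model the Shapley value of land use recovers its full 25-point marginal contribution (within floating-point tolerance); with six variables the same loop would visit 64 subsets, which is exactly the cost Kernel SHAP approximates away.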

Geng *et al.* (2022) used SHAP values to demonstrate the importance and correlation of various explanatory variables in predicting litterfall production, a crucial solid build-up process. Similarly, Russo *et al.* (2023) used a combination of ‘black box’ algorithms which include RF, support vector regression, and ANN, to predict sediment and nutrient first flush. The study used 76 potential predictive variables as input to the machine learning algorithm. The SHAP algorithm was then used to determine the feature importance of the variables and to improve the interpretability and explainability of the ‘black box’ models.

Likewise, classification trees are simple and interpretable models that visually represent the decision-making process of a ‘black box’ model and explain how the model arrived at a specific prediction. Rietveld *et al.* (2020b) used RTs in explaining the significance and correlation between SB, wash-off, and retention predictors in predicting solid accumulation rate in gully pots.

## CONCLUSION AND FUTURE WORK

Traditional models have been used to estimate the deposition of solids in gully pots, but these methods have limitations. It has been demonstrated that explainable hybrid models can lessen the effects of these limitations.

This study offers a promising approach to overcome the limitations of traditional models in simulating complex systems such as SB, wash-off, and retention processes in gully pots. By integrating traditional and data-driven models, hybrids are produced to handle complex and non-linear variables, improve the scope and applicability of existing models, increase their flexibility, and reduce bias from non-representative data, missing data, and outliers. However, the resource-intensive computation requirements and lack of explainability of hybrid models can lead to misapplication and flawed decision-making. There is a need for resource-efficient and explainable hybrid models that allow stakeholders to understand how a model works and why it makes certain decisions. SHAP values, DTs, and other eXplainable Artificial Intelligence (XAI) tools can enhance the interpretability and explainability of ‘black box’ models, enabling stakeholders to make informed decisions based on reliable insights. By adopting these XAI tools, we can mitigate the risks associated with hybrid models and ensure that they are transparent, ethical, and beneficial. As explainable hybrids evolve, they will become an increasingly valuable tool for addressing complex modelling challenges in solids deposition on road surfaces and in urban stormwater management.

Future works will utilise explainable hybrid architecture to improve the predictive accuracy of solids deposition using gully inspection data from multiple local authorities.

## AUTHORS’ CONTRIBUTIONS

C.F.E. contributed to conceptualisation, methodology, and writing. A.C. contributed to writing, review, and supervision. H.B. contributed to review and supervision. E.E. and C.S. reviewed the article.

## FUNDING

There was no external funding for this research.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Feature Selection using Genetic Algorithm*. Available from: https://medium.com/analytics-vidhya/feature-selection-using-genetic-algorithm-20078be41d16 (Accessed 10 April 2023)

**29**(5),

*Tree-Based Models: How They Work (In Plain English!)*Available from: https://blog.dataiku.com/tree-based-models-how-they-work-in-plain-english (Accessed 11 April 2023)

*Proceedings of the 31st Annual Conference on Advances in Neural Information Processing Systems*, California, United States of America, 4–9 December, pp. 1–10.

L1 vs. L2 Regularization, and Rotational Invariance

*Proceedings of the 10th International Conference on Urban drainage modelling*, Quebec, Canada, 20–23 September, pp. 59–61.

*Contribution à L'étude des Matières en Suspension du Ruissellement Pluvial à L'échelle D'un Petit Bassin Versant Urbain (Contribution to the Study of Suspended Matter in Stormwater Runoff at the Scale of a Small Urban Watershed)*. PhD Thesis.

*Highways Asset Management Framework 2015–2020*. Available from: https://www.southglos.gov.uk/documents/Highways-Asset-Management-Framework2015-2020.pdf (Accessed 20 September 2022)

*Drainage Data FOI Ref FIDP/017* (Accessed 25 May 2022)