## Abstract

This paper compares predictions of the annual number of stormwater discharges from a drainage system through a storm overflow. The predictions have been computed with the storm water management model (SWMM) and with probabilistic models. For the probabilistic modelling, simple statistical models (logit, probit, Gompertz and linear discriminant analysis) have been applied, and for the hydrodynamic modelling a generator of synthetic rainfall based on the Monte Carlo method has been used. The analyses have shown that the logit, probit and Gompertz models give outputs that are comparable with the results of hydrodynamic modelling and concordant with observations, whereas the annual number of stormwater discharges predicted by the linear discriminant analysis model is significantly lower than the number obtained by hydrodynamic modelling. The calculations have confirmed that statistical models can be used as an alternative to labour-intensive and complex hydrodynamic models. The statistical models can successfully predict storm overflow operation provided that measurements of rainfall in the catchment and of the overflow filling are available.

## INTRODUCTION

The Water Framework Directive (2000/60/EC 2000) requires all EU member states to protect and improve the environmental condition of all waters. As a result, for the sake of pollution prevention, it is necessary to incorporate quantitative or qualitative criteria when dealing with storm overflows in stormwater drainage systems. One such criterion is the permissible number of stormwater overflow discharges in a year. The assessment of the operating frequency can be based on the results of multi-year measurements. These, however, generate high costs and refer to already existing facilities. Therefore, hydrodynamic models are increasingly used in practice (Thorndahl & Willems 2008; Thorndahl 2009). The computational process accounts for the temporal and spatial variation of parameters that are of key importance in generating the runoff from a catchment (including the direction and speed of the inflow of moist air masses due to atmospheric fronts, and soil humidity), and thus for the triggering of the overflow. Nevertheless, to make hydrodynamic modelling results reliable, it is necessary to calibrate a model using a multi-year, high-resolution measurement series of precipitation and stormwater drainage system flows. These data are not always available. An alternative to hydrodynamic models can be provided by statistical models (Thorndahl *et al.* 2008). They are most frequently employed in situations in which the dependent variable is qualitative and takes zero/one values, which explains their usefulness for describing the probability (*p*) of an event's occurrence or absence. Such models, which include, for example, the linear discriminant analysis (LDA), logit, probit and Gompertz models, are particularly widely used in economic, social and medical sciences (Berger 1981; Baty & Delignette-Mulle 2004; Gil *et al.* 2006; Bergua *et al.* 2008; Ebrahimzadeh *et al.* 2015; Galán *et al.* 2015). They are far less frequently employed for typical engineering problems in applied hydrology, geotechnics, geomorphology, wastewater treatment or prediction of wastewater inflow at the treatment plant (Ayalew & Yamagishi 2005; Heyer & Stamm 2013).

The main objective of the presented study was to check whether statistical models can be used as alternatives to complex and tedious-to-develop hydrodynamic models when assessing the operation of storm overflows. A secondary objective was to assess how the number of rainfall events, which is random in nature, influences the number of overflow events.

The article is divided into two parts. The first part describes the proposed algorithm for determining probabilistic model for predicting the annual number of overflows, and the calibration of the storm water management model (SWMM) of storm overflow. In the second part the calibration of the SWMM model is conducted, and the individual steps of the algorithm are presented using example data. As part of the analyses, at the stage of probabilistic model development, statistical models for predicting overflow system operation have been determined using measurements of precipitation, flows, and storm overflow operation. The empirical distributions of independent variables in the statistical models have been determined and their theoretical distributions have been assigned. These data have been the basis for the creation of precipitation generator using Monte Carlo method. In the next stage, the models for predicting overflow system operation and the developed precipitation generator have been employed for the calculation of annual number of stormwater overflow discharges using the probabilistic model. In the last stage, the calculation results obtained with SWMM and probabilistic models have been compared.

## CASE STUDY

The urban catchment of the Si9 sewer, which has an area of 62 ha, constitutes the study area. It is located in the southeastern part of Kielce, Poland. The city is the capital of the Świętokrzyskie voivodeship, has an area of 109 km^{2}, and a population of 200,000 people. The average population density in the Si9 catchment is 21.4 people·ha^{−1}, which is greater than in the entire city (18.3 people·ha^{−1}). The catchment's land use mainly consists of residential and institutional buildings, streets and car parks (51.6%). The remaining part of the catchment (48.4%) includes green areas (lawns and parks – 47.2%) and school playgrounds (1.3%). The road network density in the Si9 catchment is 180 m·ha^{−1}. The height difference in the study area is 11.2 m: the highest point is at an elevation of 271.2 m, and the lowest at 260 m. The average slope in the catchment is 0.71%. The total length of sewer network pipes in the catchment is 5.6 km, and their longitudinal slope ranges from 0.1% to 2.7%. The diameter of the pipes varies from 200 to 1,250 mm.

Stormwater collected by Si9 sewer is directed to the stormwater treatment plant (SWTP) via the diversion chamber (DC) (Figure 1). The treatment plant consists of the following: a horizontal settling tank (ST), a coalescence separator (SEP), and a control well (CW) – where stormwater samples are taken in order to evaluate SWTP operation in terms of removing suspension from stormwater stream. If the filling level of the DC does not exceed 0.42 m, the whole amount of stormwater is conveyed by four Ø 400 mm pipes to the ST. Stormwater from the ST flows simultaneously through a Ø 200 mm pipe to the SEP, and through two Ø 500 mm pipelines to the CW (to which stormwater from the SEP is also delivered), and finally is discharged to the receiver, i.e. the river Silnica. If, however, the DC filling exceeds 0.42 m, a portion of stormwater is conveyed directly (without treatment), via the storm overflow (OV) and overflow channel, to the river Silnica.

In the years 2009–2011, continuous measurements of the stormwater amounts leaving the catchment and of the DC filling were conducted. The amount of stormwater was measured with a Teledyne ISCO 2150 flow meter installed on the Ø 1,250 mm pipe at a distance of 3.0 m from the inlet to the DC. The level of stormwater in the DC was measured using an ultrasonic probe at 30 s intervals.

## MATERIALS AND METHODS

### Precipitation data

The assessment of the storm overflow operation has been performed based on rainfall series for the 2008–2016 period. The rainfall data were collected at a precipitation station, which was put into operation in 2008, and is located 2 km north of the Si9 sewer catchment borders. Until 2014, it was the only facility in Kielce that measured precipitation depth. Observations, which have time resolution of 5 min and tolerance of 0.1 mm were recorded there continuously. In accordance with the hydraulic dimensioning and verification of drain and sewer systems standard (DWA-A 118 2006), based on precipitation time series analysis, it has been assumed that the minimal inter-event time was equal to 4 h. Consequently, 457 individual precipitation events have been distinguished, the duration of which (t_{d}) ranged from 20 to 2,366 min. The inter-event time (t_{bd}) varied from 230 to 86,400 min (60 days). The total rain depth (P_{c}) of the events identified in the period of concern ranged between 1.3 and 45.2 mm. The maximum 15 min precipitation depth was P_{(t=15)} = 0.3–20.4 mm, whereas the maximum 30 min precipitation depth was P_{(t=30)} = 0.3–21.2 mm.
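The event separation rule described above can be sketched in a few lines. This is a minimal illustration, assuming the rainfall record is a list of 5 min depths in mm; the function name `split_events` and its parameters are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def split_events(depths_mm: List[float], step_min: int = 5,
                 min_gap_min: int = 240) -> List[Tuple[int, int, float]]:
    """Split a rainfall series into events separated by a dry gap.

    Returns one (start_index, end_index, total_depth_mm) tuple per event.
    Two wet intervals belong to the same event when the dry period
    between them is shorter than min_gap_min (4 h = 240 min here).
    """
    events = []
    start = None      # index of the first wet interval of the current event
    last_wet = None   # index of the most recent wet interval
    total = 0.0       # accumulated depth of the current event
    for i, depth in enumerate(depths_mm):
        if depth <= 0.0:
            continue
        # Dry time between the previous wet interval and this one.
        if start is not None and (i - last_wet - 1) * step_min >= min_gap_min:
            events.append((start, last_wet, total))
            start, total = None, 0.0
        if start is None:
            start = i
        last_wet = i
        total += depth
    if start is not None:
        events.append((start, last_wet, total))
    return events
```

The event duration t_{d} then follows from the start and end indices, and the inter-event time t_{bd} from the gap between consecutive events.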

The characteristics (P_{c}, P_{(t=15)}, P_{(t=30)}, t_{d}, t_{bd}) of the precipitation events in the years 2008–2016 and their statistical distributions have been determined (empirical distribution functions have been derived). The theoretical distribution with the best fit to the empirical data has been selected from among widely applied two- and three-parameter distributions (Balistrocchi *et al.* 2008; Andrés-Doménech *et al.* 2010; Paola & Martino 2013). They include the beta, chi-square, exponential, gamma, log-normal, Weibull, Fisher–Tippett, GEV, and Pareto distributions. To assess the agreement between the empirical and theoretical distributions, the Kolmogorov–Smirnov and chi-square tests have been used, both at the significance level of *α* = 0.05.
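As an illustration of the goodness-of-fit step, the Kolmogorov–Smirnov statistic for a log-normal candidate can be computed with the standard library alone. This is a sketch under stated assumptions: the function names are hypothetical, the log-normal parameters are taken to be the mean and standard deviation of ln(x), and the critical value uses the usual large-sample approximation, which strictly applies when the parameters are not estimated from the same sample.

```python
import math
from typing import Sequence


def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a log-normal distribution with log-mean mu and log-sd sigma."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def ks_statistic(sample: Sequence[float], mu: float, sigma: float) -> float:
    """Largest distance between the empirical and the theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = lognormal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_critical(n: int, alpha: float = 0.05) -> float:
    """Asymptotic critical value of the one-sample KS statistic."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)
```

A candidate distribution is not rejected at level *α* when `ks_statistic(...)` stays below `ks_critical(n)`; for the 457 events identified here, the approximate critical value is about 0.064.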

### Probabilistic model for the prediction of the annual number of stormwater overflow discharges

In order to determine the number of stormwater overflow discharges, a probabilistic model has been developed. Its calculation scheme is presented in Figure 2. The approach assumes the usage of statistical models for the prediction of overflow discharge as an alternative to catchment's numerical model. The model has been developed based on the results of continuous measurements of precipitation, flow, and filling (1).

Based on the precipitation measurements rainfall events and their characteristics have been determined (2). These data have been used for obtaining statistical models, for the identification of independent variables (x_{i}) having the influence on storm overflow operation (3), and for the determination of parameter values fitting the simulation results to the measurements (3). Additionally, in order to evaluate the prediction ability of statistical models they have been used for calculating the annual number of overflow discharges during the determined rainfall events in each year. The results have been compared to the results of the simulations using the SWMM model (4).

Due to the measuring period being not long enough, Monte Carlo method (MC) has been used for the simulation of rainfall events. Therefore, empirical distributions of independent variables x_{i} have been established (5). Subsequently, theoretical distributions have been fit to the empirical distributions, and Kolmogorov–Smirnov test has been used for assessing the agreement between calculations results and measurements (6). In practice, it is common to assume that the number of rainfall events is the arithmetic mean for the analysed period. It is also included in the developed probabilistic model (Figure 2, case C1).

Based on the determined theoretical distributions and statistical models, a T-times MC simulation of the annual number of rainfall events has been conducted. Afterwards, an N-times prediction of the independent variables (precipitation characteristics) for the generated number of rainfall events has been performed. This accounts for the influence of the stochastic character of the number of rainfall events on the annual number of overflow discharges (Figure 2, case C2). The results (x_{i}) of the MC simulation have served as input to the *p* = f(x_{i}) relationship, and the annual number of overflow discharges among the generated rainfall events has been established (7). On this basis, a cumulative distribution describing the probability of exceeding a given annual number of discharges has been determined.

The results of the annual number of discharges obtained from the probabilistic model have been compared to the results of continuous simulations performed using a calibrated SWMM model of the catchment (9–10) based on rainfall series for the 2008–2016 period.

### SWMM model

In the study, the calibrated SWMM model of the catchment containing the SWTP and DC (Szeląg *et al.* 2016; Szeląg & Bąk 2017) has been used. It was composed of 92 sub-catchments (ranging in area from 0.12 ha to 2.10 ha), 200 manholes, and 72 nodes and conduits (Figure 1). To develop and calibrate the hydraulic model of the DC (see the network in Figure 1), the object design documentation and the measurements of the flow in the Si9 sewer and of the DC filling have been used. The pipelines discharging stormwater from the DC to the ST have been calibrated for two filling (h) ranges, i.e. for h = 0.00–0.42 m and for h > 0.42 m. In the first step of calculation (for h = 0.00–0.42 m), the values of local resistance coefficients at inlets, outlets, and arcs of the pipelines have been defined based on literature data (Idel'chik 1996). These parameters have been calibrated by the trial-and-error method until the calculation results were in satisfactory agreement with the DC filling measurements. In the second step of calculation (for h > 0.42 m), the values of the overflow coefficient have been calibrated in the same way as in the previous case. The calibration approach described is in accordance with other published works (Fach *et al.* 2008). The following measures have been used to validate the conformity of the modelling results to the filling measurements:

- relative prediction error of maximal filling (*δ*),
- correlation coefficient (R),
- mean absolute percentage error (MAPE),

where h_{SCm} and h_{SCc} (m) are the maximum measured and computed filling levels of the DC, h_{m(i)} (m) is the measured value of the DC filling, h_{c(i)} (m) is the computed value of the DC filling, h_{avgm} (m) is the mean measured DC filling, h_{avgc} (m) is the mean computed DC filling, and n is the number of observations.
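The fit measures can be illustrated with a short sketch built on the quantities defined above. It rests on assumptions that are not confirmed by the text: *δ* is taken as the ratio of the computed to the measured maximum filling, R as the Pearson correlation coefficient, and MAPE as the mean absolute percentage error relative to the measured values. All function names are hypothetical.

```python
import math
from typing import Sequence


def max_filling_ratio(h_meas: Sequence[float], h_comp: Sequence[float]) -> float:
    """Assumed form of delta: computed over measured maximum filling."""
    return max(h_comp) / max(h_meas)


def pearson_r(h_meas: Sequence[float], h_comp: Sequence[float]) -> float:
    """Pearson correlation between measured and computed filling."""
    n = len(h_meas)
    mean_m = sum(h_meas) / n   # h_avgm
    mean_c = sum(h_comp) / n   # h_avgc
    cov = sum((m - mean_m) * (c - mean_c) for m, c in zip(h_meas, h_comp))
    var_m = sum((m - mean_m) ** 2 for m in h_meas)
    var_c = sum((c - mean_c) ** 2 for c in h_comp)
    return cov / math.sqrt(var_m * var_c)


def mape(h_meas: Sequence[float], h_comp: Sequence[float]) -> float:
    """Mean absolute percentage error (%); measured values must be non-zero."""
    n = len(h_meas)
    return 100.0 / n * sum(abs(m - c) / m for m, c in zip(h_meas, h_comp))
```

For example, `mape([0.10, 0.20, 0.40], [0.11, 0.18, 0.40])` averages relative errors of 10%, 10% and 0%.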

To calibrate the model, precipitation time series and the measurements of DC filling for the years 2008–2011 have been used. The data from this period include 188 rainfall events, of which 69 involved a stormwater overflow discharge. To validate the model, 10 events have been randomly selected from the period of concern. Among them, there were both events in which overflow occurred and events in which it did not take place.

### Prediction of stormwater overflow using classification models

On account of time, costs and troubles regarding the development of hydrodynamic models, some classification models have been computed to predict the operation of stormwater overflow. Since the parameters estimated in logit models are highly uncertain, which influences negatively the prediction of storm overflow operation (Szeląg & Bąk 2017), the search for other classification models, which ensure high compliance of calculation results with the measurements, has been carried out. The models tested and compared with logit model are probit, Gompertz and LDA (linear discriminant model) models. These models are described by simple regression relations and are implemented in commonly used software packages, which makes them available to a wide range of users. The following equations describe the models tested in the paper to predict the stormwater overflow operation (Harrell 2015; Galán *et al.* 2015):

- linear discriminant model (LDA)

where *p* is the probability of overflow discharge occurrence, *k* is the number of classes to be separated (in the problem of concern, *k* = 2), X is a vector being a linear combination of independent variables calculated using Equation (8), and *x*_{i} are the variables describing the overflow operation: total depth of precipitation *P*_{c} (mm), maximum instantaneous precipitation depth in the event *P*_{t(i)} (mm), duration of rainfall *t*_{d} (min), and length of inter-event time *t*_{bd} (min).
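For orientation, the three binomial models named above are commonly written in the following standard forms. These are textbook expressions and not a transcription of Equations (4)–(8). The Gompertz variant shown is an assumption, chosen because it approximately reproduces the *p* = 0.5 threshold reported later for the fitted coefficients.

```latex
\begin{align}
X &= \beta_0 + \sum_{i} \beta_i x_i \\
p_{\mathrm{logit}} &= \frac{\exp(X)}{1 + \exp(X)} \\
p_{\mathrm{probit}} &= \Phi(X) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{X} \exp\!\left(-\frac{t^{2}}{2}\right) \mathrm{d}t \\
p_{\mathrm{Gompertz}} &= \exp\bigl(-\exp(-X)\bigr)
\end{align}
```

In all three cases *p* increases monotonically with X, so a single independent variable gives a single threshold at which *p* crosses 0.5.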

In the logit, probit and Gompertz models, parameters *β*_{0} and *β*_{i} have been determined with the maximum likelihood estimation method. In this method, the logarithm of the likelihood function is maximised with respect to the model parameters using appropriate numerical methods. In the LDA model, however, these values are determined with the least squares method.
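To make the estimation step concrete, the sketch below fits a one-variable logit model by maximum likelihood using Newton–Raphson iterations. It is an illustration only: the study does not state which numerical method or software was used, and the function name `fit_logit` is hypothetical. The data must not be perfectly separable, otherwise the estimates diverge.

```python
import math
from typing import Sequence, Tuple


def fit_logit(x: Sequence[float], y: Sequence[int],
              iters: int = 50) -> Tuple[float, float]:
    """Maximum likelihood estimates (b0, b1) of p = 1 / (1 + exp(-(b0 + b1*x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # negative Hessian (X'WX)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: solve (X'WX) d = g for the 2x2 system.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1
```

At the optimum the gradient of the log-likelihood vanishes, which is a convenient convergence check.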

The probability limit value of *p* = 0.50 was the criterion for the assessment of overflow discharge occurrence in the logit, probit and Gompertz models. It corresponds to the set of variables describing a particular precipitation event. Among the models under consideration, this corresponds with those of multidimensional dependencies:

As for the probit model, finding a combination of variables for which a discharge occurs is a much more complex task. It requires running iterative computations due to the fact that it is necessary to numerically integrate the dependence expressed by Equation (5). In the linear discriminant model, the overflow discharge takes place when the value of discriminant function Z_{k1} > Z_{k2}. In the models, only statistically significant parameters have been taken into account, with a significance level of 5%. To assess the prediction ability of the models developed, the following performance measures have been used:

| Decisions observed | Decisions predicted: positive | Decisions predicted: negative |
|---|---|---|
| Positive | true positive (TP) | false negative (FN) |
| Negative | false positive (FP) | true negative (TN) |
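The cells of the classification table translate directly into performance measures. The sketch below uses the standard definitions of sensitivity, specificity and overall accuracy; that these correspond exactly to the SENS, SPEC and R^{2}_{z} measures of the study is an assumption, and the function name is hypothetical.

```python
from typing import Dict


def classification_measures(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard measures derived from a 2x2 classification table."""
    return {
        # Share of observed positives that were predicted as positive.
        "sensitivity": tp / (tp + fn),
        # Share of observed negatives that were predicted as negative.
        "specificity": tn / (tn + fp),
        # Share of all events that were classified correctly.
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }
```

For instance, `classification_measures(tp=8, fn=2, fp=1, tn=9)` gives a sensitivity of 0.8, a specificity of 0.9 and an accuracy of 0.85.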

To identify the values of the parameters of the statistical models described in Equations (4)–(8), the measurements collected in the years 2008–2011 have been used. They concern 188 rainfall events and 65 cases of stormwater overflow discharge. The analyses also included 93 measurements of maximal filling of the DC collected in the years 2012–2014, among which 42 cases of stormwater overflow discharge occurred. In total, 107 events with a discharge and 174 events without a discharge have been identified. The developed models have been validated using the measurements of 10 independent rainfall events (five events with a discharge and five without).

## RESULTS AND DISCUSSION

### Calibration, validation of the SWMM model

The calibration and validation results of modelling the DC for 10 rainfall events are shown in Table 2. In the case of calibration, the calculated DC filling fits the measurement data at a satisfactory level (*δ* = 0.90–0.97, R = 0.88–0.96, MAPE = 4.00–7.60%, and the calculated maximal filling values are no more than 11% above the measured ones). This is confirmed by the validation results, where *δ* = 0.89–0.98, R = 0.90–0.96 and MAPE = 3.10–7.30%. The simulations and measurements also show that an overflow discharge happens when the amount of stormwater drained from the catchment exceeds 0.18 m^{3}·s^{−1}.

| Date | P_{c} (mm) | t_{d} (min) | q (dm^{3}·ha^{−1}·s^{−1}) | ID | δ | R | MAPE (%) |
|---|---|---|---|---|---|---|---|
| 08.07.2011 | 8.6 | 60 | 23.89 | VAL | 0.89 | 0.90 | 6.5 |
| 15.09.2010 | 9.2 | 286 | 5.36 | CAL | 0.95 | 0.85 | 9.8 |
| 30.07.2010 | 12.5 | 107 | 19.47 | CAL | 0.97 | 0.91 | 4.7 |
| 07.25.2009 | 10.3 | 960 | 1.79 | VAL | 0.98 | 0.96 | 3.1 |
| 10.12.2009 | 6.0 | 300 | 3.33 | CAL | 0.97 | 0.98 | 4.0 |
| 08.07.2009 | 16.5 | 270 | 10.19 | VAL | 0.94 | 0.86 | 6.7 |
| 03.08.2009 | 4.2 | 26 | 26.93 | CAL | 0.96 | 0.88 | 6.8 |
| 31.05.2010 | 5.4 | 56 | 16.07 | VAL | 0.91 | 0.87 | 7.3 |
| 26.04.2010 | 3.6 | 92 | 6.52 | VAL | 0.92 | 0.90 | 7.1 |
| 12.04.2011 | 4.6 | 85 | 9.02 | CAL | 0.90 | 0.89 | 6.2 |
| 06.04.2010 | 6.3 | 106 | 5.94 | VAL | 0.87 | 0.87 | 7.8 |
| 10.23.2009 | 6.6 | 176 | 3.75 | VAL | 0.82 | 0.84 | 10.7 |
| 07.19.2009 | 9.6 | 693 | 1.39 | VAL | 0.97 | 0.88 | 6.2 |
| 07.25.2011 | 9.8 | 400 | 2.45 | CAL | 0.94 | 0.91 | 4.9 |
| 09.29.2010 | 5.6 | 512 | 1.09 | CAL | 0.93 | 0.90 | 6.8 |
| 10.11.2011 | 6.2 | 380 | 1.63 | CAL | 0.94 | 0.92 | 4.5 |
| 03.18.2010 | 8.1 | 740 | 1.09 | CAL | 0.92 | 0.93 | 4.2 |

VAL – validation, CAL – calibration, ID – event id.
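The q column appears to be the mean rainfall intensity of the event expressed per unit area. Under that assumption (the text does not define q), it follows from P_{c} and t_{d} by a unit conversion: 1 mm of rain over 1 ha equals 10 m^{3}, i.e. 10,000 dm^{3}. The function name below is hypothetical.

```python
def mean_intensity(p_c_mm: float, t_d_min: float) -> float:
    """Mean rainfall intensity in dm3 per hectare per second.

    1 mm over 1 ha = 10 m3 = 10 000 dm3; the duration is converted
    from minutes to seconds.
    """
    return p_c_mm * 10000.0 / (t_d_min * 60.0)
```

With this conversion, the events of 08.07.2011 (8.6 mm in 60 min) and 15.09.2010 (9.2 mm in 286 min) give 23.89 and 5.36 dm^{3}·ha^{−1}·s^{−1}, matching the tabulated values.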

### Development and validation of statistical models for predicting stormwater overflow system operation

Following the developed algorithm of the probabilistic model (Figure 2), once the rainfall events had been identified, statistical models for predicting stormwater overflow system operation have been determined. Using the measurements of the DC filling level and the precipitation data, the parameters of the statistical models (logit, probit, Gompertz and LDA) described by Equations (4)–(8) have been calculated. Model validation results and the values of the estimated parameters are presented in Table 3. The data listed in the table show that the maximum 30 min precipitation depth has the greatest impact on the overflow operation (Szeląg & Bąk 2017).

| Variable | Logit: value (β_{i}) | Logit: SD | Logit: p | Probit: value (β_{i}) | Probit: SD | Probit: p | Gompertz: value (β_{i}) | Gompertz: SD | Gompertz: p |
|---|---|---|---|---|---|---|---|---|---|
| β_{0} | −6.880 | 0.832 | 0.00001 | −3.168 | 0.274 | 0.00001 | −3.420 | 0.382 | 0.00001 |
| P_{t=30} | 3.381 | 0.439 | 0.00001 | 1.503 | 0.140 | 0.00001 | 1.904 | 0.226 | 0.00001 |

Parameters describing the fit of the model:

| Measure | Logit | Probit | Gompertz |
|---|---|---|---|
| SPEC | 97.14% | 97.84% | 96.67% |
| SENS | 87.50% | 84.62% | 88.39% |
| R^{2}_{z} | 93.79% | 92.80% | 93.79% |

SD – standard deviation.

… R^{2}_{z} = 81.6%) indicate that the model can predict the overflow discharge with satisfactory adequacy. Additionally, approximately half of the events during which the discharge did not occur were classified correctly. The discrimination functions have the following form:

where Z_{1,2} are the classification functions that condition the discharge occurrence or its absence, respectively, and P_{(t=30)} is the depth of 30 min precipitation.

The data in Table 3 show that the Gompertz model has correctly classified 99 out of 102 observed cases of overflow discharge (SPEC = 96.67%). Out of 169 events during which stormwater was not discharged through the overflow, 149 have been correctly classified (SENS = 88.39%). The results obtained using the logit and probit models differ only slightly from those produced by the Gompertz model (similar values of SENS, SPEC and R^{2}_{z}). To summarize, the models developed in the study have satisfactory predictive abilities with respect to the probability of occurrence of overflow discharges. The validation has shown that the model correctly identified all five events in which a stormwater discharge occurred and all five events in which it did not. The linear discriminant model has produced far worse results, as it correctly classified only 56 out of 102 events with an overflow discharge (SENS = 55.2%). Among the models examined in the study, the LDA model showed the worst predictive abilities.

Figure 3(a) has been plotted to visualize the curves described by Equations (4)–(6). It can be seen that an overflow discharge occurs (*p*(P_{(t=30)}) > 0.5) when the precipitation depth for the 30 min event exceeds the value of 2.03 mm in the logit model, 2.12 mm in the probit model, and 1.98 mm in the Gompertz model.
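The threshold depths can be cross-checked from the coefficients in Table 3. The sketch below assumes the standard logit and probit forms and the Gompertz form p = exp(−exp(−X)); these forms and all names in the code are assumptions and are not quoted from the paper. Because the tabulated coefficients are rounded, the computed thresholds agree with the reported ones only to within about 0.01 mm.

```python
import math

# (beta_0, beta_1) for X = beta_0 + beta_1 * P_(t=30), from Table 3.
COEFFS = {
    "logit": (-6.880, 3.381),
    "probit": (-3.168, 1.503),
    "gompertz": (-3.420, 1.904),
}


def probability(model: str, p30_mm: float) -> float:
    """Probability of an overflow discharge for a given 30 min depth."""
    b0, b1 = COEFFS[model]
    x = b0 + b1 * p30_mm
    if model == "logit":
        return 1.0 / (1.0 + math.exp(-x))
    if model == "probit":
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    if model == "gompertz":
        return math.exp(-math.exp(-x))
    raise ValueError(f"unknown model: {model}")


def threshold(model: str, lo: float = 0.0, hi: float = 25.0,
              tol: float = 1e-6) -> float:
    """Depth (mm) at which p crosses 0.5, found by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if probability(model, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `threshold("logit")`, `threshold("probit")` and `threshold("gompertz")` returns depths close to 2.03, 2.11 and 1.99 mm, respectively.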

### Rainfall events simulator

Using the above results and proceeding according to the calculation scheme of the probabilistic model presented in Figure 2, an empirical distribution of the maximum 30 min precipitation depth has been determined. In the next step, the theoretical distribution giving the best fit to the measurements has been selected. The resulting theoretical cumulative distribution f(P_{t=30}) has allowed the P_{t=30} value to be simulated using a random number generator (Monte Carlo method). The analyses have shown that the log-normal distribution with parameters *μ* = 0.507 mm and *δ* = 0.905 mm fits the empirical data best (Table 4). The empirical and theoretical distribution functions of the best-fitting distribution are compared in Figure 3(b).
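Drawing synthetic P_{t=30} values from the fitted distribution needs only a random number generator. The sketch below assumes that the two reported parameters are the mean and standard deviation of ln(P_{t=30}), which is the usual parameterisation but is not stated explicitly; the names in the code are hypothetical.

```python
import math
import random
import statistics
from typing import List, Optional

MU = 0.507      # assumed: mean of ln(P_(t=30))
SIGMA = 0.905   # assumed: standard deviation of ln(P_(t=30))


def sample_p30(n: int, seed: Optional[int] = None) -> List[float]:
    """Draw n synthetic maximum 30 min precipitation depths (mm)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(MU, SIGMA) for _ in range(n)]


def theoretical_median() -> float:
    """Median of the log-normal distribution, exp(mu)."""
    return math.exp(MU)
```

A quick sanity check is that `statistics.median(sample_p30(20000, seed=1))` lies close to `theoretical_median()`, roughly 1.66 mm under these assumptions.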

| Distribution | Kolmogorov–Smirnov test: D | Kolmogorov–Smirnov test: p | Chi-square test: D | Chi-square test: p |
|---|---|---|---|---|
| beta | 0.182 | 0.003 | 39.256 | 0.004 |
| chi-square | 0.123 | 0.002 | 27.456 | 0.002 |
| Weibull | 0.074 | 0.032 | 22.478 | 0.028 |
| exponential | 0.116 | 0.022 | 26.124 | 0.008 |
| GEV | 0.412 | 0.003 | 64.214 | 0.001 |
| Pareto | 0.384 | 0.003 | 51.542 | 0.001 |
| Fisher–Tippett | 0.768 | 0.004 | 75.231 | 0.000 |
| log-normal | 0.051 | 0.489 | 15.123 | 0.268 |

### Comparison of the results of simulations of the annual number of overflow discharges

#### Comparison between the annual number of overflow discharges obtained using statistical models and SWMM model

On the basis of measured precipitation series (2008–2016) and the hydrodynamic model developed with the SWMM software, continuous simulations have been carried out and the number of overflow discharges has been determined. It has been found that the number of precipitation events in individual years ranged from 36 to 58, and the number of discharges ranged from 12 to 29 (Table 5). Table 5 also shows the annual number of stormwater overflow discharges (for the identified rainfall events) obtained using statistical models. The performed calculations confirm the analysis results presented in Figure 4. The logit, Gompertz, probit models are characterized by satisfactory ability of predicting stormwater overflow discharges. It is also confirmed in differences between predicted number of overflow discharges and measurements from each year (not greater than 5%). On the other hand, the lowest agreement between calculation results and measurements has been obtained with the LDA model.

| Year | Discharges: SWMM/observed | Discharges: logit | Discharges: probit | Discharges: Gompertz | Discharges: LDA | Number of precipitation events |
|---|---|---|---|---|---|---|
| 2008 | 12/15 | 14 | 15 | 15 | 8 | 43 |
| 2009 | 16/16 | 15 | 15 | 15 | 8 | 47 |
| 2010 | 17/18 | 17 | 17 | 19 | 8 | 47 |
| 2011 | 19/20 | 19 | 21 | 19 | 9 | 51 |
| 2012 | 21/13 | 22 | 20 | 20 | 11 | 36 |
| 2013 | 22/13 | 21 | 23 | 21 | 10 | 41 |
| 2014 | 29/16 | 27 | 29 | 28 | 15 | 44 |
| 2015 | 26 | 26 | 26 | 25 | 14 | 58 |
| 2016 | 22 | 21 | 22 | 23 | 10 | 44 |

#### Comparison of the annual number of overflow discharges obtained using SWMM model and probabilistic model

The developed statistical models for predicting stormwater overflow system operation (Equations (4)–(8)) and the established theoretical distribution describing the variability of the P_{t=30} value during the identified rainfall events have been used for performing the simulation of synthetic precipitation series using the MC method, and for the calculation of the annual number of stormwater overflow discharges for the generated yearly precipitation series (Figure 2). In accordance with the developed scheme, two calculation cases have been considered in this work. In the first case (Figure 2, C1), the average number of rainfall events in the period of concern has been assumed (*N* = const = 46). In the second case (Figure 2, C2), it has been assumed that the annual number of rainfall events varies (*N* = var) from 36 to 58 and is described by a uniform distribution. Results of the simulation of the annual number of discharges are presented in Figure 4. The creation of a cumulative distribution describing the probability of exceeding the annual number of stormwater overflow discharges has been limited due to a relatively short measurement period. Therefore, the results of simulation of the annual number of discharges for years 2008–2016 obtained using the SWMM model are juxtaposed with the probabilistic solution.
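The two calculation cases can be sketched as a small Monte Carlo experiment. This is an illustration under stated assumptions and not the authors' implementation: it uses the logit coefficients from Table 3, treats the log-normal parameters as the mean and standard deviation of ln(P_{t=30}), counts a discharge when *p* > 0.5, and draws *N* in case C2 from a discrete uniform distribution on 36–58. All names are hypothetical.

```python
import math
import random
from typing import List, Optional

B0, B1 = -6.880, 3.381     # logit coefficients (Table 3)
MU, SIGMA = 0.507, 0.905   # assumed log-mean and log-sd of P_(t=30)


def is_discharge(p30_mm: float) -> bool:
    """Overflow is predicted when the logit probability exceeds 0.5."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * p30_mm))) > 0.5


def simulate(years: int, variable_n: bool,
             seed: Optional[int] = None) -> List[int]:
    """Sorted annual discharge counts for C1 (N = 46) or C2 (N = 36..58)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(years):
        n_events = rng.randint(36, 58) if variable_n else 46
        counts.append(sum(is_discharge(rng.lognormvariate(MU, SIGMA))
                          for _ in range(n_events)))
    return sorted(counts)


def quantile(sorted_counts: List[int], q: float) -> int:
    """Empirical quantile of the simulated annual counts."""
    idx = min(len(sorted_counts) - 1, int(q * len(sorted_counts)))
    return sorted_counts[idx]
```

Comparing `quantile(simulate(2000, False, seed=1), q)` with `quantile(simulate(2000, True, seed=1), q)` for q = 0.05, 0.50 and 0.95 shows how a random number of rainfall events widens the distribution of annual discharges. The absolute counts depend on the assumptions listed above and need not match Figure 4.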

Figure 4(a) indicates that the distribution functions obtained on the basis of the logit, probit and Gompertz models do not differ substantially, whereas for the LDA model the number of overflow discharges is much lower. The determined CDF curves show that the smallest number of overflow discharges has been produced using the LDA model, and the greatest using the Gompertz model. The simulations have also demonstrated that the expected value, corresponding to *p* = 0.50, produced by the logit, probit and Gompertz models ranges from 22 to 24, whereas for the linear discriminant model it is merely 14. Finally, the annual numbers of discharges obtained using the SWMM model fall within the range of probabilistic solutions produced by the binomial models.

For the distribution function obtained under the assumption of the average number of events in a year (*N* = const.), the analyses have shown that the expected number of stormwater overflow discharges is 24. However, if the *N* value is random, the most likely number of discharges equals 23 (Figure 4(b)). For the assumption that *N* = const., the *p* = 0.05 percentile value of the determined distribution function is 18, whereas when the event number is random, N_{p=0.05} = 13. In the case of *p* = 0.95, the reverse situation is observed: if the average number of precipitation events is assumed in the analysis, 30 overflow discharges a year are generated, whereas if the stochastic character of the number of precipitation events is taken into account, the number increases to 34.
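One way to read the wider spread obtained for a random *N* is through the law of total variance. The relation below is a sketch under simplifying assumptions that are not stated in the paper: every rainfall event produces a discharge independently with the same marginal probability q (a symbol introduced here), and *N* is independent of the event outcomes. Under these assumptions, the annual number of discharges D satisfies:

$$
\mathrm{E}[D] = q\,\mathrm{E}[N], \qquad
\mathrm{Var}(D) = q(1-q)\,\mathrm{E}[N] + q^{2}\,\mathrm{Var}(N).
$$

For *N* = const the second term vanishes, since Var(N) = 0. If *N* is assumed to take each integer value from 36 to 58 with equal probability, then Var(N) = (23² − 1)/12 = 44, so the second term is positive and the distribution of D widens. This is consistent with the lower 0.05 percentile (13 instead of 18) and the increase to 34 discharges reported for the random event number.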

## CONCLUSIONS

Based on the computations, it has been found that the logistic regression, Gompertz and probit models predict stormwater overflow discharges satisfactorily. The linear discriminant model has shown the worst classification abilities. As a result, the logit, probit and Gompertz models can be successfully applied to practical problems concerning the operation of the drainage system of the urban catchment discussed in the paper. They offer the advantage of removing the necessity of developing and calibrating a complex hydrodynamic model, which requires multi-year measurements of precipitation and of water flow in the drainage system. The annual number of overflow discharges obtained with the calibrated SWMM model is within the range of solutions obtained using the probabilistic model, which is a combination of the Monte Carlo method (precipitation generator) and a classification model. The model proposed in the paper for predicting the number of stormwater overflow discharges eliminates the need to develop hydrodynamic models. Thus, it significantly reduces the costs of conducting multi-annual measurements of stormwater flow and the DC filling in the area of stormwater overflow. The calculation results presented in the paper concern an isolated catchment, but the modelling methodology can also be applied in other cases, provided that rainfall measurements and data concerning the stormwater network are accessible. Owing to the model's simplicity and the ease of estimating its parameters, it is advisable to prepare one for an entire existing stormwater network (for stormwater overflows) and to assess this network's performance. Then, if the maximum permissible number of discharges is found to be exceeded, network management should take actions to limit this number, for example, flow suppression and stormwater retention.

Among the parameters describing variation in rainfall during a precipitation event that were taken into account in the study, the maximum 30 min precipitation depth exerts the most significant impact on the overflow operation. Additionally, the predicted number of overflow discharges is influenced by the statistical distribution of the number of precipitation events in a yearly cycle. For *p* < 0.5, the yearly number of discharges computed with the assumption of a constant annual number of precipitation events is greater than the one obtained with the assumption that this number follows a uniform distribution. For *p* > 0.5, however, the inverse relationship has been revealed. In practical applications, when designing underground facilities located in the stormwater drainage system, this fact should be taken into account and analysed so that an optimal solution can be found.