Data assimilation in hydrodynamic models for system-wide soft sensing and sensor validation for urban drainage tunnels

Tunnels are increasingly used worldwide to expand the capacity of urban drainage systems, but they are difficult to monitor with sensors alone. This study enables soft sensing of urban drainage tunnels by assimilating water level observations into an ensemble of hydrodynamic models. Ensemble-based data assimilation is suitable for non-linear models and provides useful uncertainty estimates. To limit the computational cost, our proposed scheme restricts the assimilation and ensemble implementation to the tunnel and represents the surrounding drainage system deterministically. We applied the scheme to a combined sewer overflow tunnel in Copenhagen, Denmark, with two sensors 3.4 km apart. The downstream observations were assimilated, while those upstream were used for validation. The scheme was tuned using a high-intensity event and validated with a low-intensity one. In a third event, the scheme was able to provide soft sensing as well as identify errors in the upstream sensor with high confidence.

The depth and scale of urban drainage tunnels also allow urban drainage systems (UDSs) to minimize disruption at the surface level and achieve attractive cost-benefit ratios.
To take full advantage of their capacity, tunnels need to be optimally controlled. A control system includes monitoring equipment to measure the current state of the system, actuators to change the dynamics of the system and a control strategy aimed at reaching defined objectives.
It is essential to have as much knowledge of the state of the system as possible to make the right control decisions. DUDMs rely on a large number of parameters, related to geometry, materials, etc., to simulate the complex and non-linear behaviour of UDSs. Many of these parameters can be derived from asset information, e.g. the diameter and length of the pipes; others need to be calibrated. Deletic et al. () identify the calibration process as a key source of uncertainty in urban drainage modelling, together with the model input, e.g. rainfall data and dry weather inflows, and the model structure. Moreover, the calibration is sensitive to the quantity and quality of available data. While simpler, deterministic updating methods can handle gradually changing hydraulic conditions, they cannot produce uncertainty estimates and cannot easily assimilate ambiguous information from multiple sensors. For these reasons, ensemble-based methods are to be preferred, even more so in large drainage tunnels where sensors are difficult to maintain due to their location deep underground in a dangerous and harsh environment. If a detailed hydrodynamic model, correctly updated in real time, is capable of producing reliable, probabilistic estimates throughout the entire system, it can serve as a soft (software) sensing tool to complement the hard (hardware) physical sensors. Independent observations can then be automatically validated against the model estimate, given our degree of confidence in the sensor and the updated model.
The aim of this study is to enable soft sensing and sensor validation of urban drainage tunnels by developing and validating a DA scheme tailor-made for tunnel models that can efficiently assimilate observations into an ensemble of models. Our proposed scheme restricts the assimilation only to a subsystem of the UDS, which includes the tunnel and its connected structures. By doing so, we significantly decrease the computational cost of the assimilation, while retaining the spatial resolution and level of detail of the fully distributed model.

Assimilation scheme
The application of ensemble-based DA to DUDMs requires running in parallel several instances of the same model. To limit the computational overhead, we develop a scheme that isolates an 'inner' model of the tunnel from the 'outer' DUDM, where the output from the outer model is used as the boundary condition for the inner model, and it is only this model that is updated. The inner model retains the same level of detail as the outer model and includes the tunnel itself and its inlet and outlet structures (e.g. overflows, dropshafts and pumping stations). Setting the inner model boundaries is a critical step in the implementation of the scheme. It is a case-specific procedure but should satisfy some general criteria: (i) the inner model should include the smallest number of structures besides the tunnel itself, to minimize the computational cost; (ii) the inner model should include all locations with sensors whose data is relevant for the assimilation; (iii) at the tunnel inlets, the boundaries should be set at locations where no backwater effects from the tunnel are expected, e.g. overflows; (iv) the outlet should include only the structures necessary to simulate the emptying of the tunnel, e.g. pumping stations and safety overflows.
To optimize for the first and second criteria, the boundaries are ideally set at locations where the sensors are placed. In principle, the boundary conditions could be either water levels or discharges. However, level gauges are preferred because they are usually more common and accurate than flow meters.
The scheme is implemented in three stages, as seen in Figure 1. In an offline setting, the stages are sequential, so that one stage is completed for the entire simulation period before moving to the next. In an online scenario, the sequence should be repeated at each time step. In the first stage, a single instance of the drainage model receives rainfall observations as input, and the simulated states at the tunnel boundaries are read from its output. In the second stage, the boundary conditions are perturbed according to a user-defined perturbation model, to account for the uncertainty of the drainage model. In the final stage, each set of perturbed boundary conditions is linked to an identical copy of the tunnel model. Therefore, each member of the tunnel model ensemble returns a different prediction of the tunnel states, representing the uncertainty of the model. Given the spread of the predicted tunnel states and the uncertainty assigned to the corresponding observations, the DA algorithm computes the updated states. The process is repeated at each time step and the updated states serve as the initial condition for the prediction at the following time step. If the observations are missing at a given point in time, the assimilation step is bypassed, and the predicted tunnel states are used directly as the initial condition for the following prediction.
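The three stages can be sketched as a single loop. Everything below is a toy stand-in (the real outer and inner models are MIKE URBAN hydrodynamic models, and the study uses a full ensemble Kalman Filter rather than this scalar update); it only illustrates the forecast-perturb-update cycle:

```python
import numpy as np

rng = np.random.default_rng(42)

def outer_model(rain):
    # toy stand-in for the deterministic city-scale model: converts
    # rainfall into simulated water levels at the tunnel boundary
    return np.convolve(rain, [0.5, 0.3, 0.2])[: len(rain)]

def tunnel_step(h, inflow):
    # toy stand-in for one hydrodynamic time step of the inner tunnel model
    return 0.9 * h + 0.1 * inflow

rain = rng.uniform(0.0, 5.0, size=60)
boundary = outer_model(rain)                # stage 1: single outer-model run

n_ens = 50
factors = rng.normal(1.0, 0.2, size=n_ens)  # stage 2: per-member perturbation
h = np.zeros(n_ens)                         # ensemble of tunnel states
true_h, sigma_obs = 0.0, 0.05               # synthetic truth and obs noise

for k in range(len(rain)):
    true_h = tunnel_step(true_h, boundary[k])     # synthetic reality
    h = tunnel_step(h, factors * boundary[k])     # stage 3: ensemble forecast
    if k % 5 == 0:                                # observation available
        d = true_h + rng.normal(0.0, sigma_obs)
        p_f = h.var(ddof=1)                       # forecast (scalar) variance
        gain = p_f / (p_f + sigma_obs**2)         # scalar Kalman gain
        # pull each member towards a perturbed copy of the observation;
        # the updated states become the initial condition for the next step
        h = h + gain * (d + rng.normal(0.0, sigma_obs, n_ens) - h)
```

When no observation is available (here, four steps out of five), the forecast states simply carry over, mirroring the bypass described above.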

1D Hydrodynamic models
DUDMs in state-of-the-art software such as Mike Urban (DHI ) include a lumped-conceptual surface module, that converts precipitation data into stormwater inflows, and a hydrodynamic module, that simulates the storage and conveyance of water in the system.

Ensemble-based data assimilation
The Kalman Filter is a sequential assimilation method: the model propagates its internal variables, or state vector x, in time, and the state is updated whenever measurements d are available. At time k, the state vector x_k is given by the model M as:

$$x_k = M(x_{k-1}, u_k) + e_k$$

where u is the model input and e is the system noise, which represents the uncertainty of the model forecast. For hydrodynamic models, the vector x usually includes both water levels and discharges, i.e. all the h and Q variables shown in Figure 2.
If one or more observations are available, the analysis or updated states x^a are computed as a weighted average of the model forecast or background states x^f and the observations using the analysis equation:

$$x^a = x^f + K (d - H x^f)$$

where H is a matrix mapping the model states to the observations and K is the Kalman gain matrix. The elements of K act as weights between the vectors x^f and d: the higher the value of the gain, the closer the state is updated towards the observation. The Kalman gain is calculated as:

$$K = P^f H^T (H P^f H^T + R)^{-1}$$

where P^f is the forecast error covariance matrix and R is the observation error covariance matrix. If n is the size of the state vector, then P^f has dimensions n × n. The Kalman Filter requires P^f to be propagated in time analytically, which is too costly an operation for large models and is only valid for linear models.
To overcome these two limitations, Evensen () introduced the Ensemble Kalman Filter (EnKF), which uses an ensemble of model instances to propagate the error statistics in time. In the EnKF, the forecast error covariance matrix is replaced by the ensemble covariance, which is calculated as:

$$P^f \approx \frac{1}{N_e - 1} (X - \bar{X})(X - \bar{X})^T$$

where N_e is the ensemble size, X holds the ensemble states and \bar{X} is the ensemble mean. The analysis then consists of (i) updating the ensemble mean with the analysis equation, (ii) updating the ensemble anomalies X − \bar{X}, and (iii) calculating the analysed ensemble by adding the updated ensemble mean to the updated anomalies.
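For illustration, a stochastic EnKF analysis step with perturbed observations can be written in a few lines of NumPy. This is a generic textbook formulation, not necessarily the exact filter variant used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_update(X, d, H, R):
    # Stochastic EnKF analysis step (perturbed observations).
    # X : (n, N) ensemble of state vectors, one member per column
    # d : (m,)   observation vector
    # H : (m, n) observation operator
    # R : (m, m) observation error covariance
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)            # ensemble anomalies
    Pf = A @ A.T / (N - 1)                           # ensemble covariance
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)   # Kalman gain
    # each member is updated against its own perturbed observation copy
    D = d[:, None] + rng.multivariate_normal(np.zeros(len(d)), R, size=N).T
    return X + K @ (D - H @ X)

# tiny example: 4 states (two levels h, two discharges Q), 30 members,
# only the first water level is observed
X = rng.normal(1.0, 0.3, size=(4, 30))
H = np.array([[1.0, 0.0, 0.0, 0.0]])
R = np.array([[0.01]])
Xa = enkf_update(X, np.array([1.5]), H, R)
```

Because the gain is built from cross-covariances in `Pf`, the update also shifts the unobserved states, which is what lets a single level gauge correct the whole tunnel state.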

Prediction metrics
The user-defined perturbation model and the chosen DA algorithm were evaluated with a set of prediction metrics (Table 1). Some of these are designed for deterministic forecasts, so they are applied to the ensemble mean.
The mean absolute error (MAE) and the root mean square error (RMSE) are commonly used measures of prediction accuracy:

$$\mathrm{MAE} = \frac{1}{N} \sum_{k=1}^{N} \left| \bar{X}_k - d_k \right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left( \bar{X}_k - d_k \right)^2}$$

where \bar{X}_k is the mean across ensemble members at the kth time step, d_k is the corresponding observation and N is the total number of time steps in the observation period. Both metrics retain the unit of the prediction, but the RMSE penalizes large errors, which in urban drainage are known to occur especially for peak flows and water levels, as these can be significantly affected by timing errors due to uncertain rain inputs.
The Nash-Sutcliffe efficiency (NSE) is calculated as:

$$\mathrm{NSE} = 1 - \frac{\sum_{k=1}^{N} \left( \bar{X}_k - d_k \right)^2}{\sum_{k=1}^{N} \left( d_k - \bar{d} \right)^2}$$

where \bar{d} is the mean value of the observations over the observation period. NSE ranges between 1 and −∞, with 1 indicating a perfect fit. Negative values indicate that the ensemble mean is a worse predictor than the mean of the observations. NSE is a normalized metric, so it is useful for comparing predictions across different events.
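For reference, the three deterministic metrics in NumPy, applied to made-up series (the helper names `mae`, `rmse` and `nse` are ours):

```python
import numpy as np

def mae(x, d):
    # mean absolute error
    return np.mean(np.abs(x - d))

def rmse(x, d):
    # root mean square error: penalizes large (e.g. peak-timing) errors
    return np.sqrt(np.mean((x - d) ** 2))

def nse(x, d):
    # Nash-Sutcliffe efficiency: 1 is a perfect fit, values below 0 are
    # worse than predicting the mean of the observations
    return 1.0 - np.sum((x - d) ** 2) / np.sum((d - np.mean(d)) ** 2)

d = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0])   # observed water levels
x = np.array([0.1, 0.9, 2.2, 2.8, 2.1, 1.0])   # ensemble-mean prediction
print(mae(x, d), rmse(x, d), nse(x, d))        # nse(x, d) = 0.98
```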
The coverage bias (CB) can be used to quantify the reliability of a probabilistic prediction. For a given quantile level β, the prediction interval is defined as:

$$I_k(\beta) = \left[ X_k^l,\; X_k^u \right]$$

where X_k^l and X_k^u are, respectively, the lower and upper limits of the ensemble prediction at levels β/2 and 1 − β/2.
The observed coverage ν(β) is the fraction of observations falling within I_k(β) during the observation period and is expected to match the nominal coverage 1 − β. CB for a given quantile level β is defined as the discrepancy between the nominal and the observed coverage:

$$\mathrm{CB}(\beta) = (1 - \beta) - \nu(\beta)$$

and ranges between −1 and 1 (Thordarson et al. ). If the bias is negative, the observed coverage is larger than the nominal one and the prediction overestimates the uncertainty; the opposite is true if the bias is positive. The ideal reliability therefore corresponds to CB = 0. The average band width (ABW) for a given quantile level β is calculated as:

$$\mathrm{ABW}(\beta) = \frac{1}{N} \sum_{k=1}^{N} \left( X_k^u - X_k^l \right)$$

and is an indicator of the prediction sharpness, although not sufficient to characterize it (Gneiting et al. ). In this study, it is computed to assess how the perturbation and assimilation affect the spread of the ensemble prediction.
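Both interval metrics can be sketched as below, assuming an ensemble array of shape (members, time steps); the toy ensemble is deliberately well calibrated, so its CB should be near zero:

```python
import numpy as np

def interval(ens, beta):
    # lower and upper prediction limits at levels beta/2 and 1 - beta/2
    lo = np.quantile(ens, beta / 2, axis=0)
    hi = np.quantile(ens, 1 - beta / 2, axis=0)
    return lo, hi

def coverage_bias(ens, d, beta):
    # CB(beta): nominal coverage (1 - beta) minus observed coverage
    lo, hi = interval(ens, beta)
    return (1 - beta) - np.mean((d >= lo) & (d <= hi))

def average_band_width(ens, beta):
    # ABW(beta): mean width of the prediction interval (sharpness)
    lo, hi = interval(ens, beta)
    return np.mean(hi - lo)

# toy case: observations drawn from the same distribution as the members
rng = np.random.default_rng(1)
ens = rng.normal(0.0, 1.0, size=(500, 400))
d = rng.normal(0.0, 1.0, size=400)
```

For a standard normal ensemble and β = 0.1, the 90% band width is about 2 × 1.645 ≈ 3.3, which the `average_band_width` estimate should approach as the ensemble grows.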
The continuous ranked probability score (CRPS) for a given kth observation-prediction pair is calculated as:

$$\mathrm{CRPS}_k = \int_{-\infty}^{\infty} \left( F_k(x) - F_{d_k}(x) \right)^2 \mathrm{d}x$$

where x is the predicted variable, and F_k(x) and F_{d_k}(x) are, respectively, the cumulative distribution functions of the kth probabilistic prediction and of the corresponding observation (Hersbach ). When the observation is a unique value, its distribution function is given by:

$$F_{d_k}(x) = \begin{cases} 0, & x < d_k \\ 1, & x \geq d_k \end{cases}$$

which is also known as the Heaviside function. To remove the time dependency, the continuous ranked probability score is averaged over the observation period in this study (simply referred to as CRPS). For a deterministic forecast, the CRPS reduces to the mean absolute error. Finally, the residual between an observation and the corresponding ensemble mean can be standardized by the combined uncertainty of the observation and the ensemble, whose variance is:

$$\sigma_{\mathrm{obs,ens}}^2 = \sigma_{\mathrm{obs}}^2 + \sigma_{\mathrm{ens}}^2$$

so that the standardized residual is:

$$r_{\mathrm{std},k} = \frac{d_k - \bar{X}_k}{\sigma_{\mathrm{obs,ens}}}$$

If the model and sensor behave as assumed, the standardized residuals should be normally distributed with zero mean and unit variance, and about 95% of the values should fall within the [−2, 2] interval. Therefore, the standardized residuals can be used to test the validity of a sensor by checking whether the mean of r_std over a given time period is statistically different from zero and whether its value exceeds the confidence interval [−2, 2] for prolonged periods in time. In the current article, we only use r_std for visual identification of suspicious sensor data. However, the validation process could be automated, and in a real-time setting the user could be alerted when the sensor data does not pass the quality check.
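The CRPS of an ensemble against a point observation can be estimated with the identity CRPS = E|X − d| − ½ E|X − X′|, which is equivalent to the squared-CDF integral above; whether the study used this particular estimator is our assumption:

```python
import numpy as np

def crps_ensemble(ens, d):
    # CRPS of one ensemble (1-D array of members) against a scalar
    # observation, via CRPS = E|X - d| - 0.5 * E|X - X'|
    term1 = np.mean(np.abs(ens - d))
    term2 = 0.5 * np.mean(np.abs(ens[:, None] - ens[None, :]))
    return term1 - term2

def mean_crps(ens, d):
    # time-averaged score; ens has shape (members, time steps)
    return np.mean([crps_ensemble(ens[:, k], d[k]) for k in range(len(d))])
```

For a degenerate (deterministic) ensemble the score reduces to the absolute error, e.g. `crps_ensemble(np.full(10, 3.0), 2.0)` returns 1.0.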

Case study
The Damhus combined sewer overflow tunnel in Copenhagen, Denmark, was used as a case study. The tunnel is integrated within a large combined sewer system that drains to the local wastewater treatment plant. We were provided with an up-to-date and highly detailed MIKE URBAN model of the urban drainage system (Figure 3(a)).
This DUDM was forced with rainfall intensity series from the rain gauges described in the Observations section.

Observations
Since the tunnel receives overflows from the combined sewer, it is only employed during high-intensity rainfall events. Once the tunnel content has been pumped back to the sewer, sediments are removed by flushing. An 'event' here refers to a cycle of partial/complete filling, emptying and flushing. Eight events occurred in the period between August and December 2018. Among them, three characteristic events (A, B and C) were selected to cover various seasons (respectively, late summer, autumn and early winter) and a wide range of tunnel filling degrees (Figure 4).
The sensor data from these three independent events were used in this study.
Rainfall intensity observations from the 10 gauges linked to the drainage model were obtained at 1-min resolution from a Danish national archive.

Implementation
The inner model was set to consist of the tunnel and the nearest connecting structures only (the dark-blue and red parts in Figure 3(b)-3(d)), with a total of 28 links. The inner model can simulate in 10 s a 2-day event that takes the model of the full system 2 h, corresponding to a ∼700-fold reduction in computational time. As a result, even with an ensemble size of 100 and assimilation of observations once per minute, the main computational effort of the DA scheme in this implementation was running the outer model.
The construction of the perturbation model is crucial for the success of the assimilation scheme and is case-specific.
For our case study, the uncertainty of the simulated water levels at the boundaries of the tunnel model can be formulated in terms of time delay and peak value. To capture these two features, the boundary conditions were perturbed with a time displacement uniformly distributed within the interval [−τ_max, τ_max] and a normally distributed multiplicative perturbation with zero mean and standard deviation σ_p. Both parameters were assumed constant in time but varied across locations and ensemble members. This is a rather simplistic approach that ignores the spatial and temporal correlation of the model errors at the tunnel boundaries. However, we find it sufficient for the current study, and our choice of perturbation scheme facilitates the tuning process and allows us to regulate the time delay and the peak values of the model boundaries independently.
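This perturbation can be sketched as below, reading the zero-mean multiplicative term as a relative perturbation ε so that each member scales the level by 1 + ε; the function and its arguments are illustrative, not the study's code:

```python
import numpy as np

rng = np.random.default_rng(7)

def perturb_boundary(series, dt, tau_max, sigma_p, n_ens):
    # series  : boundary water levels sampled at a fixed time step dt
    # Each member gets a time shift drawn from U[-tau_max, tau_max] and a
    # relative amplitude perturbation drawn from N(0, sigma_p); both are
    # held constant in time but differ across members.
    t = np.arange(len(series)) * dt
    shifts = rng.uniform(-tau_max, tau_max, size=n_ens)
    factors = 1.0 + rng.normal(0.0, sigma_p, size=n_ens)
    ens = np.empty((n_ens, len(series)))
    for i in range(n_ens):
        # apply the shift by sampling the series at displaced time stamps
        ens[i] = factors[i] * np.interp(t - shifts[i], t, series)
    return ens

levels = np.sin(np.linspace(0.0, np.pi, 120)) ** 2    # toy boundary series
ens = perturb_boundary(levels, dt=60.0, tau_max=300.0, sigma_p=0.1, n_ens=100)
```

Separating the two parameters (τ_max for timing, σ_p for amplitude) is what makes independent tuning of peak timing and peak value possible.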

Tuning (Event A)
The scheme parameters were tuned individually by trial and error for Event A. In each trial, the error of the updated ensemble at the downstream dropshaft was quantified using the selected prediction metrics (Table 1), and the best-performing parameter set was assumed to be the global optimum. Figure 5 (left column) shows the observed, perturbed and updated water levels at the downstream and upstream locations (upper and middle rows).

Scheme verification (Events B and C)
The tuned set of parameters was tested with two independent events (B and C) of lower intensity.

Sensor validation
Once the quality of the updated predictions has been verified, they can be used to validate independent observations that are not assimilated in the model ensemble. This was the case for the observed water level at the upstream dropshaft.
To quantify the discrepancy between the observed and expected value at the upstream location, we computed the standardized residual using Equation (15). Similar to Event B, Event C was driven by a low-intensity precipitation event of long duration, resulting in a partial filling of the tunnel, as shown by the downstream observation. This agrees with the ensemble prediction upstream, both before and after the assimilation. The observation from the upstream sensor, however, suggests that a large volume of water entered the tunnel and filled it up completely. While r_std mostly stayed within the confidence interval [−2, 2] during Events A and B, it rose to above 50 for several hours during Event C, clearly indicating that the sensor was malfunctioning during this period (Figure 5, bottom row).
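The automated check suggested earlier could look like the following sketch; the run-length threshold `min_run` and the synthetic fault are our own illustrative choices:

```python
import numpy as np

def standardized_residuals(ens_mean, d, sigma_obs, sigma_ens):
    # residual between observation and ensemble mean, scaled by the
    # combined standard deviation of observation and ensemble errors
    return (d - ens_mean) / np.sqrt(sigma_obs**2 + sigma_ens**2)

def flag_sensor(r_std, limit=2.0, min_run=30):
    # flag time steps belonging to runs of at least `min_run` consecutive
    # exceedances of the [-limit, limit] confidence interval
    bad = np.abs(r_std) > limit
    flags = np.zeros(len(bad), dtype=bool)
    start = 0
    for k in range(len(bad) + 1):
        if k == len(bad) or not bad[k]:
            # a run of exceedances ends here; keep it only if long enough
            if k - start >= min_run:
                flags[start:k] = True
            start = k + 1
    return flags

# a healthy sensor except for a sustained fault between steps 200 and 400
r = np.random.default_rng(3).normal(0.0, 1.0, 600)
r[200:400] += 50.0                 # e.g. a false echo in the level gauge
suspect = flag_sensor(r, limit=2.0, min_run=30)
```

Requiring a minimum run length prevents alarms on the isolated exceedances that a well-behaved sensor produces about 5% of the time.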

CONCLUSIONS
We developed a DA scheme tailor-made for system-wide soft sensing and sensor validation for urban drainage tunnels. In all three events tested, the updated water levels at the downstream dropshaft scored better against observations than the ensemble predictions without assimilation. The update was also correctly propagated along the tunnel, as confirmed by an independent observation at the most upstream dropshaft. Thus, the model generated an accurate, system-wide estimate of the hydraulic conditions. In the third event, the large discrepancy between updated and observed states enabled the automatic detection of a false echo in the upstream sensor. In this case, the updated ensemble was capable of reconstructing the likely water level upstream, acting as a soft sensing tool.
Further research is needed to promote the proposed scheme into a reliable, operational tool, by for example introducing an automatic assessment of both observation and model error.