ABSTRACT
Real-time and model-predictive control promises to make urban drainage systems (UDS) adaptive, coordinated, and dynamically optimal. Though early implementations are promising, existing control algorithms have drawbacks in computational expense, trust, system-level coordination, and labor cost. Linear feedback control has distinct advantages in computational expense, interpretation, and coordination. However, current methods for building linear feedback controllers require calibrated software models. Here we present an automated method for generating tunable linear feedback controllers that require only system response data. The controller design consists of three main steps: (1) estimating the network connectivity using tools for causal inference, (2) identifying a linear, time-invariant (LTI) dynamical system which approximates the network, and (3) designing and tuning a feedback controller based on the LTI urban drainage system approximation. The flooding safety, erosion prevention, and water treatment performance of the method are evaluated across 190 design storms on a separated sewer model. Strong results suggest that the system knowledge required for generating effective, safe, and tunable controllers for UDS is surprisingly basic. This method allows near-turnkey synthesis of controllers solely from sensor data or reduction of process-based models.
HIGHLIGHTS
Connectivity is automatically identified among water level sensors and valves solely using response data.
Inferred connectivity and response data are used to train linear feedback controllers.
Controllers built purely from data prevent flooding and greatly improve water treatment.
INTRODUCTION
Addressing the joint challenges of urbanization and climate change using the classic approach of passive urban drainage assets is expensive because they are not adaptive, not coordinated (Trotta et al. 1977; Xu et al. 2021), and almost never optimal. Static gray and green assets must be rebuilt to adapt to changes in land use or weather patterns. Further, they are typically designed to meet local or site-scale goals (Pratt 2016). Because broader watershed impacts are not incorporated, this site-level design can actually lead to worse system-level outcomes (Kerkez et al. 2016). Many urban drainage systems (UDS) are designed for idealized or statistically derived events such as the 100-year recurrence interval storm (Pratt 2016; Balbastre-Soldevila et al. 2019; National Weather Service 2023). These designs must also compromise between competing objectives (e.g., retention vs. detention). Since many UDS are statically designed for flood risk reduction in extreme storms, their water treatment performance and flow regulation are always suboptimal (Lund et al. 2018; Brasil et al. 2021; Xu et al. 2021).
Smart UDS enabled through sensing and real-time control promise to make UDS adaptive, coordinated, and dynamically optimal (Bartos et al. 2018; Lund et al. 2018; Xu et al. 2021). Urban drainage assets adapt and coordinate when control algorithms use distributed sensors (e.g., depth, flow) and controllers (e.g., pumps, valves) to dynamically route flows across the watershed. This allows entire systems to be optimally reconfigured for flood prevention and water treatment for each storm without any new construction (Kerkez et al. 2016; Eggimann et al. 2017; Van Der Werf et al. 2022).
Though early implementations of smart UDS are promising, existing control algorithms have drawbacks in computational expense, trust, system-level coordination, and labor cost. Machine learning solutions have been leveraged for real-time control of drainage systems (Kwon & Kim 2021), reinforcement learning being a particularly popular method (Mullapudi et al. 2020; Harris & Liu 2021; Tian et al. 2022; Zhang et al. 2023). However, the computational expense and lack of interpretability of these methods may limit adoption. Heuristic solutions such as equal-filling-degree (Troutman et al. 2020), market-based algorithms (Montestruque & Lemmon 2015), and others (Van der Werf et al. 2023a, 2023b) show improvement over uncontrolled performance and have numerous real-world implementations (Rimer et al. 2023). However, benchmarking studies show heuristic solutions may have lower performance ceilings than controllers which incorporate dynamics and network connectivity (Lund et al. 2018). Rule- and decision tree-based control logics (Shishegar et al. 2019; Shen et al. 2020; Jean et al. 2022) are common in industry, but rely on existing calibrated software models and much labor for their development.
Feedback control (Hespanha 2009) has distinct advantages in computational expense, interpretation, and coordination. As the algorithms depend entirely on linear algebra, computations are cheap and model structure is simple. These models have been used for the analysis and control of many drainage systems, often extending beyond current state feedback to model predictive control (MPC) (Gaborit et al. 2013; Courdent et al. 2015; Ly et al. 2019; Lund et al. 2020; Van der Werf et al. 2023c). As the model is composed of differential equations, model interpretability and explicability are often good. Techniques such as Linear-Quadratic Regulator (LQR) and Bryson’s Rule also allow the definition of intuitive penalties, easing the tuning process (Williams & Lawrence 2020). However, current methods for applying linear feedback control (Wong & Kerkez 2018) often require calibrated software models or significant manual surrogate model development (Loganathan & Bhattacharya 1990), increasing barriers to transferability to new locations. The contribution of this study is an automated method for generating tunable linear feedback controllers that requires only system response data.
For high sensor density or coarse timesteps, conventional methods for system identification (Ho & Kalman 1966; Juang & Pappa 1985; Viberg 1995; Mauroy et al. 2020) could create a linear, time-invariant (LTI) system approximation (Equation (4)) from response data. However, sparse sensors and the desire for frequent control (5–10 min) imply limited observability and delayed causation that prevent the application of existing system identification tools. While it is reasonable to assume that at least the connectivity of a UDS is always known before installing control assets, incomplete installation records and deterioration of pipes in older cities may lead to water routing in different ways than expected (Malek Mohammadi et al. 2019; Delnaz et al. 2023).
METHODS
The controller design consists of three main steps: (1) inferring the network connectivity, (2) identifying a linear, time-invariant (LTI) dynamical system which approximates the network, and (3) designing and tuning a feedback controller based on that LTI system approximation. To contextualize these methods, we introduce the case study first and use it as an illustrative example throughout. Lastly, we describe the experiments used to evaluate the performance and safety of the controllers.
Case study
The controller is implemented in a feedback loop with a high-fidelity software model. We use the pystorms (Rimer et al. 2023) benchmarking library, which wraps the US Environmental Protection Agency's Stormwater Management Model (EPA-SWMM). EPA-SWMM is a popular hydrologic-hydraulic software model used for the planning and design of UDS (Mullapudi et al. 2020; Zhang et al. 2023; Lund et al. 2018). Through an iterative numerical scheme, SWMM solves the one-dimensional Saint-Venant equations to model surface runoff and open-channel flow. Hydrology is represented by homogeneous subcatchments which receive precipitation or snowmelt and generate runoff, while hydraulics are modeled by routing that runoff through channels, storage units, and orifices. Dry-weather sewage flows are generated as flow inputs to junctions or subcatchments. Using a high-fidelity model for simulation rather than the linear system approximation used for making control decisions increases the generalizability to real-world performance.
Pystorms (Rimer et al. 2023) makes programming the training and evaluation relatively simple, as states and control actions are exposed through get and set operations. We also used pyswmm (McDonnell et al. 2020) to extract additional metadata (maximum storage depths) from the software model when this was necessary to define the costs of our objective function.
Water drains from left to right in an 11-basin urban drainage network. Basins 1, 4, 6, and 10 are observed and controlled. Circles are basins scaled by size, lines are streams connecting them, and the shaded regions are the corresponding contributing areas.
Estimating network connectivity from response data

True and inferred influence diagrams. Correctly inferred causative links are shown in green while spurious links are shown in red. Valves are prefixed by ‘V’ while storage basins are plain numbers. Note that dashed arrows do not indicate flows of water, but rather causative influence. For example, in the ‘True’ influence diagram valve V4 causes changes in the water level at storage basins 1 and 4, while none of the valves are affected by any of the water levels as they are control inputs.
Constructing influence diagrams from data requires inferring causation, which is an old (Berkeley 1734), well-studied (Freedman 2006), and perennially challenging (Pearl 2009) problem across diverse domains including finance (Brock 2018) and ecology (Sugihara et al. 2012). To develop these influence diagrams from data, we need estimates of the strength of each causative link and a method for choosing which links to include. We describe three existing methods for estimating the likelihood of a causative connection, which originate in fields including economics (Bressler & Seth 2011), ecology (Sugihara et al. 2012), and information theory (Barnett & Bossomaier 2012). We then describe a greedy graph construction method which uses these likelihoods to estimate the connectivity of the network. N.B. 'greedy' describes an algorithm which makes the locally optimal choice at each step of execution rather than considering the global problem or looking ahead to future steps.
Methods for inferring causation
We examine three methods for estimating the likelihood of a causative link existing between two variables in a multivariate timeseries: Granger causality (Bressler & Seth 2011), transfer entropy (Barnett & Bossomaier 2012), and convergent cross-mapping (Sugihara et al. 2012).
The first method, Granger causality (Bressler & Seth 2011), originates in econometrics and frames causation as predictive improvement: one variable is said to Granger-cause another if its past values improve the prediction of the other beyond what the other variable's own history provides. In practice, an autoregressive model of the candidate effect is fit with and without lagged values of the candidate cause, and an F-test on the reduction in residual error provides a significance value for the causative link.

The second method, transfer entropy (Barnett & Bossomaier 2012), comes from information theory. It measures the reduction in uncertainty about the next value of one timeseries gained by conditioning on the history of another, beyond the information already contained in the target's own history. Because it makes no assumption of linearity, transfer entropy can capture dependencies that regression-based tests miss, at the cost of requiring more data to estimate reliably.
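As a concrete illustration of the first of these tests, a Granger-style test can be sketched as a restricted-vs-unrestricted regression with an F-test. This is a minimal sketch of our own for illustration, not the implementation from Bressler & Seth (2011); the function name and default lag are ours.

```python
import numpy as np
from scipy import stats

def granger_pvalue(cause, effect, lag=2):
    """F-test: do lagged values of `cause` improve an autoregressive
    prediction of `effect`? A low p-value suggests a causative link."""
    n = len(effect)
    y = effect[lag:]
    # Restricted model: the effect's own lags only
    own = np.column_stack([effect[lag - k:n - k] for k in range(1, lag + 1)])
    # Unrestricted model: add the candidate cause's lags
    both = np.column_stack([own] + [cause[lag - k:n - k] for k in range(1, lag + 1)])

    def ssr(X):
        """Sum of squared residuals of an ordinary least-squares fit."""
        X = np.column_stack([np.ones(len(y)), X])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    df_den = len(y) - both.shape[1] - 1
    f = ((ssr(own) - ssr(both)) / lag) / (ssr(both) / df_den)
    return 1.0 - stats.f.cdf(f, lag, df_den)
```

Applied to a pair where one series drives the other with a one-step lag, the test returns a near-zero p-value in the causal direction and a large one in the reverse direction.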
The third method we evaluated is convergent cross-mapping (Sugihara et al. 2012; Tsonis et al. 2018), which is based on Takens' embedding theorem and rooted in dynamical systems theory rather than statistical inference. Takens' theorem proves that if one variable within a dynamical system can be observed, some of the properties of the entire system can be inferred by mapping that variable against lagged versions of itself across multiple dimensions. Essentially, the fully observed dynamical system would have some characteristic shape to its trajectories. A distorted version of that characteristic shape (precisely, a diffeomorphic shadow manifold) can be constructed by observing part of the system and plotting timeseries against lagged versions of themselves and other variables. Convergent cross-mapping essentially evaluates how coherent a phase portrait the trajectories of several variables form when projected across various dimensions and delay embeddings, and it provides a significance value and a correlation to quantify the likelihood of a causative link existing. We use the Python implementation provided in Javier (2021). In our context, one of the tests is: how coherent of a phase portrait do we get when tracing out the trajectories of (1) the flows out of valve 4 lagged by twenty timesteps on the x-axis against (2) the current depth in basin 1 on the y-axis?
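The shadow manifolds at the heart of this method are simple to construct. The sketch below is our own illustration of a Takens-style delay embedding (the study itself uses the implementation of Javier (2021)):

```python
import numpy as np

def delay_embed(x, dim=3, tau=5):
    """Takens-style delay embedding: row t of the result is the point
    [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)] on the shadow manifold."""
    return np.column_stack([x[(dim - 1) * tau - j * tau : len(x) - j * tau]
                            for j in range(dim)])
```

In convergent cross-mapping, nearest neighbors on one variable's shadow manifold are used to predict the other variable; how well those predictions correlate with reality, and how the correlation converges as more data are added, quantifies the causative link.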
Greedy edge selection for graph completion
We use a greedy edge selection algorithm and the estimated likelihoods of causal connections from the three methods to infer the connectivity of the UDS. A directed graph (e.g., Figure 3) is weakly connected if the corresponding undirected graph obtained by ignoring the direction of the edges is connected. Strong connection in a directed graph implies that one could move between any two nodes by following the directed edges. In Figure 3, all the influence diagrams are weakly connected, while ‘transfer entropy’ is the closest to being strongly connected. In constructing the estimates of network connectivity, we assume that all response data comes from a single urban drainage network which shares some common outfall or other form of linking. This assumption corresponds to a weakly connected graph.
We construct the network connectivity estimates by starting with a graph that has nodes for each timeseries variable and no edges. Edges (causative links) between variables are added by taking the strongest links one at a time (greedily) until the entire graph is weakly connected. For Granger causality and convergent cross-mapping, the strongest links are those with the lowest statistical significance value; for transfer entropy, the strongest links are those with the greatest values of transfer entropy.
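The procedure can be sketched as follows. This is a simplified illustration of our own (using a union–find structure to test weak connectivity), not the study's code; the function and variable names are assumptions.

```python
def greedy_connect(nodes, link_strength):
    """Add directed edges strongest-first until the graph, viewed as
    undirected, is connected (i.e., the directed graph is weakly connected).
    `link_strength` maps (src, dst) -> strength, e.g. -p or transfer entropy."""
    parent = {n: n for n in nodes}

    def find(n):                      # union-find root with path compression
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    edges, components = [], len(nodes)
    for src, dst in sorted(link_strength, key=link_strength.get, reverse=True):
        edges.append((src, dst))      # greedily accept the strongest link
        a, b = find(src), find(dst)
        if a != b:                    # this link merges two weak components
            parent[a] = b
            components -= 1
        if components == 1:           # weakly connected: stop adding edges
            break
    return edges
```

Note that, consistent with the greedy rule in the text, a strong link is accepted even if it does not merge two components; the loop only stops once weak connectivity is reached.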
Model discovery for urban drainage systems with limited observability
Simulating the LTI system approximations forward given the valve flows as input. Valve flows are the same as in Figure 2. ‘Actual’ is the high-fidelity software model output while the LTI system approximations are identified by the system connectivity estimate they use.
We will explain the method for generating the LTI approximation in reference to the case study system in Figure 1. The states are the depths in storage basins 1, 4, 6, and 10, and the control inputs are the flows through the four corresponding controllable valves. Rain is not observed or modeled. This is a common choice in designing real-time controllers and may increase robustness by eliminating vulnerability to inaccurate forecasts (Wong & Kerkez 2018; Troutman et al. 2020; Rimer et al. 2023; Van der Werf et al. 2023c).
The algorithm generates an optimal linear, time-invariant representation of the system through a double optimization loop. To keep this manuscript to a reasonable length, we only summarize the algorithm here; for details, please see Dantzer & Kerkez (2023) and Dantzer & Kerkez (2024). The inner loop uses regression (de Silva et al. 2020) to estimate the coefficients of each equation to best fit the derivative. The outer loop changes the shapes of the input transformations to achieve better accuracy in the regression of the inner loop. The transformations are unique to each input–output pair and are not shared across the system; that is, the transformation applied to a given input in the equation for one state is not the same as the transformation applied to that input in the equation for another state.
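A much-simplified sketch of this double loop is given below, with a pure input delay standing in for the transformation "shape" searched by the outer loop. This is our own illustration under that simplifying assumption; the actual algorithm in Dantzer & Kerkez (2023, 2024) is richer, and all names here are ours.

```python
import numpy as np

def fit_row(h, u, dt, delay):
    """Inner loop: for a candidate input delay (the transformation 'shape'
    chosen by the outer loop), regress the finite-difference derivative of
    depth h on [h, delayed u]. Returns (coefficients, sum squared residual)."""
    dh = np.diff(h) / dt
    X = np.column_stack([h[:-1], np.roll(u, delay)[:-1]])  # roll wraps; crude
    coeffs, *_ = np.linalg.lstsq(X, dh, rcond=None)
    resid = dh - X @ coeffs
    return coeffs, resid @ resid

def fit_with_shape_search(h, u, dt, max_delay=10):
    """Outer loop: pick the input transformation (here, a pure delay) that
    minimizes the inner loop's regression residual."""
    best = min(range(max_delay + 1), key=lambda d: fit_row(h, u, dt, d)[1])
    return best, fit_row(h, u, dt, best)[0]
```

On synthetic data generated with a known delay and coefficients, the search recovers both exactly, which is the behavior the full algorithm aims for on real response data.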
Having identified the LTI system approximation, we now design and tune a feedback controller based on that approximation.
Linear-quadratic regulation (LQR) of UDS with precompensation
Once the linear representation of the urban drainage system is constructed, we use LQR controller design (Hespanha 2009; Williams & Lawrence 2020) to determine the valve opening percentages. This approach uses the desired system states (10% full) and the current states to calculate the optimal command at each valve or weir. The desired system state is a tuning parameter, here specified as 10% full to completely capture small storms without significantly enhancing flood risk.
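This design step can be sketched with SciPy's Riccati solver. The snippet below is our own illustration of standard discrete-time LQR with Bryson's Rule, not the study's code; the function and argument names are assumptions.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_gain(A, B, max_states, max_inputs):
    """Bryson's Rule: penalize each state and input by the inverse square of
    its maximum allowable value (e.g., maximum depth, maximum valve flow),
    then solve the discrete-time algebraic Riccati equation for the gain K
    used in u = -K (x - x_desired)."""
    Q = np.diag(1.0 / np.asarray(max_states, float) ** 2)
    R = np.diag(1.0 / np.asarray(max_inputs, float) ** 2)
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```

For a controllable pair (A, B) with positive-definite Q and R, the resulting closed loop x⁺ = (A − BK)x is guaranteed stable, which underlies the safety guarantees discussed later.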
State estimation and feedback loop
Closing the feedback loop requires the definition of an estimator, as the augmentation of the state described in Section 2.3 means that not all states in the system approximation are observable (because they are inferred). We therefore need to update the unobservable states using the measurements which are available. This is achieved by a linear-quadratic estimator (LQE) (Hespanha 2009), whose design is dual to that of the linear-quadratic regulator. The cost function is similar to Equation (10) but relies on an estimate of the relative magnitudes of system approximation error and sensor noise rather than output or control penalties. LQEs (also known as Kalman filters) reject random noise well, asymptotically converge to zero-error state estimates, and have already seen some application in urban drainage systems (Bartos & Kerkez 2021; Baumann et al. 2022; Oh & Bartos 2023). The estimator gain matrix is denoted L.
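Because of this duality, the estimator gain can be computed with the same Riccati solver used for the regulator. The sketch below is our own illustration with assumed isotropic noise covariances, not the study's code:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqe_gain(A, C, model_error_var, sensor_noise_var):
    """Steady-state Kalman/LQE gain by duality with LQR. The two variances
    encode the assumed relative magnitudes of system approximation error
    (process noise) and sensor noise."""
    W = model_error_var * np.eye(A.shape[0])    # process noise covariance
    V = sensor_noise_var * np.eye(C.shape[0])   # sensor noise covariance
    P = solve_discrete_are(A.T, C.T, W, V)
    # Predictor-form gain: guarantees the error dynamics (A - L C) are stable
    return A @ P @ C.T @ np.linalg.inv(C @ P @ C.T + V)
```

Raising the assumed model-error variance relative to the sensor-noise variance makes the estimator trust measurements more, and vice versa, which is exactly the tuning knob described above.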
The software model accepts the valve open percentage commands and simulates forward 5 min. After 5 min have elapsed, the software model gives its current measurements to the estimator and the process begins again at Equation (16).
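The full loop can be illustrated end-to-end on a toy linear plant standing in for the pystorms/EPA-SWMM interface. Everything below (the matrices, gains, and horizon) is invented for illustration; in the study, the plant is the high-fidelity software model and only the sensed depths are visible to the controller.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy two-basin plant: the upstream basin drains into the downstream one,
# valves drain their own basin, and only the downstream depth is sensed.
A = np.array([[0.9, 0.0], [0.08, 0.92]])
B = np.array([[-0.1, 0.0], [0.1, -0.1]])
C = np.array([[0.0, 1.0]])

# LQR gain
P = solve_discrete_are(A, B, np.eye(2), 0.1 * np.eye(2))
K = np.linalg.solve(0.1 * np.eye(2) + B.T @ P @ B, B.T @ P @ A)
# LQE (predictor-form) gain
Pe = solve_discrete_are(A.T, C.T, 0.01 * np.eye(2), 0.01 * np.eye(1))
L = A @ Pe @ C.T @ np.linalg.inv(C @ Pe @ C.T + 0.01 * np.eye(1))

x = np.array([1.0, 1.0])        # true (hidden) basin depths: start full
x_hat = np.zeros(2)             # estimator starts knowing nothing
for _ in range(300):            # each pass ~ one 5-min control step
    u = -K @ x_hat                                   # command from estimate
    y = C @ x                                        # sensor measurement
    x_hat = A @ x_hat + B @ u + L @ (y - C @ x_hat)  # predict + correct
    x = A @ x + B @ u                                # plant advances 5 min
```

By the separation principle, the regulator and estimator gains can be designed independently, and the combined loop drives both the state and the estimation error toward zero.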
Evaluation scenarios and metrics
The connectivity inference methods are evaluated by their sensitivity and specificity. Sensitivity is the number of true positives divided by the number of true positives and false negatives. In this context, sensitivity measures the detection of links which are present. Specificity is the number of true negatives divided by the number of true negatives and false positives. In this way, specificity measures the rejection of links which are not present. In this scenario, a true positive indicates that a connection exists and is correctly identified. A false positive means the method inferred a connection which does not exist (e.g., Basin 10 to Basin 6 in Granger, Figure 3). A false negative means the method missed a real connection. Last, a true negative indicates the method correctly omitted a connection which was not present.
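These scores can be computed directly from the sets of true and inferred links. The function below is a simple illustration with our own names; `candidate_links` is every directed link that could possibly exist.

```python
def sensitivity_specificity(true_links, inferred_links, candidate_links):
    """Score an inferred influence diagram against the true one."""
    tp = len(true_links & inferred_links)      # real links correctly found
    fn = len(true_links - inferred_links)      # real links missed
    negatives = candidate_links - true_links
    fp = len(inferred_links - true_links)      # spurious links added
    tn = len(negatives - inferred_links)       # absent links correctly omitted
    return tp / (tp + fn), tn / (tn + fp)
```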
Five control cases (uncontrolled, true connectivity, Granger causality, CCM, and transfer entropy) are evaluated on the case study system (Figure 1) for 190 statistically derived design storms (Pratt 2016; Balbastre-Soldevila et al. 2019; National Weather Service 2023) varying in length from 5 min to 60 days and in recurrence interval from 1 year to 1,000 years. As Michigan's climate is humid, these events are large, with the 24-h, 1,000-year storm delivering nearly 20 cm of rain. The average total annual precipitation (rain and snow water equivalent) for the study location is 97 cm. The length of the simulation is scaled to the length of the storm such that the 5-min through 2-h storms are simulated for 1 week, the 3-h through 7-day storms are simulated for a month, and any storms longer than a week are simulated for two and a half months.
We examine three performance criteria aimed at quantifying flooding safety (maximum filling degree), as well as water quality and ecological outcomes (flow threshold exceedance and sediment export).
Data and software availability
All data and software to replicate the experiments described in this work as well as additional figures are freely provided at https://github.com/dantzert/blind_swmm_controller_gamma.
RESULTS AND DISCUSSION
We examine results evaluating each step of the method: network connectivity inference, system approximation fidelity, and controller performance.
Network connectivity inference
Granger causality (Figure 3, Table 1) best matched the true influence diagram as measured by sensitivity and specificity. Transfer entropy also included all seven true links, but included more false positives (12) than Granger (6). Convergent cross-mapping only included six false positives, but missed the links between valve 4 and storage basin 1 and between valve 10 and storage basin 4. This means CCM used decentralized control on storage basins 1 and 10, as inflows were not coordinated with outflows. Observing a longer or more complex characterization experiment may yield more accurate estimations of the network connectivity. The simulation fidelity results (Figure 4) suggest that accuracy ratios as presented in Table 1 are not sufficient to predict which models will best approximate the system. It seems that some of the false links included in the Granger connectivity are particularly misleading relative to the false links included in the other models. That is, which links are incorrectly inferred may be more important than how many.
Connectivity inference accuracy of the three methods examined
| | Granger | Transfer entropy | CCM |
| --- | --- | --- | --- |
| Sensitivity | 7/7 | 7/7 | 5/7 |
| Specificity | 15/21 | 9/21 | 15/21 |
Summary of flood prevention and treatment performance. Storm durations are indicated on the x-axes while recurrence intervals are shown on the y-axes. The first row indicates maximum filling degree (Equation (19)) at any asset over the course of a storm, where 1.0 is the maximum value (black contour) and indicates flooding at one or more assets, while the white contour indicates an asset being half full. The second row shows flow threshold exceedance (Equation (20)). The third row indicates the mass of total suspended solids leaving the catchment through valve 1 (Equation (21)). The white and black contours are 0.1 and 5 kg of total suspended solids, respectively.
System approximation fidelity
Figure 4 shows the four LTI system approximations simulated forward over the characterization experiment given the valve flows as input. Because Granger causality best identified the connectivity of the network, we expected the system approximation derived from that connectivity to best replicate the dynamics of the system. However, it learns a model which quickly diverges from reality for basin 4. Transfer entropy and CCM remain visually close to the high-fidelity model for over a week at all assets. The LTI approximation built using the true connectivity of the system closely replicates the dynamics of the high-fidelity software model using three matrices (Equation (8)). For the true connectivity system approximation, much of the error at basins 6 and 10 arises because the surface area of the storage asset changes with depth and the LTI model has no way to account for this nonlinearity. Results are better at basins 1 and 4 because the surface area varies less with storage depth.
The richness of the characterization experiment in this case study primarily serves the connectivity inference algorithms. When the connectivity was already known, two 1-h releases (24 5-min timesteps) from each storage basin were sufficient to generate an LTI system approximation accurate enough for safe feedback control. The precise lower bound of data required for training models is an open question which we expect to depend considerably on domain expertise. Theoretically, the immediate impact of a valve's flow on the depth of the basin it is attached to is captured by a single parameter, and so could be estimated from observations of just two timesteps. Speaking to scalability, large data records can also be used for training, as the LTI system generation algorithm is quite efficient. Apart from control applications, these results also suggest the applicability of the method for generating computationally efficient surrogates of more detailed, process-based UDS models.
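To make the two-timestep idea concrete with hypothetical numbers (all values below are invented for illustration): if a constant release of 0.2 m³/s, with no other inflow, drops a basin's depth by 6 mm over one 300-s timestep, the single response parameter follows immediately:

```python
u, dt = 0.2, 300.0          # valve flow (m^3/s) and timestep (s), assumed
d0, d1 = 1.000, 0.994       # depths (m) at two consecutive timesteps
b = (d1 - d0) / (u * dt)    # depth change per unit flow: -1 / surface area
area = -1.0 / b             # implied effective surface area (m^2)
```

Here the recovered parameter corresponds to an effective surface area of about 10,000 m², illustrating how little data the most local dynamics require in principle.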
Performance across design storms
The response to the 24-h, 100-year storm (12.9 cm total depth) is shown for the five control conditions as it is the local stormwater design standard (Pratt 2016).
The controllers based on Granger- and transfer entropy-inferred connectivities have strategies similar (Figure 6) to the true-connectivity controller at storage basin 1, while CCM displays more flood aversion. Responses are similar at basin 4, except for Granger, which floods momentarily. CCM is the only inferred controller that correctly models that storage basin 6 is oversized relative to its contributing area and so should not release any flows until after the other assets have drained. Note that even the most constraint-violating behavior of the inferred controllers does a far better job of respecting the erosion prevention flow constraints than the uncontrolled case (Figure 5).
Controller performance is predictable from system approximation fidelity, but (surprisingly) not from connectivity inference accuracy. Granger has the worst system approximation fidelity and the worst controller performance; CCM and transfer entropy have better approximation fidelity and better performance. Though how a controller will fail is more difficult to predict, there is some indication (Figure 4) that Granger has a particularly poor model of the depth in basin 4, which might make flooding more likely at that asset. Though caution is always necessary when deploying new control assets, there are numerous results proving the safety and predictability of LTI controllers if the system approximation has sufficient fidelity (Hespanha 2009; Williams & Lawrence 2020). That is, given sufficient system approximation fidelity, controllers designed according to formal techniques such as the linear-quadratic regulator have safety and robustness guarantees without direct analogy in less analytical approaches such as reinforcement learning.
We conclude the discussion with a brief guide on how these (freely shared) tools can be applied to novel data sets and systems. First, if the connectivity is already known, the connectivity inference should be run to check this assumption. If the initially assumed connectivity is verified by the connectivity inference results, it is only necessary to train a single system approximation based on that verified connectivity. If the initially assumed connectivity is not reflected in the inference results, re-examining these assumptions and further investigating the system may be advisable. If the connectivity is unknown, compare the fidelity of all three system approximations using domain-relevant error metrics. Which system approximation is best or if any are sufficiently accurate to apply feedback control is best judged using domain expertise.
Once the user has judged one of the system approximations to have sufficient fidelity for feedback control, tuning has an intuitive starting point and proceeds iteratively. As in Section 2.4, the initial cost matrices Q and R can be fully specified by determining maximum allowable values for the observable states and the control inputs. In this case study, the states had maximum allowable depths and the inputs had maximum allowable flows, but the states and inputs could generally be any measurable or controllable variables in the system. With some algebraic manipulation, allowable ranges could also be specified rather than maximum values. After responses to this initial tuning are observed and critiqued, LQR design offers intuitive control of the system's dynamics, as noted in Section 2.4. For example, if after observing Figure 6 it is desired to retain more water in basin 10, the penalty on depth in basin 10 could be decreased in Q, or the desired depth in basin 10 could be increased, with slightly different effects on the dynamics. If more complex control of the system response is desired, there is a multitude of control algorithms applicable once the linear, time-invariant system approximation has been defined. For example, integral feedback control can achieve steady-state tracking of a desired state despite modeling error, while loopshaping allows designers to manipulate the frequency-domain response characteristics of their systems (Hespanha 2009; Williams & Lawrence 2020).
CONCLUSIONS
These results suggest that the system knowledge required for automatically generating tunable controllers of urban drainage systems may be surprisingly basic. The most informed system approximation (‘True’ in Figures 5 and 6) receives an influence diagram of how assets are connected and data from a short characterization experiment. This basic information is used to nearly perfectly capture the system behavior in three matrices (Figure 4).
This study demonstrated the automatic generation of tunable linear feedback controllers using only response data. This allows near-turnkey synthesis of controllers solely from sensor data or reduction of process-based models. Future work should examine how to improve the network connectivity estimation algorithms in the context of urban drainage, investigate how computational expense scales with larger systems, and explore more sophisticated state-space control tools such as integral feedback or disturbance rejection.
DATA AVAILABILITY STATEMENT
All data and software necessary to reproduce the study are freely provided at the link indicated in Section 2.7 (Data and software availability).
CONFLICT OF INTEREST
The authors declare no conflicts of interest.