## Abstract

A significant challenge when attempting to regulate the spatial-temporal concentration of a disinfectant in a water distribution network is the large and uncertain delay between the time that the chemical is injected at the input node and the time that the concentration is measured at the monitoring output nodes. Uncertain time delays are due to varying water flows, which depend mainly on consumer water demands. Existing approaches cannot guarantee that the concentration of the disinfectant will remain within a specified range at the output, even though bounds on time-delay uncertainty may be known. In this work, given bounded water-flow uncertainty, we use the input–output modeling approach to develop a disinfectant scheduling methodology that guarantees a bounded output disinfectant concentration. The proposed methodology creates an input–output model uncertainty characterization by utilizing estimated bounds on water-quality states using the backtracking approach. An optimization problem is formulated and solved to find an input schedule that keeps the disinfectant concentration within predefined bounds for a specified time horizon. Simulation results in two case studies where water demands varied between ±20% of their nominal value show that the proposed scheduler is able to avoid lower bound violations of disinfectant concentration.

## HIGHLIGHTS

A methodology for calculating the disinfection chemical input in water distribution networks is presented that considers the uncertainty in water flows.

The methodology guarantees bounded disinfectant concentration at monitored locations, given bounds on water flows for a time horizon.

Results on an example network show robust performance under high demand uncertainty, in scenarios with abrupt changes, and flow reversals.

## 1. INTRODUCTION

There is currently significant research interest in real-time monitoring and regulation of drinking water quality, especially after a recent EU Directive (European Union 2020), which encourages continuous monitoring of water-quality parameters in water distribution networks (WDN). According to the World Health Organization (WHO 2017), a disinfectant residual needs to be sustained throughout drinking water networks, such that it is sufficient to deactivate waterborne pathogens and, at the same time, small enough to reduce the formation of harmful disinfection byproducts (DBPs), such as trihalomethanes (THMs) and haloacetic acids (HAA) (Mouly *et al.* 2010).

The effective regulation of water quality in WDN is based on the use of mathematical models that facilitate the estimation of disinfectant concentration in the water at different parts of the network (Elsherif *et al.* 2022). This is a challenging task, since water-quality models are inaccurate and complex, especially in large-scale networks (Frankel *et al.* 2023). This is why heuristic algorithms (Li *et al.* 2021) and machine learning models (Sun *et al.* 2019; Li *et al.* 2023) have been used for the detection and source identification of contaminants or other water-quality issues in water networks. A significant challenge when regulating the concentration of a disinfectant in a WDN is the large delay between the time that a chemical is injected at the input node and the time that the concentration is measured at the monitoring output nodes. Moreover, the time delays are time-varying and uncertain due to varying pipe flows, which depend mainly on consumer demands that are an uncontrolled and typically unknown input to the system (Eliades *et al.* 2023). In addition, the disinfectant reaction dynamics are usually only partially known and the reaction parameters may vary due to the use of water originating from diverse sources or due to fluctuations in temperature (Monteiro *et al.* 2017).

One of the first studies addressing the challenge of large and varying time delays when controlling water quality is by Polycarpou *et al.* (2002), where the proposed input–output (I–O) model of disinfectant concentration is used to adaptively learn periodic parameters to characterize the time delay. The input–output approach can be thought of as a model-reduction technique that overcomes the challenge of using a small number of measurements compared to the system states, to design a water-quality control algorithm (Wang *et al.* 2006). More recently, Wang *et al.* (2021) derived a state-space model of chlorine concentration evolution in WDN, which traces over time all water-quality states; i.e., the concentration at all the links and nodes of the network. To do this, they solve the hydraulic equations to obtain the steady-state flow solution for a hydraulic step, and then create the state-space equations, which are time-varying due to the varying flows. The drawback of this approach in practice is the lack of observability for estimating the complete state vector, and the need for using model-reduction techniques to calculate a control law. A simulation-based model is used in Xie & Brdys (2015) to compute hydraulics and water quality, while nonlinear model predictive control (MPC) within a hierarchical control framework is then used to control disinfectant levels. The authors in Xie *et al.* (2018) use a multi-input multi-output modeling approach, which also considers the formation of DBPs.

The aforementioned approaches, however, cannot guarantee that the concentration of a chemical will remain within a specified range at the output, even though bounds on water-flow uncertainty, and, by extension, the water-quality input–output time-delay uncertainty, may be known. The quantification of water-quality state uncertainty is typically achieved through Monte Carlo Simulations (MCS) of hydraulic scenarios, due to the model complexity in large-scale networks (Hart *et al.* 2019). However, this approach is limited by computational constraints in large systems. Thus, researchers have looked into alternative methods, such as spectral propagation (Braun *et al.* 2020). Recently, an algorithmic approach has been proposed by the authors that, as opposed to MCS, guarantees the validity of the calculated water-quality bounds (Vrachimis *et al.* 2021). Given these developments, a water-quality regulation approach is needed that is not reliant on MCS to maintain water-quality states within predefined value sets.

In this work, we use the input–output modeling approach to develop a disinfectant injection methodology that guarantees a bounded output disinfectant concentration, given bounds on water-flow uncertainty. The proposed methodology creates an input–output model which considers time-delay uncertainty by utilizing recent results on estimating bounds on water-quality states using the backtracking approach (Vrachimis *et al.* 2021). Following the model predictive scheduling framework, an optimization problem is formulated and solved to find an input schedule that keeps the disinfectant concentration within predefined bounds for a specified time horizon. Simulation results illustrate the effectiveness of this approach in an example network case study.

## 2. PROBLEM FORMULATION

The following notation is followed throughout this work: matrices are denoted with capital letters, vectors with lower bold letters, and scalars with italic letters. Sets and graphs are denoted by calligraphic capital letters. Estimated parameters are denoted with a hat; i.e., denotes the estimated values of the parameter vector . Uncertain parameters are represented by a continuous interval of values defined by a lower and upper bound. Intervals are accompanied by a tilde as follows: , where is the lower bound vector and is the upper bound vector, such that: , and is the size of the vector.
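The interval notation maps naturally onto a small data structure. The Python sketch below is illustrative only: the class name `Interval` and the helper `around` are hypothetical and are not part of the methodology; they simply show an uncertain parameter stored as a lower and an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lower, upper] for one uncertain scalar parameter."""
    lower: float
    upper: float

    def __post_init__(self):
        if self.lower > self.upper:
            raise ValueError("lower bound must not exceed upper bound")

    def contains(self, value: float) -> bool:
        return self.lower <= value <= self.upper

    def __add__(self, other: "Interval") -> "Interval":
        # Sum of two intervals: add the bounds pairwise.
        return Interval(self.lower + other.lower, self.upper + other.upper)


def around(nominal: float, rel: float) -> Interval:
    """Interval of +/- rel (e.g. 0.2 for 20%) around a nominal value."""
    delta = abs(nominal) * rel
    return Interval(nominal - delta, nominal + delta)


# Hypothetical nodal demand of 10 units with 20% uncertainty.
demand = around(10.0, 0.2)
```

An uncertain vector can then be held as a list of such intervals, one per element.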

Consider a WDN whose topology is modeled by a directed graph denoted as . Let be the set of all nodes, where is the total number of nodes. These represent junctions of pipes, consumer water demand locations, reservoirs, and tanks. Moreover, let represent the tank nodes, where is the total number of tanks. Finally, let be the set of links, where is the total number of links. These represent network pipes, water pumps, and pipe valves.

The hydraulic state associated with a node at each time instant is the *hydraulic head*, denoted by . The hydraulic state associated with a link is the *water flow*, denoted by . In general, the complete set of hydraulic states can be expressed as . Each node is associated with a water consumer demand at the node location, denoted by . Water demands drive the hydraulic dynamics of a WDN and in this work are considered as an unknown and uncontrolled input to the system. It is noted that some demands may be zero if the node is not associated with any consumers. Background leakages that may exist in the network are modeled as part of the unknown water demands. Pumps and valves are the main hydraulic actuators in a WDN and are modeled at the network links. Their control settings are indicated using the hydraulic input vector .

Specific hydraulic states are measured at regular time intervals using hydraulic sensors installed in the network. The interval is in the range of minutes, and hydraulic analysis is typically performed in discrete time. The measured states may be flow, pressure, or tank level and are indicated by the hydraulic output vector , where indicates the discrete hydraulic time-step associated with time intervals .

Water quality refers to the concentration of various chemical substances and biological species in drinking water that may affect human health. In general, the concentration of these substances in the water should be within predefined bounds, defined by regulating authorities. Disinfectants are the only controllable substance in the water. The existence of a sufficient concentration level of disinfectant in the water indicates that any infectious substances have reacted with the disinfectant and have been neutralized. In this work, water-quality state refers to the concentration of a disinfectant in the water, specifically at the nodes of the network. The water-quality state vector at each time instant is denoted by .

Water quality is regulated using chemical dosing pumps, designed to inject specific amounts of disinfectant into the water at a set of network nodes , where is the total number of controlled inputs. Water quality is monitored using sensors (e.g., chlorine sensors) installed at nodes , where is the total number of quality sensors. Control and measurement signals are transmitted at discrete time-steps . For simplicity, in this work, we assume that is equal to the hydraulic step, i.e., ; to ensure the stability of the solver, the water-quality time step must be sufficiently small compared to the water transport times in the network pipes (Rossman 2000). The water-quality control input vector is denoted by , while the measurement vector of quality states is denoted by .

## 3. STATE ESTIMATION METHODOLOGY

### 3.1. Hydraulic-state estimators

A *hydraulic model* is a set of equations that describe the hydraulic dynamics. These equations are formulated considering the conservation of energy and mass laws (Vrachimis *et al.* 2019). The relationship between the network topology and the physical parameters denoted by the vector (pipe roughness, length, diameter, node elevations, tank dimensions) is also described by the model. Hydraulic parameters are typically time-invariant for the time scales considered in this work. The hydraulic model used in this work is indicated by .

The installed hydraulic sensors measuring flow and pressure in the network typically do not satisfy hydraulic-state observability conditions (Díaz *et al.* 2017). This is because water demands are typically unknown and only an approximate pattern of consumption at nodes is available *a priori*. The approximate consumption patterns are used as pseudo-measurements to achieve hydraulic-state observability. Due to the high uncertainty of pseudo-measurements, an interval of possible demand values for each node is used at each time step instead, indicated by . Moreover, the network’s physical parameters are not exactly known. An estimate of these parameters is used instead, indicated by . The uncertainty of these parameters is considered bounded in this work, thus the range of values they may take is indicated by the interval vector , with .

*hydraulic-state interval estimator* (HSIE) (Vrachimis *et al.* 2019). This is a function that uses the hydraulic model containing parametric uncertainty , and demand uncertainty , to calculate an interval of values for the hydraulic state as follows:

### 3.2. Water-quality estimator considering flow uncertainty

A water-quality model describes the dynamics of species transport and reaction in the water. Transport refers to the movement of substances in water due to the water flow in pipes. Substances can be tracked throughout the network when water flows are known for a number of time-steps, referred to as the *memory* of the system . This is defined as the maximum number of discrete time-steps needed for any input node to affect any output and can be computed given the minimum flow rate of pipes during the system operation.
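The memory calculation can be sketched in Python under simplifying assumptions: plug flow in circular pipes, a known list of candidate input-to-output paths, and a known minimum flow for each pipe. The function names and the pipe data are hypothetical.

```python
import math


def pipe_travel_time(length_m, diameter_m, min_flow_m3s):
    """Longest travel time through one pipe (seconds), reached at minimum flow."""
    if min_flow_m3s <= 0:
        raise ValueError("minimum flow must be positive for a finite travel time")
    volume_m3 = math.pi * (diameter_m / 2) ** 2 * length_m
    return volume_m3 / min_flow_m3s


def system_memory(paths, step_s):
    """Number of discrete time-steps covering the slowest input-to-output path.

    paths:  list of paths, each a list of (length_m, diameter_m, min_flow_m3s)
    step_s: duration of one time-step in seconds
    """
    slowest = max(sum(pipe_travel_time(*pipe) for pipe in path) for path in paths)
    return math.ceil(slowest / step_s)


# Two hypothetical paths: one pipe, and two identical pipes in series.
pipe = (100.0, 0.2, 0.001)
memory = system_memory([[pipe], [pipe, pipe]], step_s=300.0)
```

Because travel time grows as flow drops, using the minimum flow of each pipe gives the largest delay and therefore a conservative memory length.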

Many substances, such as various disinfectants, react with other substances in the water and materials attached to pipe walls. Thus, reaction dynamics need to be taken into account when estimating their concentration in the water. Reaction rates are typically unknown due to the multiple species existing in the water and the different characteristics of the water inserted in the network. We assume the availability of bounds on the uncertain estimates of disinfectant reaction rates in the water and with pipe walls. The range of disinfectant reaction rates is indicated by the interval vector . Finally, the water-quality model is indicated by .
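To illustrate how bounded reaction rates and bounded travel times combine, the Python sketch below assumes first-order decay, a common simplification for chlorine that is not necessarily the reaction model used here. The function name and the numbers are hypothetical.

```python
import math


def decay_bounds(c0, k_bounds, t_bounds):
    """Bounds on c0 * exp(-k * t) for k and t each known only within an interval.

    c0:       non-negative injected concentration
    k_bounds: (lowest, highest) first-order decay rate, non-negative
    t_bounds: (shortest, longest) travel time, same time unit as 1/k
    """
    k_lo, k_hi = k_bounds
    t_lo, t_hi = t_bounds
    if c0 < 0 or k_lo < 0 or t_lo < 0 or k_lo > k_hi or t_lo > t_hi:
        raise ValueError("expected c0 >= 0 and ordered, non-negative bounds")
    # Decay is monotone in both k and t, so the extremes sit at the corners.
    lower = c0 * math.exp(-k_hi * t_hi)   # fastest decay, longest travel
    upper = c0 * math.exp(-k_lo * t_lo)   # slowest decay, shortest travel
    return lower, upper


# Hypothetical values: 1.0 mg/L injected, rate 0.1-0.2 per hour, 1-2 hours travel.
low, high = decay_bounds(1.0, (0.1, 0.2), (1.0, 2.0))
```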

*et al.* (2021), which provides bounds on disinfectant concentration at monitored sensor nodes. For convenience, we denote the set of previous time-steps included in the memory of the system as . The BUBA calculates disinfectant concentration bounds , given the vectors of past inputs and the vectors of uncertain water flows , as follows:

## 4. DISINFECTION SCHEDULING METHODOLOGY

### 4.1. Past input effect calculation considering flow uncertainty

*et al.* 2009), the reference bounds as follows:

### 4.2. Input–output model of water quality

*et al.* (2002), be written as follows: where is the disinfectant concentration at a monitored node at sampling instant , is the forced disinfectant concentration at the input node, is the number of tanks in the water network, is the maximum water transport delay from input to output node, is the modeling and time-discretization uncertainty. Parameters and are time-varying coefficients corresponding to and , respectively, and depend on the hydraulic dynamics of the network as well as the disinfectant decay rates.

*et al.* 2021) can be utilized to calculate the set of delays by which the input affects the output. The following relationship then describes the input–output water-quality model of a WDN without tanks:
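As a rough illustration of a delay-based input–output relation of this kind, the Python sketch below assumes the output at a given step is a weighted sum of past inputs, with one term per delay in the delay set. The function name, the argument layout, and the numbers are hypothetical.

```python
def io_output(k, inputs, delay_coeffs):
    """Output at step k as a weighted sum of delayed inputs.

    inputs:       input concentrations u[0], u[1], ..., indexed by time-step
    delay_coeffs: {delay d: coefficient valid at step k}; the coefficient
                  lumps together mixing fractions and decay along the route
    """
    total = 0.0
    for delay, coeff in delay_coeffs.items():
        source_step = k - delay
        if 0 <= source_step < len(inputs):   # ignore inputs outside the record
            total += coeff * inputs[source_step]
    return total


# Hypothetical case: the output at step 3 is fed by the inputs of steps 1 and 0.
u = [1.0, 0.5, 0.0, 0.0]
y3 = io_output(3, u, {2: 0.6, 3: 0.3})
```

In a time-varying network the dictionary of coefficients changes from step to step, which is what makes the delay set uncertain when flows are uncertain.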

### 4.3. Impulse response

*et al.* 2006). The IR method calculates, using a simulation model of the hydraulics and/or water-quality dynamics of the water system, the output when and for . The I–O model coefficient is then equal to the calculated output . The complete I–O model of (6) is derived by calculating the coefficients for all . The I–O model can be used to estimate future values of the output as follows: where , and is defined as the Impulse Response matrix for the prediction horizon :

An alternative method to calculate the IR matrix is by using the backtracking approach, where a water parcel is backtracked through the network to find the input source (Zierolf *et al.* 1998). Using this approach, the set of paths which bring water into the output node at time-step can also be identified. Path is defined as the set of links that the water parcel, currently arriving at the output, has traversed since it left an input node. Each identified path can be regarded as a separate simulation scenario and can be used to calculate an impulse response, using the same logic as in (8). The impulse response matrix for each path is referred to as a *path-IR matrix*. Assuming complete mixing of chemicals in the water, the superposition of the path-IR matrices results in the IR matrix of (8):
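The impulse-response construction and the superposition of path-IR matrices can be sketched in Python as follows. This is a toy under stated assumptions: `simulate` is a hypothetical stand-in for a hydraulic and water-quality simulator, each path is reduced to a fixed delay and gain, and the system is linear in the input.

```python
def impulse_response_matrix(simulate, horizon):
    """Build an IR matrix column by column from unit-impulse simulations.

    simulate: callable mapping an input sequence to an output sequence of
              the same length (hypothetical simulator interface)
    """
    columns = []
    for j in range(horizon):
        impulse = [0.0] * horizon
        impulse[j] = 1.0                     # unit injection at step j only
        columns.append(simulate(impulse))
    # Entry (i, j): output at step i caused by the impulse injected at step j.
    return [[columns[j][i] for j in range(horizon)] for i in range(horizon)]


def superpose(matrices):
    """Element-wise sum of path-IR matrices (complete mixing assumed)."""
    size = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(size)]
            for i in range(size)]


def predict(matrix, inputs):
    """Predicted output sequence: matrix-vector product."""
    return [sum(h * u for h, u in zip(row, inputs)) for row in matrix]


def toy_path(delay, gain):
    """Toy path model: the output is the input delayed and scaled."""
    def simulate(inputs):
        return [gain * inputs[k - delay] if k >= delay else 0.0
                for k in range(len(inputs))]
    return simulate


paths = [toy_path(delay=1, gain=0.5), toy_path(delay=2, gain=0.3)]
ir = superpose([impulse_response_matrix(p, 4) for p in paths])
outputs = predict(ir, [1.0, 1.0, 1.0, 1.0])
```

With a constant unit input, the output rises as each path's water arrives, which is visible in the lower-triangular structure of `ir`.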

### 4.4. Impulse response with flow uncertainty

*uncertain backtracking impulse response* (UBIR) algorithm. The algorithm also provides the set of coefficients associated with each delay in :

The elements of an uncertain path-IR matrix for a path are then given by:

The overall uncertain IR matrix is given by the superposition of the uncertain path-IR matrices:

The uncertain IR matrix is essentially a superposition of all the IR matrices , given the considered input time-delay uncertainty. Note that this matrix cannot be used as is for the estimation of the output.
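One way to see why the superposed matrix cannot be multiplied directly with the input is the toy Python sketch below. It assumes a single path whose delay is known only to lie in a small set, with one coefficient per delay; in any single flow scenario only one of those delays is realised at a given step. The functions and numbers are hypothetical and are not the UBIR algorithm itself.

```python
def uncertain_path_matrix(delay_coeffs, horizon):
    """Matrix holding a coefficient for every possible delay of one path."""
    matrix = [[0.0] * horizon for _ in range(horizon)]
    for delay, coeff in delay_coeffs.items():
        for k in range(delay, horizon):
            matrix[k][k - delay] = coeff
    return matrix


def naive_output(matrix, inputs):
    """Plain matrix-vector product: counts every possible delay at once."""
    return [sum(h * u for h, u in zip(row, inputs)) for row in matrix]


def worst_case_output(delay_coeffs, inputs):
    """Lowest output when exactly one candidate delay is realised per step."""
    result = []
    for k in range(len(inputs)):
        candidates = [coeff * inputs[k - delay] if k >= delay else 0.0
                      for delay, coeff in delay_coeffs.items()]
        result.append(min(candidates))
    return result


delays = {1: 0.6, 2: 0.5}                  # hypothetical delay set and coefficients
u = [1.0, 1.0, 1.0, 1.0]
naive = naive_output(uncertain_path_matrix(delays, 4), u)
worst = worst_case_output(delays, u)
```

The naive product adds the contributions of both delays and overestimates the output, whereas the per-step minimum gives a value that no realised delay in the set can undercut.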

### 4.5. Optimization formulation for disinfection scheduling under flow uncertainty

An optimization formulation is proposed to calculate the control input for a prediction horizon using the uncertain IR matrix . The objective is to find an input that minimizes the deviation of expected output disinfectant concentration from a reference concentration . For simplicity, we assume that the past inputs of the system are zero, thus the reference concentration ; note that this assumption is not limiting because any initial disinfectant concentration decays after a period, specific to each water network, thus only the controlled injection of disinfectants affects the water-quality state. Moreover, rapid changes in disinfectant injection should be minimized. The objectives should be achieved under the following constraints: (a) The output should not violate the defined lower bound for all possible scenarios given uncertainty bounds on water demands, and (b) The control input should reside within a predefined interval given by .

The first constraint is the key to addressing the uncertain time-delay issue. Essentially it represents the physics of the system using the I–O model, i.e., the impulse response matrix. However, using the uncertain IR matrix in the formulation, the physics of multiple models are considered simultaneously. The use of path-IR matrices in the formulation is essential, in order to distinguish between models that represent a different reality, and models that represent different paths in the same reality.
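The structure of this problem (tracking cost, smoothness cost, a lower-bound constraint over all scenarios, and input limits) can be illustrated with the Python sketch below. It is a toy, not the formulation of this work: a brute-force search over constant schedules stands in for a proper solver, each scenario is a single delay-and-gain matrix, the lower bound is only enforced from a chosen step onward, and all names and numbers are hypothetical.

```python
def delay_matrix(delay, gain, horizon):
    """IR matrix of a toy scenario: output = gain * input delayed by `delay`."""
    return [[gain if i - j == delay else 0.0 for j in range(horizon)]
            for i in range(horizon)]


def predict(matrix, inputs):
    return [sum(h * u for h, u in zip(row, inputs)) for row in matrix]


def cost(inputs, nominal, reference, smooth_weight):
    """Squared tracking error on the nominal model plus input-change penalty."""
    outputs = predict(nominal, inputs)
    tracking = sum((y - r) ** 2 for y, r in zip(outputs, reference))
    changes = sum((b - a) ** 2 for a, b in zip(inputs, inputs[1:]))
    return tracking + smooth_weight * changes


def robustly_feasible(inputs, scenarios, lower, u_bounds, active_from):
    """True if input limits hold and no scenario drops below the lower bound."""
    u_min, u_max = u_bounds
    if any(u < u_min or u > u_max for u in inputs):
        return False
    return all(y >= lower - 1e-9
               for matrix in scenarios
               for y in predict(matrix, inputs)[active_from:])


def best_constant_schedule(levels, horizon, nominal, scenarios, reference,
                           lower, u_bounds, active_from, smooth_weight=1.0):
    best = None
    for level in levels:
        inputs = [level] * horizon
        if not robustly_feasible(inputs, scenarios, lower, u_bounds, active_from):
            continue
        value = cost(inputs, nominal, reference, smooth_weight)
        if best is None or value < best[0]:
            best = (value, inputs)
    return best


nominal = delay_matrix(1, 0.5, 4)
scenarios = [nominal, delay_matrix(2, 0.4, 4)]   # second scenario: slower, more decay
best = best_constant_schedule([0.3, 0.4, 0.5, 0.6, 0.7], 4, nominal, scenarios,
                              reference=[0.2] * 4, lower=0.2,
                              u_bounds=(0.0, 1.0), active_from=2)
```

In this toy case the nominal model alone would favour an injection level of 0.4, but the slower scenario then falls below the lower bound, so the search settles on 0.5 instead. This mirrors the trade-off between tracking accuracy and robustness that the first constraint introduces.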

## 5. CASE STUDIES

### 5.1. Illustrative example

We compare the proposed approach with a scheduling algorithm that uses the nominal model of the network and does not consider flow uncertainty (Munavalli & Kumar 2003). The implementation of the ‘nominal’ scheduling algorithm in this work considers a single-input single-output system. Essentially, it uses the formulation of (13), with the first constraint substituted by an IR model, as in (7), that is derived using the nominal hydraulic model of the network.

*et al.* 2016), Monte-Carlo simulations are performed in which the node demands are varied between from their nominal values, generating different flow scenarios. The output is monitored and plotted when the control input calculated by the nominal scheduler is applied to all scenarios (Figure 4(a)), and when the control input from the proposed scheduler is applied (Figure 4(b)).
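Generating such demand scenarios can be sketched in Python as follows. The sketch assumes independent uniform sampling of each nodal demand within ±20% of its nominal value (the range stated in the abstract); the sampling distribution, function name, and demand values are assumptions, and each scenario would still need to be passed to a hydraulic simulator to obtain flows.

```python
import random


def sample_demand_scenarios(nominal_demands, rel_uncertainty, n_scenarios, seed=0):
    """Draw demand scenarios uniformly within +/- rel_uncertainty of nominal.

    nominal_demands: nominal demand per node
    rel_uncertainty: relative half-width, e.g. 0.2 for 20%
    """
    rng = random.Random(seed)               # seeded for reproducible scenarios
    scenarios = []
    for _ in range(n_scenarios):
        scenarios.append([
            demand * rng.uniform(1.0 - rel_uncertainty, 1.0 + rel_uncertainty)
            for demand in nominal_demands
        ])
    return scenarios


# Hypothetical nominal demands for four nodes; a zero demand stays zero.
nominal = [10.0, 0.0, 4.5, 7.2]
scenarios = sample_demand_scenarios(nominal, 0.2, n_scenarios=100)
```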

The results of Figure 4 illustrate the benefits of considering time-delay uncertainty when calculating chlorine injection inputs. When the nominal model is used (Figure 4(a)), we observe the following deficiencies in a number of uncertainty scenarios compared to when the proposed scheduling approach is used (Figure 4(b)):

: (a) there are lower bound violations during abrupt changes; (b) the proposed approach predicts the worst-case delay and avoids violations.

: (a) there are lower bound violations because of the effect of varying time-delay on chlorine decay; (b) small violations remain with the proposed approach because it does not consider modeling and discretization uncertainty .

: (a) There is an overshoot of chlorine concentration in some scenarios due to the effect of path 2 starting earlier in some scenarios; (b) The same effect is observed here since there is no violation of the lower bound.

: (a) There is a significant drop in chlorine concentration and lower bound violation due to flow reversal occurring at link 6 in some scenarios; (b) In contrast, the proposed approach anticipates the flow reversal and keeps the chlorine concentration above the threshold in all considered scenarios.

### 5.2. DMA example

The scalability of the proposed approach is demonstrated using the ‘CY-DMA’ network (illustrated in Figure 5), a DMA of a real network in Cyprus (Vrachimis *et al.* 2020), with pipes, junctions, and 1 reservoir which serves as the network water source. The disinfectant input of this network is administered at the inlet node , while the sensor is positioned at the far end of the network, at node . Node demands follow a diurnal pattern. The network has multiple flow paths from input to output, with some links changing direction during simulation. The network is simulated for the duration of h, with a time-step corresponding to min.

*et al.* 2021), which considers all flow scenarios generated by uncertain demands, not just the Monte-Carlo scenarios used for testing in this work. Interestingly, this conservatism is not evident in the case study of the example network in Section 5.1, where the impact of varying time-delay on chlorine decay appears to be more prevalent than in the DMA network case study.

## 6. CONCLUSIONS, LIMITATIONS, AND FUTURE WORK

In this work, we have demonstrated a new disinfection scheduling methodology, which considers the time-delay uncertainty when regulating the disinfectant concentration at specific locations of a water distribution network. The results indicate that this approach is able to avoid violations of predefined bounds on disinfectant concentration and regulates the output to follow closely the reference signal, even when the modeling uncertainty on water demands is up to . The use of a modified version of the uncertainty bounding backtracking algorithm proposed in Vrachimis *et al.* (2021) ensures the consideration of all different flow scenarios when calculating the scheduling input, as opposed to using a Monte-Carlo simulation approach that only samples a subset of potential scenarios.

While this work addresses the challenge associated with time-delay uncertainty when regulating disinfectant concentration, it is important to acknowledge certain limitations. Firstly, the methodology focuses on single-input single-output water-quality systems, thus limiting its applicability to networks with multiple disinfectant booster stations and multiple sensors. Additionally, the approach assumes networks without water tanks or those where disinfectant concentrations in tanks are known. Another limitation is the exclusion of uncertainty in decay rates, as our model primarily accounts for time-delay uncertainty arising from fluctuating water flows. Finally, as indicated in the case studies, the proposed scheduling algorithm may exhibit a degree of conservatism in disinfectant concentration regulation, depending on the given uncertainty of water flows.

Future work will develop this methodology to consider multiple inputs and outputs in the network and study its performance and computational efficiency when applied to large-scale water networks. Moreover, a natural extension of this work is the consideration of disinfectant measurements in a feedback loop, to improve the water-quality model and provide better estimations, thus also improving the disinfectant regulation. Finally, the effect of modeling and discretization uncertainty on the performance of the proposed methodology will be analyzed.

## ACKNOWLEDGEMENTS

This work was funded by the European Research Council (ERC) under the ERC Synergy Grant ‘Water-Futures’ (Grant agreement No. 951424) and by the European Union Horizon 2020 programme Grant Agreement No. 739551 (KIOS CoE), and the Government of the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy.

## DATA AVAILABILITY STATEMENT

The models, code and data generated during this study are available in https://doi.org/10.5281/zenodo.10226047.

## CONFLICT OF INTEREST

The authors declare there is no conflict.
