## Abstract

Monitoring of sewer networks (SNs) is an important task whose planning can be related to various purposes, for example contaminant detection and epidemiological studies. This paper proposes two different approaches for the identification of a monitoring system in SNs. The first one proposes the identification of the best monitoring points starting from the knowledge of the hydraulic behavior of the system with respect to specific sensor threshold values through an optimization procedure that maximizes the reliability in detecting a contaminant. A new mathematical model is developed and a global optimization solver is employed to perform the optimization procedure. The second approach is based on the complex network theory (CNT) tools, adopting the in-relevance-based harmonic centrality, and does not require any hydraulic simulation. The metric is evaluated for each node of the network and provides a range of nodes, classified with respect to their importance, useful to identify suitable locations for sensors. With reference to both a benchmark and a real SN, the comparison between the results achieved by both strategies indicates that the two approaches provide comparable solutions in terms of sensor location.

## HIGHLIGHTS

Comparison between two different approaches for monitoring planning in sewer networks.

Use of a global optimization solver to find the optimal location of sensors maximizing the network reliability.

Use of the in-relevance-based harmonic centrality as a tool to identify suitable locations for sensors, relying only on topological information.

Despite being based on different concepts, the two approaches provide comparable results.

## INTRODUCTION

Sewer networks (SNs) are complex systems aimed at the collection and transport of wastewater (sanitary systems) or wastewater and rainwater (combined systems) up to the treatment plant (Simone *et al.* 2022b).

In recent decades, the need to propose new monitoring strategies for such complex systems has become increasingly urgent, mainly for factors related to the identification of illicit intrusions, as well as to the control of specific contaminants and to the limitation of the potential environmental impact (Gromaire *et al.* 2001). More recently, SN monitoring has been proposed to support epidemiological studies (Larsen & Wigginton 2020).

The sensor location problem solved with optimization algorithms has been the focus of attention of many researchers, mainly relying on heuristic approaches. Banik *et al.* (2015a, 2015b) proposed a source identification methodology, based on a pre-screening aimed at reducing the network size, for identifying the set of candidate nodes for sensors. Banik *et al.* (2017a, 2017b) proposed optimization-based strategies aimed at the optimal location of sensors in sewer systems. Tinelli *et al.* (2017) investigated the optimal location of sensors with evolutionary algorithms, selecting the contamination events based on information about the network in terms of topology and operation. The problem of the optimal location of sensors was also addressed by Yazdi (2018), who proposed the use of a new procedure relying on evolutionary algorithms and entropy theory. Sambito *et al.* (2020) developed an approach for positioning water quality sensors based on the Bayesian decision network in order to facilitate the early isolation of illicit intrusions, focusing on soluble conservative pollutants, such as metals. The approach identifies the optimal sensor location gaining advantage from additional information, such as topology, thus reducing the computational effort needed to obtain the solution.

The monitoring of pollutants and pathogens in sewers has also been tackled with approaches based on backtracking algorithms, which represent a technique for evaluating the presence of contaminants and their diffusion process through a backward process, from downstream to upstream. This process is aimed at supporting the identification of candidate positions to host monitoring measures. Rodríguez-Alarcón & Lozano (2019) proposed an approach based on complex network theory (CNT) tools and a backtracking algorithm to evaluate potential spill or contaminant release with respect to all nodes of the system. Simone *et al.* (2022a) proposed a strategy based on network topology and a backtracking algorithm to model the diffusion of pollutants along the Urban Drainage Network (UDN), by using only information derived from Horton's hierarchy. Chachula *et al.* (2021) proposed a backtracking strategy that uses the time series of sensor measurements to localize contamination events, providing good information on such events with few sensors. Guadagno *et al.* (2023) proposed a backtracking methodology, based on the calculation of the impact coefficient related to the dilution and decay of contaminants in SNs, to locate sensors with only one steady hydraulic simulation.

Most of the presented approaches are very effective but also very cumbersome, both for the hydraulic simulations they require and for the expensive calculation times. Furthermore, it could also happen, particularly with very large and complex systems, that the lack of information (e.g., flow) makes such analyses difficult or even unreliable. In this context, the CNT is proposed as a useful approach for the analysis of complex real systems, which is gaining momentum. With reference to SNs, various works have been proposed to evaluate the vulnerability, resilience, and operability of such systems (Reyes-Silva *et al.* 2020; Hesarkazzazi *et al.* 2022; Simone 2023) but also aspects related to monitoring. Simone *et al.* (2022b) proposed the use of CNT-tailored centrality metrics in the analysis of sewers for vulnerability/resilience assessment, optimal monitoring design and spread of contaminants. They highlighted the importance of topology in the study of sewer systems, showing how their behavior is the result of the interaction between the role of nodes and network topology. Zuluaga *et al.* (2020) used the network theory together with differential equations to model and simulate water quality parameters in a hydrological network. García-Usuga *et al.* (2020) used PageRank's centrality to identify well-monitored nodes in UDNs more susceptible to contamination. Halverson & Fleming (2015) proposed the use of CNT tools for systems of streamflow gauges. Their goal was to evaluate whether this approach could be meaningful when applied to hydrometric data, and, more specifically, whether it may help to guide decisions in stream gauge placement. Results showed that the betweenness metric is effective in identifying key points for sensors' placements, especially in bridges between communities.

Considering the potential of the CNT tools for the analysis of real systems and the performance of optimization algorithms, the aim of this paper is to model the sensor placement problem, i.e., determining the candidate positions to host monitor sensors for detecting the presence of contaminants or pathogens in the system, using both an optimization procedure and a complex network theory approach (Simone *et al.* 2023).

The optimal solutions obtained with respect to a specific objective are compared with the CNT results to evaluate the effectiveness of the topological approach. In fact, a promising comparison would validate the topological approach for several preliminary and complementary applications related to the analysis and management of SNs, while reducing the computational effort.

The first strategy is based on an optimization procedure aimed at searching for the best location of a fixed number of water quality sensors, in order to maximize the reliability of the whole system to detect a target substance. This strategy relies on the development of a mathematical model, which is solved using a global optimization solver. The proposed optimization procedure allows one to obtain the global optimum of the problem, depending on the accuracy of the mathematical model.

The second one uses a CNT centrality metric tailored to account for both the connectivity structure and the intrinsic relevance of the nodes of the system (Giustolisi *et al.* 2020; Simone *et al.* 2020). The strategy considers the different roles of nodes (e.g., inlet nodes, connection nodes, and outfall nodes), embedding the information about their intrinsic relevance as inflows and the presence of spatial constraints (e.g., slope). The in-relevance-based harmonic centrality is computed for all nodes of the directed graph of the SN to support the monitoring system planning.

The main goal of the paper is to understand whether the simpler and less computationally expensive CNT-based approach furnishes comparable results in terms of sensor positioning with respect to the optimization solver.

It is important to highlight that, despite providing very promising results with low computational effort, the topological approach requires a deep knowledge of the tools used, in order to implement the procedure in the most effective way and to guarantee a reliable interpretation of the results, as well as full competence in hydraulic systems in order to validate the consistency of the achieved results.

The paper is organized as follows. The next section describes the two methodologies; the third section presents and compares the results of the analyses applied both to a benchmark SN and to a real one. Concluding remarks are drawn in the last section.

## METHODS

Determining the suitability of a node for the detection of a contaminant in SNs is essential for planning an efficient monitoring system. This paper presents two modeling approaches for optimal sensor placement aimed at detecting target substances in SNs.

The first methodology proposes an optimization problem based on one single objective function, that is, the maximization of the network reliability when a spill of contaminant occurs within the system.

The second methodology investigates the sensor monitoring problem from a more topological perspective, using a metric proposed by the CNT and adapted to infrastructural systems like SNs.

### The optimization methodology

The first strategy is deterministic optimization, relying on the analytical properties of the problem to generate a sequence of points converging to a global optimum or an approximately global optimum. Deterministic methods include linear programming (LP), non-linear programming (NLP), mixed-integer linear programming (MILP), and mixed-integer non-linear programming (MINLP) (Belotti *et al.* 2013; Morani *et al.* 2023). Most of the available MILP/MINLP solvers can achieve global optima only in convex problems (Belotti *et al.* 2009). The only solvers managing to find global optima in both convex and non-convex problems are the global optimization solvers.

In this study, the optimization procedure is performed by the SCIP (Solving Constraint Integer Programs) (Vigerske & Gleixner 2018) solver, which is a global optimization solver implementing a spatial branch and bound and various heuristics.

Given a network consisting of *L* links and *N* nodes, the aim of the optimization procedure is to find the best location of a fixed number of sensors to maximize the network reliability in case of pollutant introduction. In this study, the optimization procedure has been decoupled from the hydraulic and quality modelling of the network; thus, flow velocity and pollutant concentration are preliminarily computed by means of the external software SWMM (Rossman 2017).

The network can be modelled as a directed graph so that each link *l* (*l* = 1, …, *L*) has a proper direction and the discharge flows from the initial node (*i*) to the final node (*f*). In this study, each node of the network is a possible candidate for the installation of a sensor, and it has been modelled by means of a binary variable, which is equal to 1 if the device is installed, and 0 otherwise. According to the proposed procedure, the sensors should be located in order to maximize the reliability of the network, which means maximizing the number of detected points when a pollutant is inserted within the network. In this study, the pollutant entrance is assumed in only one node at a time and the network reliability is assessed over several scenarios, differing by the node assumed as an inlet point.

A binary variable describing pollutant detection is defined for each node *i* (*i* = 1, …, *N*) and for each scenario *s* (*s* = 1, …, *σ*). Such a binary variable has been properly modelled by means of a set of constraints (Equation (1)), written for each link *l* (*l* = 1, …, *L*) of the network. These constraints involve a parameter equal to 1 if the concentration at the *i*th node is greater than a threshold concentration (*C*_{0}), and 0 otherwise. The parameter results from the concentration values obtained by the quality simulation performed by the software SWMM.

The outfall node (*ω*) is not constrained by Equation (1), since it cannot be the initial node of any link. In order to properly define the pollutant detection at the outfall node, a dedicated constraint (Equation (2)) has been defined.

According to Equation (2), the pollutant at the outfall is detected only in case of device installation (i.e., installation variable equal to 1) and concentration value greater than the threshold value (i.e., threshold parameter equal to 1).
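To make the idea of reliability maximization concrete, the following is a minimal sketch on a hypothetical five-node network. It is not the mathematical model of this study and does not use SCIP: it simply enumerates all placements of a fixed number of sensors and keeps the one that detects the largest share of injection scenarios. The network layout, the travel times, and the names `DOWNSTREAM`, `TRAVEL_H`, `exceedance`, `reliability` and `best_placement` are illustrative assumptions, and dilution is ignored (in the study, concentrations come from SWMM).

```python
from itertools import combinations
from math import exp

# Hypothetical toy network (assumption): node -> downstream node, 5 is the outfall.
DOWNSTREAM = {1: 3, 2: 3, 3: 5, 4: 5, 5: None}
# Assumed travel time (h) along the link leaving each node.
TRAVEL_H = {1: 6.0, 2: 8.0, 3: 10.0, 4: 30.0}
K = 0.2      # first-order decay coefficient (1/h)
C_IN = 1.0   # inflow concentration (mg/l)
C0 = 0.008   # sensor threshold concentration (mg/l)


def exceedance(source):
    """Nodes where a pollutant injected at `source` exceeds C0 (dilution ignored)."""
    hit, node, elapsed = set(), source, 0.0
    while node is not None:
        if C_IN * exp(-K * elapsed) > C0:
            hit.add(node)
        elapsed += TRAVEL_H.get(node, 0.0)
        node = DOWNSTREAM[node]
    return hit


def reliability(sensors, scenarios):
    """Fraction of injection scenarios seen by at least one sensor."""
    detected = sum(1 for s in scenarios if exceedance(s) & set(sensors))
    return detected / len(scenarios)


def best_placement(n_sensors):
    """Exhaustive search over all placements of n_sensors sensors."""
    nodes = sorted(DOWNSTREAM)
    return max(combinations(nodes, n_sensors),
               key=lambda placement: reliability(placement, nodes))


if __name__ == "__main__":
    for n in (1, 2):
        placement = best_placement(n)
        print(n, placement, reliability(placement, sorted(DOWNSTREAM)))
```

In this toy case, a single sensor is best placed at the outfall, while a second sensor is needed at the head node whose pollutant decays below the threshold before reaching the outfall. Exhaustive enumeration is only viable for very small networks, which is why a dedicated solver is used in the study.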

### In-relevance-harmonic based methodology

Centrality is one of the most studied concepts in CNT. Several centrality metrics have been proposed (Freeman 1978; Borgatti 2005; Newman 2010) to evaluate the most central element in real systems with respect to both different physical phenomena and the way information flows in the network. The proposed approach recalls the concept of harmonic centrality and investigates the possibility of using this metric to efficiently plan a quality monitoring system in SNs. Its application is justified because the harmonic centrality assumes that information through the network moves only along the shortest possible paths, just as happens in SNs, for which the shortest paths between the various pairs of nodes are uniquely determined by the slope of the system.

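The harmonic centrality of node *i* is the sum of the reciprocals of its distances to all other nodes, $\sum_{j \neq i} 1/d_{ij}$, where *d*_{ij} is the distance from node *i* to node *j* in the network.

The intrinsic relevance of nodes is embedded in the metric through a function *f*(*R*_{i}, *R*_{j}), obtaining the relevance-harmonic centrality (Giustolisi *et al.* 2020).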

The intrinsic relevance of the ending nodes, *R*_{i} and *R*_{j}, of each link, is a piece of information depending on the type of network and analysis to perform.

The strategy here proposed is performed using the function *f*(*R*_{i}, *R*_{j}) = (*R*_{i} *R*_{j}), corresponding to the product of the intrinsic relevance of the ending nodes *i* and *j* of each pipe *l*. It is considered as the one that best highlights the role of intrinsic relevance in the connections between nodes for the specific considered problem. The intrinsic relevance of nodes is assumed equal to the inflow for each node, proportional to the quantity of contaminant introduced.

It is important to note that, since SNs are directed networks, it is necessary to use a specific version of the harmonic centrality, i.e., the in-harmonic (Simone *et al.* 2023), which, in the specific case, corresponds to the one that collects the information entering the nodes.

Therefore, the in-relevance harmonic centrality is evaluated for each node of the systems considered, and the analysis provides a node importance ranking, normalized in the range [0, 100]. Nodes with higher values of the metric represent the points where most of the information disseminated on the network is concentrated and are, most likely, the best candidates to host sensors/sampling points.
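As an illustration of how an in-harmonic ranking can be obtained from topology alone, the sketch below computes a plain (unweighted) in-harmonic centrality on a hypothetical five-node directed network, using hop counts as distances. The relevance weighting *f*(*R*_{i}, *R*_{j}) is deliberately left out, and scaling by the maximum score is only one possible way, assumed here, to map the values onto [0, 100]. The edge list and the function names are illustrative.

```python
from collections import deque


def in_harmonic_centrality(edges):
    """For each node i, sum 1/d(j, i) over every node j that can reach i.

    Distances are hop counts along the flow direction (an assumption;
    no relevance weighting is applied in this sketch).
    """
    nodes = {n for edge in edges for n in edge}
    upstream = {n: [] for n in nodes}
    for u, v in edges:
        upstream[v].append(u)  # reversed adjacency: nodes flowing into v

    scores = {}
    for target in nodes:
        # Breadth-first search against the flow direction from `target`.
        dist = {target: 0}
        queue = deque([target])
        while queue:
            current = queue.popleft()
            for prev in upstream[current]:
                if prev not in dist:
                    dist[prev] = dist[current] + 1
                    queue.append(prev)
        scores[target] = sum(1.0 / d for n, d in dist.items() if n != target)
    return scores


def normalize_0_100(scores):
    """Scale scores so that the largest one equals 100 (assumed normalization)."""
    top = max(scores.values())
    return {n: (100.0 * s / top if top > 0 else 0.0) for n, s in scores.items()}


# Hypothetical network: nodes 1, 2 and 4 are head nodes, node 5 is the outfall.
EDGES = [(1, 3), (2, 3), (3, 5), (4, 5)]
ranking = normalize_0_100(in_harmonic_centrality(EDGES))

if __name__ == "__main__":
    for node, score in sorted(ranking.items(), key=lambda item: -item[1]):
        print(node, round(score, 1))
```

In this small example the outfall receives the highest score and the head nodes score zero, since no other node flows into them.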

## RESULTS AND DISCUSSION

### Benchmark SN

The benchmark SN (… *et al.* 2023) is considered to apply the two approaches. The network model is composed of 77 nodes (manholes), 79 links (sewer pipes), and 1 outfall. The flow directions are imposed by the slope of the pipes.

#### Application of the optimization methodology

For the analysed case study, each node of the network has been assumed as the possible inlet of the pollutant. Therefore, the total number of analysed scenarios (*σ*) is equal to the total number of nodes (*N*). Regarding the inserted pollutant, the inflow concentration has been assumed equal to 1 mg/l. The pollutant has been considered to decay with first-order kinetics, with the decay coefficient set as 0.2 h^{−1}, consistent with the study of Hart & Halden (2020) targeting SARS-CoV-2 as a contaminant.
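As a rough order-of-magnitude check on these settings, first-order decay alone can be worked out by hand. The expression below is a simplified illustration, not part of the model: it neglects dilution and mixing, which the SWMM quality simulation accounts for, and the symbols $C_{\mathrm{in}}$ (inflow concentration) and $t^{*}$ (time to reach the threshold) are notation assumed here. It uses the threshold concentration of 0.008 mg/l, one of the values considered in this study.

```latex
C(t) = C_{\mathrm{in}}\, e^{-k t}, \qquad
t^{*} = \frac{1}{k}\,\ln\frac{C_{\mathrm{in}}}{C_{0}}
      = \frac{\ln(1/0.008)}{0.2\ \mathrm{h^{-1}}}
      \approx 24.1\ \mathrm{h}
```

Under these assumptions, a pollutant injected at 1 mg/l would remain above the 0.008 mg/l threshold for about one day of travel time if decay were the only process at work.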

Once a threshold concentration value (*C*_{0}) and a number of installed sensors are fixed, the optimization process is accomplished by the solver SCIP in less than 1 s. The achievement of the global optimum is extremely fast since SCIP is a very high-performance solver for MILP problems.

#### In-relevance-harmonic-based methodology

Therefore, the analysis suggests placing two sensors in nodes 78 and 59. As the number of sensors to be positioned increases, the ranking of metric values allows for immediately identifying the most suitable position to host them. According to such an interpretation of the metric, nodes 50 and 27 follow as candidate nodes for hosting sensors.

The analysis provides null values of the metric for all extremal/head nodes because no information arrives at these points. This result implies that these points are not suitable candidates for hosting measurements.

The results of the analysis show that the in-relevance-harmonic centrality is effective in evaluating the ability of the nodes to receive contaminants and therefore it could be effective in the study related to the spread of contaminants and the planning of monitoring systems.

#### Optimization procedure vs. in-relevance-harmonic centrality

Comparing the results of the proposed methodologies, the nodes identified through the in-relevance-harmonic centrality (Figure 4) are very close to the ones identified by the optimization method (Figure 3), considering a threshold value of *C*_{0} = 0.008 mg/l. Indeed, both approaches indicate nodes 78, 59 and 50 as the best candidates for either the positioning of sensors or the collection of samples to detect target pollutants, showing that, despite being based on different concepts and approaches, the two strategies provide quite comparable results.

Although this result confirms the validity of the topological approach in addressing the study of SNs, it is important to underline that such analysis requires technicians who are able to implement and adequately interpret the CNT tools with respect to the SNs operation, as well as to evaluate the feasibility of the results with respect to their hydraulic behavior.

### Real SN

The real SN is derived from … *et al.* (2022) and redrafted to consider dry weather conditions, setting the lateral inflows and excluding the sub-catchments. The network model is composed of 531 nodes, 530 pipes, and 1 outfall. The flow directions are imposed by the slope of the pipes (Figure 5).

#### Application of the optimization methodology

As for the benchmark network, each node of this real network has been assumed as a possible inlet of the pollutant, with an input concentration equal to 1 mg/l. The pollutant has been considered to decay with first-order kinetics, with a decay coefficient set as 0.2 h^{−1}. Despite its larger size compared to the benchmark network, once the threshold concentration value (*C*_{0}) and the number of installed sensors are fixed, the optimization process is performed in a fraction of a second.

#### In-relevance-harmonic-based methodology

The maximum value of the metric is at the outfall, i.e., node 531, which makes it, as expected, the most suitable node to host a sensor. The ranking of Figure 8 is indicative of the ability of each node to collect information derived from other nodes in the system. In particular, it is possible to note that the metric identifies several main paths in the network, characterized by the presence of nodes with high intrinsic relevance (yellow nodes), corresponding to endpoints of separate branches, and in any case, connected to other well-connected nodes. Obviously, each node can receive information only from the nodes adjacent to it, and the ability to collect information from some nodes is greater than that of the others due to their topological relevance. Most of the nodes with high values of the metric are located near the only outfall, along the main path directed towards this point. A second important path intersects the main one close to the outfall (in the red node), while another path of medium importance is located in the northwest area of the network. The remaining part of the nodes takes on very low or even zero metric values.

The selection of the nodes to be indicated as candidate positions for the installation of sensors should consider metric values in the range [20, 100]. This criterion leads to the identification of seven nodes as candidate positions for sensors, obviously belonging to the main routes mentioned before.

Overall, such a metric trend allows for tracing the detection of contaminants in the network and highlighting the points that are most characterized by this process.

#### Optimization procedure vs. in-relevance-harmonic centrality

The comparison of the results obtained by the in-relevance-harmonic centrality and the optimization procedure performed assuming a threshold concentration equal to 0.008 mg/l highlighted that the two procedures provide comparable results since the points selected for either sensor installation or sampling are very close. In fact, the nodes contained in the range chosen for the topological metric enclose the ones provided by the optimization or indicate adjacent positions. In both cases, the candidate nodes to host the sensors are located along the same path.

However, it is worth noting that the solution achieved by the optimization procedure varies depending on the threshold concentration. Indeed, assuming worse quality sensors (i.e., increasing the threshold concentration), the number of installed sensors ensuring the full coverage of the network increases accordingly. The solution provided by the topological metric, instead, remains unchanged, being only based on the network topology.

Results highlight that the nodes selected by the optimization solver still lie within the range of nodes with high metric values. In particular, considering the optimization performed using a threshold value equal to 0.003 mg/l (Figure 9(a)), the analysis suggests installing only four sensors, specifically in the neighborhood where the topological approach provides higher values of the metric (Figure 8), which corresponds to the first sensors selected in the case *C*_{0} = 0.008 mg/l (Figure 7). Likewise, considering the optimization performed using a threshold value equal to 0.01 mg/l, the suggested locations of the additional sensors are reported in Figure 9(b). The position of the new devices defines a further path of monitoring in the network, just as occurs for the topological metric (Figure 8). They still include the eight locations selected in the test with *C*_{0} = 0.008 mg/l (Figure 7), demonstrating good robustness of the optimal solution with respect to the threshold value.

Overall, it is evident that the analysis with the topological metric cannot be as exhaustive and complete as the results provided by the optimization for the various threshold values, and this is not the objective. However, the single solution provided by the topological metric encloses important information that allows us to proceed progressively to an efficient positioning of sensors in the network without applying a more complex optimization method. This result is surprising considering that the metric used is applicable from the first stages of the study, requiring only data relating to the system topology without performing any simulation.

As already explained, the topological approach provides very promising results using only information on the system topology. However, it is worth underlining the knowledge required of technicians implementing the topological approach, who need full competence in the hydraulic operation of SNs in order to best implement the procedure and evaluate the reliability of the results coherently with the behavior of the system. Conversely, the deterministic approach allows for obtaining the global optimum of the problem once a mathematical model is developed and the hydraulic behavior of the system is known. The quality of the optimum found cannot be assessed when the optimization is performed by means of heuristic procedures, in which the relation between input parameters and output variables is not conclusively determined. However, with reference to deterministic optimization, the achieved optimum is global with respect to the search space associated with the developed mathematical model, i.e., the effectiveness of the optimum depends on the accuracy of the mathematical model and its efficiency in interpreting the behavior of the considered system.

Finally, several advantages may result from coupling the proposed approaches in one single procedure, where the topological approach first provides the user with a preliminary selection of the most suitable nodes for sensor location, among which the deterministic optimization searches for the best solution based on both hydraulic and mathematical constraints. Indeed, in large-size problems (i.e., large SNs), the implementation of a deterministic optimization may require high computational effort, thus the integration of both approaches in a two-step procedure may allow for a faster convergence of the solver without affecting the quality of the found solution.
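The two-step idea can be sketched as follows, under stated assumptions. The inputs are hypothetical: `DETECTABLE_AT` lists, for each injection scenario of a five-node toy network, the nodes where the threshold would be exceeded, and `CENTRALITY` holds topological scores on a 0 to 100 scale. The cut-off `min_score` and the exhaustive search standing in for the deterministic solver are illustrative choices, not the procedure of this study.

```python
from itertools import combinations

# Hypothetical inputs for a five-node toy network (not taken from the case studies).
# Scenario (injection node) -> nodes where the sensor threshold is exceeded.
DETECTABLE_AT = {1: {1, 3, 5}, 2: {2, 3, 5}, 3: {3, 5}, 4: {4}, 5: {5}}
# Topological score of each node on a 0-100 scale.
CENTRALITY = {1: 0.0, 2: 0.0, 3: 66.7, 4: 0.0, 5: 100.0}


def reliability(sensors):
    """Fraction of scenarios detected by at least one sensor."""
    hits = sum(1 for nodes in DETECTABLE_AT.values() if nodes & set(sensors))
    return hits / len(DETECTABLE_AT)


def two_step(n_sensors, min_score):
    """Step 1: pre-select nodes by centrality. Step 2: search only among them."""
    candidates = sorted(n for n, score in CENTRALITY.items() if score >= min_score)
    if len(candidates) < n_sensors:
        raise ValueError("cut-off leaves fewer candidates than sensors")
    best = max(combinations(candidates, n_sensors), key=reliability)
    return best, reliability(best), len(candidates)


if __name__ == "__main__":
    print(two_step(n_sensors=1, min_score=20.0))
```

With a cut-off of 20, the search space for one sensor shrinks from five nodes to two and the outfall is still selected. The cut-off remains a modelling choice: in this toy case it excludes head node 4, which a second sensor would need in order to reach full coverage, so it should be checked against the coverage required.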

## CONCLUSIONS

The present paper addresses the problem of the optimal location of sensors within SNs based on two different approaches. The first approach relies on deterministic optimization aiming at finding the best location and number of sensors to maximize the reliability of the network for the detection of a target substance. The second approach uses a relevance-based CNT centrality metric, which does not require any simulation, but only the knowledge of the topological scheme and the flow inputs. In particular, the in-relevance-harmonic centrality, which evaluates the ability of each node to disseminate and receive information, is used to sort the nodes of the network based on their importance. The two approaches were first tested on a benchmark network, and then a real SN was assumed as a case study. According to the results, the two approaches provide comparable solutions in terms of sensor location within the networks. Indeed, the nodes selected by the optimization procedure lie within the range of the nodes with the highest values of the topological metric.

Considering the promising results achieved by the topological approach, the proposed metric can be used as a very efficient complementary tool for the design of SN monitoring systems in complex schemes.

Indeed, when large-size networks are considered, the optimization procedure can be significantly demanding and a preliminary selection of the most suitable nodes for sensor location can be crucial to tackling the computational complexities affecting the problem. Therefore, an integrated procedure could be investigated in future studies, based on the use of the CNT centrality to reduce the search space and speed up the optimization procedure.

## ACKNOWLEDGEMENTS

The research has been performed under the Project MIMOSAE – FRA2020 financed by University of Naples Federico II.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Application of Social Network Analysis (ASNA)*

*Storm Water Management Model Reference Manual Volume II — Hydraulics*. EPA/600/R-17/111. USEPA, Cincinnati, OH.

*River*, *December* 2022, 39–51. https://doi.org/10.1002/rvr2.30.

*The Spread of Contaminants in Urban Drainage Networks Based on a Topological Analysis*. 20. https://doi.org/10.3390/environsciproc2022021020